Netflix: Predictive Analytics & Recommendations

The Netflix contest begins today: a contest to improve the Netflix recommendation engine, with a $1MM grand prize and $50K progress awards. To win the grand prize, a contestant must show a 10% improvement over the current Cinematch algorithm. Here’s their explanation:

The Netflix Prize seeks to substantially improve the accuracy of predictions about how much someone is going to love a movie based on their movie preferences. Improve it enough and you win one (or more) Prizes. Winning the Netflix Prize improves our ability to connect people to the movies they love.

Here’s the best part: they’ve provided a training data set containing 500,000 past and current recommendations. This should be fun. I’ve downloaded the 650 MB file and will begin playing with it tonight. I’ll get to use some Python and revisit the good ol’ graduate school days of machine learning and neural nets.


More explanation of the Rules here:

Netflix is all about connecting people to the movies they love. To help customers find those movies, we’ve developed our world-class movie recommendation system: Cinematch(SM). Its job is to predict whether someone will enjoy a movie based on how much they liked or disliked other movies. We use those predictions to make personal movie recommendations based on each customer’s unique tastes. And while Cinematch is doing pretty well, it can always be made better.

Now there are a lot of interesting alternative approaches to how Cinematch works that we haven’t tried. Some are described in the literature, some aren’t. We’re curious whether any of these can beat Cinematch by making better predictions. Because, frankly, if there is a much better approach it could make a big difference to our customers and our business.

So, we thought we’d make a contest out of finding the answer. It’s “easy” really. We provide you with a lot of anonymous rating data, and a prediction accuracy bar that is 10% better than what Cinematch can do on the same training data set. (Accuracy is a measurement of how closely predicted ratings of movies match subsequent actual ratings.) If you develop a system that we judge most beats that bar on the qualifying test set we provide, you get serious money and the bragging rights. But (and you knew there would be a catch, right?) only if you share your method with us and describe to the world how you did it and why it works.

Serious money demands a serious bar. We suspect the 10% improvement is pretty tough, but we also think there is a good chance it can be achieved. It may take months; it might take years. So to keep things interesting, in addition to the Grand Prize, we’re also offering a $50,000 Progress Prize each year the contest runs. It goes to the team whose system we judge shows the most improvement over the previous year’s best accuracy bar on the same qualifying test set. No improvement, no prize. And like the Grand Prize, to win you’ll need to share your method with us and describe it for the world.
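
To make the 10% bar concrete, here is a minimal Python sketch of how an entry's accuracy could be compared against a baseline. As I understand the contest rules, accuracy is scored by root mean squared error (RMSE) between predicted and actual ratings, where lower is better. The function names and the sample ratings below are made up for illustration; they are not Netflix's code or data.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length lists of ratings."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("need at least one rating")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def improvement_pct(baseline_rmse, candidate_rmse):
    """Percent reduction in RMSE relative to the baseline (positive is better)."""
    return 100.0 * (baseline_rmse - candidate_rmse) / baseline_rmse


if __name__ == "__main__":
    # Hypothetical ratings on a 1-5 star scale, invented for this example.
    actual = [4, 3, 5, 2, 1]
    baseline_predictions = [3.8, 3.4, 4.1, 2.9, 2.0]   # stand-in for Cinematch
    candidate_predictions = [3.9, 3.2, 4.5, 2.5, 1.4]  # stand-in for an entry

    base = rmse(baseline_predictions, actual)
    cand = rmse(candidate_predictions, actual)
    print(f"baseline RMSE:  {base:.4f}")
    print(f"candidate RMSE: {cand:.4f}")
    print(f"improvement:    {improvement_pct(base, cand):.1f}%")
```

Under this reading, an entry clears the grand-prize bar when `improvement_pct` against Cinematch's score on the qualifying set reaches 10 or more.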




Comments

This looks like a lot of fun. The chances of winning are probably pretty small unless you’re deep in the Machine Learning/AI field (look at who came up with their system in the FAQ, then consider who your competition will be), but it can’t hurt to try and would be a lot of fun.

I’m very impressed by the way the contest is laid out. Good evaluation techniques, sizeable data set, results back to the world, etc. Kudos to Netflix.

Good luck! I love Netflix and would love to see them get even better. I always wondered how their recommendations worked.

Good luck to all.

That sounds like a lot of fun.

One of the first things Netflix needs to do is allow users to rate things with half or even quarter stars. If people didn’t have to round (I do all the time), the input would be a lot better, and so would the output.

[...] Well now it looks like Netflix has placed a bounty on an algorithm to make the Recommendation Engine even better. My friend Peter Abilla indicates he might give it a try. He’s also posted about the rules and the $1 million prize. I wish I had more time because this is one problem I would love to help solve. (I’m pretty nerdy like that.) Plus it would be cool to have a million bucks. :) One thing I would recommend (kind of unrelated to the algorithm) is to allow users to rate movies with half, or even quarter stars. More precise input means better output, and I know I’ve had to round many times when I thought a movie deserved 3 and a half stars, etc. [...]
