Completely agree. I am constantly baffled at the bullshit Netflix suggests to me.
I just went to the frontpage of my Netflix and...:
- "My list" recommends 3 series where I have already watched the last episode
- "Only on Netflix" recommends another two series I have already watched.
- A section displayed is "Watch together for older kids" - I don't have kids, never watched any kids stuff on my account.
- "Documentaries" contains six suggestions, 3 of which I have already watched on Netflix.
- I am suggested several shows, which are good - but I have already watched outside Netflix earlier - and there is no way to tell Netflix (none that I know of, anyway)
9 out of 10 times when I go to Netflix, I intend to continue watching a series - but Netflix makes me scroll past SEVEN sections of recommendations to get to "Continue watching.." before showing me the series I have watched 1-2 episodes of most days for the past two weeks.
Maybe they are just too busy making sure all new series are woke-i-fied to care about how this simple stuff works?
"My list" is not algorithmically generated; it's a list of shows you have chosen to add to it. It would be nice if you had the option to remove a show from it when you finished it, but very annoying if it did it automatically behind your back. Although clearly the real problem here is that "my list" isn't a clear enough description to show you manually control it (but I don't have a better suggestion).
> Although clearly the real problem here is that "my list" isn't a clear enough description to show you manually control it (but I don't have a better suggestion).
That's the hard part about designing software for anyone - there is no average user & you can't really make assumptions about their understanding of your system. After all, I think the intention of "My list" is clear and fairly obvious, so that makes 3 people with 3 different ideas about the feature.
> - "Only on Netflix" recommends another two series I have already watched.
I think the only on Netflix part works as part of positive reinforcement - it always shows me stuff it knows I watched all the way through (hence I must have liked it) mixed with things I haven't watched or haven't watched in a long time.
Thus when I see that section it is reminding me they have stuff I liked that I can only get there so please don't ever change account, and here is some more of that stuff only we got - try it out!
> Maybe they are just too busy making sure all new series are woke-i-fied to care about how this simple stuff works?
Or maybe it's not as simple as you think? "Baffled" is a strong word. It's the same "baffled" that Amazon can't find all the fake reviews. The same "baffled" that Facebook can't delete all your photos, everywhere, when you close an account.
Maybe things at scale are more challenging? These are some of the most valuable companies in the world. I'm sure they're happy to throw buckets of money at you if you can solve these "simple" problems for them.
The Netflix catalogue, and the movie rental/streaming ecosystem in general, was completely different when the AI challenge was going on. It was a DVD postal service. Your ratings mattered, and almost any movie you wanted was available. No one was bored scrolling Netflix trying to decide how to kill two hours before bed.
They are clearly not using the ratings based automatic recommendations any more. It's not even relevant given the limited and generally low-quality content available. It's just about keeping enough people paying.
The current "Netflix Optimization Problem" is how to spend the minimum amount on content but still keep you subscribing.
The Netflix prize also put Netflix on the map in terms of being a company that solves hard problems. We are still talking about it today and you'd better believe it has inspired talented people to work at Netflix; they could easily blow $1M on recruiting people and have much less to show for it. It's why Netflix is part of "FAANG".
Reed Hastings's genius is that he led Netflix through a number of transitions between fundamentally different businesses: he built a strong brand with the DVD-based business without permission from the studios, then transferred that brand to streaming when the studios saw it as "free money". By the time the studios understood what it was worth, Netflix had decided it was cheaper to buy than rent (just as a consequence of having more customers).
The new frontier is that they use your engagement data not just to "suggest" the next movie but to design movies that will keep you engaged.
It's a little bit scary with these services that are "all you can eat" games for $10 a month because you're giving up "voting with your dollar" but creating a trail of engagement that will be fed back into satisfying your narcissism. Taking screenshots and videos of games seems fun and harmless at first but somebody knows I had a big crush on Nikola and Chiara from Valkyria Chronicles 4.
> The new frontier is that they use your engagement data not just to "suggest" the next movie but to design movies that will keep you engaged.
Personally, the opposite has been happening in my recommendations. I've been switching over to Netflix less and less as its library thins, and the replacements seem not as compelling... can't even remember the last time I watched a full movie or TV show season on Netflix.
Guess I'm not the target demographic :( but it's not like I personally pay 'the Netflix bill' anyway.
Toddlers seem to enjoy the procedurally generated content though; maybe they're mistaking toddlers randomly bashing their screen for audience engagement?
"Appealing to narcissism" is dead easy on one level (avoids all sorts of problems that you could encounter with people otherwise), and very hard on another level.
If you are always "present" and engaged then the target is going to do most of the work themselves. If "the lights are on and nobody is home" when you try to take a step forward, you really take 10 steps back.
>They are clearly not using the ratings based automatic recommendations any more. It's not even relevant given the limited and generally low-quality content available. It's just about keeping enough people paying.
For those of us old enough to remember, it's not much different than going to Blockbuster. All of the new releases were along the walls with lots of copies to support the high demand. That's where everyone started when entering the store. If you found what you wanted, you grabbed a copy and left. In the middle of the store, the shelves were full of stuff you'd never heard of with one, maybe two, copies available. Both of those copies were covered in dust. You'd see people doing the physical version of endlessly scrolling to ultimately settle on "something" just to not be scrolling any more.
Really, the only difference now is at least you don't have to drive somewhere to do the scrolling. I'd also say that there's at least the advantage of being able to do it in your PJs, but Blockbuster (any video rental place really) was the first public place where I noticed it became acceptable to not have to get dressed to visit.
The difference now is that, for 2 or 3 dollars, you can individually rent and stream most movies instantly. No need to pay a subscription that requires you to scroll through a small low-quality subset on an irritating interface.
Speaking specifically of the US market (I don't know where you are), the list of available titles for transactional rental at any time is a tiny subset of all movies that exist on digital due to windowing (licensing) restrictions. By far, most movies are not available for rental.
It's true. But most movies that people actually want to see, are.
I worked in a video store, back when there were such things, and can attest that the vast majority of people wanted the new thing and ignored the back catalog. It was my job to get them interested in the back catalog. I didn't do very well.
I joined Netflix because they had that back catalog available. But now that I'm old and grumpy, I've seen most of what I want to see in that back catalog, too. There's a ton of stuff in that category of "I'm sure it's great but I just don't want to work that hard". Also... most of that back catalog is crap, just under Sturgeon's Law.
Sadly, Netflix has figured that out, and gotten rid of most of its back catalog of DVDs. I hope the real film buffs have some other place to go get it.
That's fair. Maybe an order of magnitude difference between what's available on say Netflix streaming and what's available for individual rental, and maybe another order of magnitude for all movies? There sure are a lot of movies. I'm not sure where the Netflix DVD rental falls, especially if you account for movies that are technically available but with so few copies that it may take months to actually come to you.
> The Netflix catalogue, and the movie rental/streaming ecosystem in general, was completely different when the AI challenge was going on.
With Netflix producing its own content now, and with the cost of acquiring content rights much higher than it used to be (all major streaming platforms want to offer great content), I'm wondering how much the business imperative impacts the recommendations we get -> eg. Netflix giving priority to its own content over licensed shows/movies.
I suspect their algorithm is stuck in a local minimum, where it has proven to itself that, if a movie is presented to you hundreds of times, there will be a moment where you click, either inadvertently or because there is nothing else presented to you, and it counts as a validation of their engine. It is so optimized for this that it doesn’t try just presenting all movies anymore - which is a recurrent problem in A/B testing in general.
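One textbook way out of that kind of local minimum is to reserve a small share of impressions for titles the engine would not otherwise show (epsilon-greedy exploration). Below is a minimal sketch of that idea only. The function name `pick_title`, the `click_rates` dictionary and the default `epsilon` are all made up for illustration, and nothing here is claimed to reflect how Netflix actually works.

```python
import random


def pick_title(click_rates, epsilon=0.1, rng=random):
    """Return one title to present to the user.

    click_rates: dict mapping title -> estimated click rate (hypothetical data).
    epsilon: share of impressions spent on a uniformly random title.
    rng: anything with random() and choice(), e.g. the random module
         or a seeded random.Random instance.
    """
    if not click_rates:
        raise ValueError("catalogue is empty")
    if rng.random() < epsilon:
        # Explore: any title may be shown, including ones with a low estimate.
        return rng.choice(sorted(click_rates))
    # Exploit: show the title with the highest estimated click rate.
    return max(click_rates, key=click_rates.get)
```

With `epsilon=0` this always shows the current best guess, which is the "stuck" behaviour described above; any positive `epsilon` keeps giving the rest of the catalogue an occasional chance to collect clicks.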
Yes, Netflix’ engine is the reason I left Netflix…
> Yes, Netflix’ engine is the reason I left Netflix
That seems rather odd to me. I can believe it was a final straw after other reasons like running out of content you particularly want (absolutely or in comparison with other services), but not it being "the" reason.
I don't particularly pay attention to the recommendations on either Netflix or Amazon, instead picking up things I might like to try from external sources (friends & family, discussions or records in various media, or, having liked something or some part of it, looking into what else the performers/writers/directors/others have done or are involved in now; sometimes the service does its own external advertising).
I feel that the recommendation systems are more optimised for people who use TV/movies as background noise rather than actively watching. That would explain re-recommending long running series that they have already watched, amongst other things people have mentioned in this discussion.
Maybe my behaviour is a vestige from the life of piracy back when content was less readily available otherwise (somehow region locked, or simply not available on local channels yet, etc, so I often couldn't get things I cared about more legitimately for many months, if ever, and back in the scheduled TV days things were often on at inconvenient times). I seek out what I want rather than waiting for it to be handed to me by the service(s).
Their catalogue of award-winning or good movies is very small.
I think the problem is the studios figured why rent them to Netflix when we can put up our own OnDemand service?
So, Netflix was at a conundrum, "How do we get material so people won't leave, and we can raise our prices?"
Overpaid Netflix MBA, "I got it! Let's throw money at directors, and writers. The directors will make our movies because we pay well. The writers will churn out cliché-filled scripts, and put every plot twist into everything they write. The average viewer isn't here to watch quality, we will give them a huge batch of lousy material. It will be like feeding the hogs with slop."
Amazon Prime video seems to have a better library for those that appreciate good movies.
I did like The Twilight Zone, and Star Trek, when I had Netflix though.
(Years ago Netflix offered every episode of the Zone, and Trek. I got every single episode through the mail, and copied to dvd using---dvd something? They come in handy if xfinity goes out.)
Oh yea, Xfinity was charging a family member $260 a month. I painfully got it down to $130 a month. She was losing $1390 a year for probably a decade--with pretty much the same plan.
Xfinity should be broken up, or better regulated by authorities. I literally gave up trying to rectify the situation talking to three people who could barely speak English. The last Ecuadorean guy's English was so bad, I gave up, and just picked the cheapest plan on Comcast, and prayed the bill would go down.
An Xfinity employee told me the current business plan is just to "milk" long term customers with confusing bills, and deals. They don't care about cord cutters. They know they will always have a large percent of people who will just pay because there isn't a real option in their county, and many older people are not computer savvy.
Hell I'm computer savvy, but their application interface is purposely confusing. I could swear they are randomly switching prices over the phone, and through their application. I hope someone outs them if my hunch is right.
I believe it's much more in their interest to buy old TV shows. A good movie will keep you occupied for what, two hours? Seinfeld: almost 19 days of watch time. Friends: over 5 days. Community was barely ever popular before Netflix bought it, now there's plenty of people that enjoyed it for 2 days and 7 hours of watch time (its subreddit went from 266k in April 2020 to 482k right now).
I believe it's also in their interest to spread out stories that are realistically one-movie-long into 5-6 slightly drawn out 40-50 min episodes.
I just wish they would fucking stick to them instead of cancelling them after like two seasons. Orange Is The New Black is the only original of theirs I know of that goes above two days of airtime.
I watch good movies many times. I keep them on a loop while studying, or working. In college, I always had an Oliver Stone film on. At the time the duality between good, and evil, was always on my mind.
I must have watched Wall Street, and Platoon, a few hundred times.
I won't even estimate how many times I have watched Hitchcock films.
And the number of times I have watched Giant, or Citizen Kane, is embarrassing.
I have old movies playing all the time. I don't actually watch them, but I find them comforting in a weird way, especially black-and-white films. I think the old, good movies take a part of my brain away from reality? I listen to them while working.
Yes--how can I find Platoon comforting? At that point in my life, Charlie Sheen's character, and his father's, reminded me there are moral people still left. Maybe only in fantasy though?
I get what you are saying though. I have The Andy Griffith show on all the time.
(Fun, murky fact, which I think is true, about the Andy Griffith show: they didn't bother to copyright the episodes. For years people could sell copies of the show without copyright concerns. I think it's copyrighted now though.)
For me, the reason was partially the engine, partially their active work at confusing me.
I would keep seeing stuff I don’t want to watch, and they would keep switching out thumbnails to trick me into thinking it’s something I haven’t encountered. Both of those combined made browsing a chore, and I simply had no interest in using something that’s actively working against me (which is also the reason I went from being an active FB user to only using it for groups and messaging, so it’s not as if "actively working against me" is a Netflix exclusive)
A more minor reason is the lack of information displayed, but I could have handled that.
The most annoying thing about these streaming platforms is no IMDB rating being displayed. I have to type in every suggestion one by one which is annoying, or use a third party search engine. It'd be such a huge UX boost to simply include the rating, and better yet allow filtering by rating, I really don't get it. Perhaps they're trying to build network effects around their own rating system but it really detracts from the UX.
Maybe it's because filtering by rating would expose how shallow the catalogue is?
At least in my locale a good way to be reminded of this is searching for any movie you'd like to watch but isn't on the front page of netflix, typically they won't have it.
They made a change at some point (2017?) so as not to show movie ratings. I figured the reason was to give low rated movies a chance because I won't necessarily dislike a movie with IMDB rating 5.2/10. I've seen excellent movies with rating < 6.0 and lousy movies with rating > 7.5. It's just that my taste sometimes doesn't match that of IMDB users.
Presumably poorly rated movies are a lot cheaper to license; and when you're licensing 2/10 rated movies, user satisfaction is a lot higher if you don't show the ratings.
And at least when I'm on it through my browser, I need to actually click on the title to see the rating, which makes it difficult to use it to filter out junk quickly.
In the end they will promote what they WANT to be popular, so our suggested titles will still be a mess of shit we already watched or disliked with a button.
Shows with 10 seasons you hated and disliked after 5 minutes will still be in our "continue watching" or suggested.
> In the end they will promote what they WANT to be popular
This is a hugely important point. I'm completely uninterested in most netflix originals, but I understand why they're going to continue to recommend them to me.
Which only goes to show that recommendation systems need to be separated from the vendors whose wares are being recommended. There's a huge conflict of interest here - so of course recommendations will be biased towards what the platform makes most money on. This applies not just to Netflix, but also YouTube.
As it is, these systems are just in-house banner ads in disguise, and as trustworthy as any other ad.
For a recommendation system to work in the interest of its users, its profits would have to be completely uncorrelated to the recommendations it gives. In a more sane world, platforms would have to accept such third-party recommendation systems as first-class citizens, to be used in lieu of whatever the platform offers.
What Netflix wants to be popular also tramples over what I am already watching. I find that the UI gets in the way of enjoying several programs that I am part way through by pushing something new. If they want to continue to make series and only get me to watch half of the episodes then the algorithm is spot on.
That said, Amazon Prime video is so much worse. It suggests that I would like to watch series 1 of a show when I have watched it and am part way through season 2.
Which makes you wonder why they even bother with suggestions in the first place, instead of doing the obvious thing: giving you a searchable list of all movies they have for your region.
Given that the former is a major engineering project, while the latter is a junior-level interview question, one has to assume they're trying to confuse their users on purpose.
iirc amazon video owns their own recommendations distinct from the rest of the company, where previously recommendations were generated from the standard retail systems. I'm convinced that they got worse when this change was made but this is purely anecdotal.
What Netflix is doing is hiding the fact that they have very little new content.
If they implemented all your suggestions, I'd have about 3 titles to watch, which I would like, but they're worried that I'd cancel the subscription.
Maybe I'm an outlier, but I don't care that much about new content, I care about good content. For quite a while I thought Netflix had a very small catalog, until I started deliberately going to specific categories that the recommendation engine never shows me, then I realized that there is actually a fairly extensive back catalog of shows that are interesting, especially various shows in non-English languages.
The kind of content that works well for Netflix does not work so well for me. They introduce just enough content that works for me that I don’t cancel the subscription.
I still find it a marvel that despite the majority of content introduced being not for me, I have been able to watch something unseen approximately every night and have given up on so little. In that sense, it is better than TV – even if the impression I get from the new content I scroll past is that it appears to be following all the same trends that made TV less appealing to me.
The most irritating consequence of this content problem for me is that nothing remains in the same place. Is continue watching going to be one, two or even three down button presses tonight? The reward is occasionally something gets suggested that is worth watching that would have been found anyway within a few minutes of searching.
I would imagine what they’re doing makes sense for the majority, but it would be wonderfully nice if there were some kind of alternate "advanced" mode for people who understand the lack of content where all this suggested and popular stuff went away and the search filter improved.
I get the value of rewatching stuff, but some people take it to an extreme and I suspect these were the same people whose parents let them watch the same children’s show/movie over and over and over.
My understanding is that children rewatching things over and over is healthy for them, as it aids in things like language learning.
Anecdotally, in my own experience learning a second language, the online part of the course will play an audio or video clip, then play it again and have you answer questions on it, then do it again having you fill out missing words in a transcript, then play it again as you read the text along. I've found this extremely helpful to tune my ears.
Keep in mind the prize started before streaming did. You did not have the same data points they do today, as they were reading the thumbs up/down from DVDs shipping. Nor was the UI full of the same level of marketing as it is today, especially for all their original content, as none of that existed.
I think we all have personal preferences; for example:
> If I watched a movie and gave it a thumbs down, don't recommend it
I sometimes click thumbs down by accident, or upon rewatching change my opinion.
> If I watched a movie < 1 month ago, don't recommend it
I like to rewatch movies, sometimes more than once in a month.
> If I browsed over a movie 50 times, read the info, and still didn't play it, stop recommending it.
I have a terrible habit of browsing Netflix and watching trailers right before falling asleep, and I'm sure I've done this to the same titles over and over again.
> If I watched the last episode, remove the "new episodes" banner.
I'm not in front of Netflix right now; what happens if new episodes are added while you're watching the series?
The engine would be better not doing silly things just because of a "sometimes" fetish or bad faith.
It's also trivial to put all of the things you watched recently into their own subcategory in case you want to watch them again, which is in fact something that Netflix already does. It's called "Watch It Again". There's no reason to pollute recommendations for that.
> I sometimes click thumbs down by accident
The recommendation engine should be obeying your explicit actions, not trying to subvert them. Accidentally clicking thumbs down is an outlier action that is trivial for you to rectify on your own as soon as it happens.
> or upon rewatching change my opinion.
Intentionally rewatching a movie that you expressly disliked is an outlier position.
> I like to rewatch movies, sometimes more than once in a month.
Netflix already has a personal queue+favorites list called "My List" that you can add things to. If it has been less than a month since you last watched something, the reason you're watching it again so soon is because it's on your mind already and you don't need the recommendation engine for that.
Your reference is whatever crap Netflix currently uses. The algorithm they were using at the time of the prize was actually good to begin with. I think they may also have a smaller catalog of movies now. So that wouldn't help.
Yeah. I finally gave up on it last year not because of the catalogue, but because the turnaround got too slow. Individually renting movies to stream is better all-around.
It's still working out ok for us. We're doing the 2 DVDs at a time, and while I understand Netflix was always suspected of "rate-limiting" people who had too much churn, we must not be anywhere near that since turnaround is still just a few days. We're not impulse watchers so it's fine.
It's still tons cheaper than individual stream rentals, there's more device flexibility, a lot more choice, doesn't require blackbox DRM on the devices (they are probably going to force secure enclaves on linux for their DRM at some point once the kernel patches propagate).
All these rules determine what it shouldn't recommend. What rules would you use to determine what it should recommend?
Would you agree that these rules will quickly become unwieldy and a pain to maintain? And that these will be personalized to your taste but not to someone else's?
Wouldn't it be great if you didn't have to maintain those rules and if the system were tailored to every user? Congratulations, you've just realized you'd like to use ML/AI.
Based on the previous post's rules, for me it would be: Recommend anything else you have that's not excluded by those rules. And give me a "remove" like Amazon does and I'll be happy.
I don't actually want recommendations, I want a catalog with exclusions.
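A "catalog with exclusions" along the lines of the rules listed earlier in the thread could be sketched roughly as follows. This is only an illustration: `TitleState`, `visible_catalog` and the field names are invented here, the 30-day and 50-view thresholds are taken from the wording of those rules, and both are left as parameters so each user could tune them.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class TitleState:
    """Per-user state for one title (hypothetical structure)."""
    title: str
    thumbs_down: bool = False
    last_watched: Optional[date] = None
    views_without_play: int = 0   # times browsed over without pressing play
    removed: bool = False         # user pressed a hypothetical "remove" button


def visible_catalog(states, today, rewatch_gap_days=30, max_ignored_views=50):
    """Return the titles that survive every exclusion rule, in input order."""
    gap = timedelta(days=rewatch_gap_days)
    result = []
    for s in states:
        if s.removed or s.thumbs_down:
            continue  # explicit user actions always win
        if s.last_watched is not None and today - s.last_watched < gap:
            continue  # watched too recently
        if s.views_without_play >= max_ignored_views:
            continue  # browsed past many times and never played
        result.append(s.title)
    return result
```

Someone who likes to rewatch things quickly could call it with a smaller `rewatch_gap_days`; the point is that the rules are plain, visible parameters rather than something a model has to infer.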
Eventually, you will be presented a blank page with a bit of text "You have consumed all of the internet. There is no more for you." You can read into that however dark you want to take it.
- Do you really get recommendations for movies you already watched or downvoted? I'm not aware of that happening to me.
Isn't it the case that you use multiple different profiles within a single account?
I sometimes see that for shows we finished, yet people tend to rewatch episodes...
- Removing a movie based on "not played X times" would remove all popular movies for all users in a multiple of X steps.
>- If I watched a movie < 1 month ago, don't recommend it
yeah some movies might be more likely to be rewatchable quicker by a larger amount of the population, and some people might be more likely to be able to rewatch movies they like more quickly than a month - and those people might have preferences that indicate their liking to rewatch more often.
as I said
>yeah some movies might be more likely to be rewatchable quicker by a larger amount of the population,
>Wait [ ] months before recommending again a show I already watched.
does not handle that scenario, furthermore I am perhaps more pessimistic about user self knowledge than you are. I would probably make that box some high number but I am a sucker for rewatching some movies very often.
Netflix already gives you a way to add things to a personal list of things you want to watch soon and already has a category for things that you've already seen and enjoyed in case you want to watch them again. There's no need to pollute the general recommendations to scratch this itch.
Why would you need to be recommended a movie you already watched? You know about it and if you want to watch it again just search for it. Recommendations should be for discovery of new content.
Yes, I got annoyed, too, by the suggestions on Netflix.
However, what they do might actually work for them, i.e.
(1) The HN crowd is probably not representative of their audience as a whole
(2) What they do now might be RoE accretive for them, but not so great even for a wider section of their customers
I wonder if there's space for a proper, quality recommendation engine for Netflix. In my books, it would do exactly what you described here, plus provide a way for the user to input persistent constraints (e.g. "Don't show me movies from 'Horror' genre") and queries (e.g. "genre: sci-fi, after: 2010").
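As a rough illustration of what such constraints and queries might look like in practice, here is a small sketch. Everything in it is assumed for the example: the `key: value, key: value` query syntax is just read off the example above, and the movie records are plain dicts with hypothetical `genres` and `year` fields, not any real Netflix data format.

```python
def parse_query(query):
    """Parse e.g. 'genre: sci-fi, after: 2010' into {'genre': 'sci-fi', 'after': '2010'}."""
    filters = {}
    for part in query.split(","):
        if not part.strip():
            continue  # tolerate empty segments such as a trailing comma
        key, sep, value = part.partition(":")
        if not sep:
            raise ValueError(f"expected 'key: value', got {part!r}")
        filters[key.strip().lower()] = value.strip().lower()
    return filters


def matches(movie, filters, banned_genres=()):
    """Check one movie dict against a parsed query and the persistent bans."""
    genres = {g.lower() for g in movie["genres"]}
    banned = {g.lower() for g in banned_genres}
    if genres & banned:
        return False  # persistent constraint, applies to every query
    if "genre" in filters and filters["genre"] not in genres:
        return False
    if "after" in filters and movie["year"] <= int(filters["after"]):
        return False  # "after: 2010" is read here as strictly later than 2010
    return True
```

For example, `matches(movie, parse_query("genre: sci-fi, after: 2010"), banned_genres={"Horror"})` would keep a 2014 sci-fi film and drop a 2014 sci-fi horror film.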
How could it work with Netflix without their explicit support? For the movie database, I assume there's some data source somewhere that lists all the movies and series Netflix currently has in any given region. As for ingesting watching history, it could parse Content Interaction History exported from Netflix via their GDPR Subject Access Request flow. Sure, they have up to 30 days to process such a request, but I'd happily accept a recommendation system I have to manually update every month, over the disaster Netflix has been offering to its users.
The great thing with these transparent parameters is that it can be user-adjusted to people's preferences. For example, maybe don't recommend if watched < x months ago.
I learned about AI by participating in the challenge, and ended up co-founding an AI company as a result.
Netflix got more than 1 million dollars in free advertising from it, and are still getting brand value out of it today. They implemented some of the algorithms, and probably got a 10X ROI through retention alone.
As mentioned in the Quora answer, they were also able to recruit top talent - and that's much harder to put an ROI figure on.
Similar story here. The company I work at was founded when a couple of very bright people realised why the Netflix stuff won't generalise to many other areas, and sought to find something that did instead. So far, very successful!
I had the pleasure of working closely with Yehuda (one of the earlier prize winners) at Google where he works on TV and Movie recommendations (Search "what to watch" to see his team's work).
He's extremely intelligent and passionate about this space, and every time we spoke I felt like I was learning something new. You can listen to him give an in depth talk about the Netflix problem and solution here [1].
Do people feel that Netflix has a large enough catalog that its recommendation system really matters? The only useful feature it ever had for me was the ‘new this week’ category that seems to have been retired.
Really matters could be interpreted different ways. Expressed in startup-metric terms, it might be "% of users that watch something" or "time spent browsing before watching something." Any real improvement on such is probably important.
Think of it like Google's famous obsession with speed. Did returning search results 17ms faster really matter? It's hard to say for sure, but I suspect it did.
That said, I agree personally. I don't like Netflix' UI. I suspect you could hand code a browsing/ranking UI of similar value, from a casual user's perspective.
>> large enough catalog
I think this is a case where Netflix didn't end up where they expected to. I think they expected to have a vast catalogue... a "spotify of movies." It just didn't go that way.
You could also reverse the question. Does netflix have a big enough dataset to make a great recommendation system? I think this might be the more pertinent question. Google & FB have their vast ad-centric datasets. I suspect these could be used to make a recommendation engine that's a lot better.
They haven't really done this for youtube though. The priority is to match ads to users. For this, they're willing to push the envelope on how they use user data. For youtube recommendations, it doesn't seem that youtube gets access to much data from outside of youtube.
Their online selection was enormous ~2010 when they still had the DVD business, and crucially all of the movie studios still thought online was a fad and were happy to cut deals to stream their entire catalogs for pennies. Netflix was unbelievable back then and felt like it had every movie in existence. Over the years all of that back catalog has been clawed back and Netflix morphed into more of a showcase for their own content. But for a few golden years it was amazing to go in and like/dislike a bunch of stuff to see it just start recommending and streaming tons of classic films I always wanted to see.
As of last year when I finally quit, the DVD catalogue was still great, but service was so slow that it costs less to rent movies to stream individually.
Oh man this was huge. You have to remember, machine learning was nothing back then. It was on nobody's radar.
Then comes this flashy $1 million prize. Tons of universities had teams. So it really helped their recruitment.
It also likely contributed to the idea of creating Kaggle which has itself greatly contributed to data-science education by giving everyone an open forum in which to compete.
Then there were other significant projects around this time like ImageNet which became a competition too. That open dataset led to tons of research and applications.
I've always wondered about this. I first read about this challenge in "Programming Collective Intelligence,"[0] the O'Reilly book from 2007 that drove me to become a professional programmer. It starts like this:
> Netflix is an online DVD rental company that lets people choose movies to be sent to their homes, and makes recommendations based on the movies that customers have previously rented.
It was a different and exciting time back then! I never finished that book but hope to some day... :)
It's a good book! I read it before we started calling it "artificial intelligence" or "machine learning". It was just data mining back then. I think the book and the algorithms in it are still very relevant today.
I was a solver on a different team for the netflix challenge. Our team didn't win the grand prize.
I would have expected that a blog post would discuss how this was structured. Netflix contracted with innocentive.com, which is a website for solvers, and contracting to that website expanded Netflix's reach to a greater available pool of solvers. As far as I recall, all the allowed solvers for the netflix challenge _had_ to go through innocentive. I'm not sure if they would have been able to get the same level of improvement if they had not contracted with a set of potential solver teams like that.
The original challenge listing for Netflix is no longer listed at innocentive.com, but an industrious person may be able to find it on archive.org or somewhere similar.
I don't remember InnoCentive being involved with the Netflix Prize in any official capacity, but I do remember the PR storm they launched after the prize was awarded to make sure they were mentioned in news articles about it.
I signed up for the challenge, through innocentive, in July of 2008.
It's entirely possible that as a new solver at the time I fell for innocentive's PR. The netflix challenge was actually the first I ever signed-up to work.
Note, not just the original listing has been taken down, but also the data used in the challenge.
I believe that's because a few people in that set actually got de-anonymized. But certainly because of privacy concerns.
I think the data is still kicking around somewhere in torrent land. I also still have my own copy somewhere, I think.
I used the same dataset for a small project in my CS master. It was a really fun challenge, and it taught me a lot.
Most notably, it taught me that it was incredibly hard to make significant progress past the simplest, most naive approach. That approach was "Take the average rating a user gives, take the average rating a movie gets, multiply". (Ratings normalized to be between 0 and 1.)
Just using this method would give us 95% of the accuracy of our final method. I think I calculated, and compared to the prize winning result, our method got ~90% as accurate a result.
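For anyone curious, here is a minimal Python sketch of that baseline as I read the description. The function name `fit_baseline`, the 1-5 star range, and the global-mean fallback for unseen users or movies are my own assumptions, not details from the original project.

```python
from collections import defaultdict


def fit_baseline(ratings, lo=1.0, hi=5.0):
    """Fit the 'user mean times movie mean' baseline.

    ratings: list of (user, movie, stars) tuples with stars in [lo, hi].
    Returns a predict(user, movie) function that yields stars in [lo, hi].
    """
    user_sum, user_n = defaultdict(float), defaultdict(int)
    movie_sum, movie_n = defaultdict(float), defaultdict(int)
    total = 0.0
    for user, movie, stars in ratings:
        x = (stars - lo) / (hi - lo)  # normalize the rating to [0, 1]
        user_sum[user] += x
        user_n[user] += 1
        movie_sum[movie] += x
        movie_n[movie] += 1
        total += x
    global_mean = total / len(ratings)

    def predict(user, movie):
        # Assumption: fall back to the global mean for unseen users or movies.
        u = user_sum[user] / user_n[user] if user in user_n else global_mean
        m = movie_sum[movie] / movie_n[movie] if movie in movie_n else global_mean
        return lo + (hi - lo) * u * m  # map the product back to the star scale

    return predict
```

Note that multiplying two numbers in [0, 1] pulls predictions downward, so a real attempt would probably want to check that against a held-out set.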
This is an important point about a lot of sophisticated models; you're really fighting for a few percent improvement over simple approaches. Sometimes a basic linear regression will get you 70% there, while a trained neural net will bring that up to... 75%.
A few percent can make a difference, especially in competitive areas; but the biggest win is just getting something in where there was nothing before. It's a bit like optimizing code.
"ROI" is probably the wrong frame for this question. "Worth it" is better. "Better than X" might be better still, since it frames the question such that "X" needs to be defined and quantified.
The benefits of such a competition are pretty nebulous, and there's no way to convince an ardent skeptic. OTOH, many business decisions are like this and skepticism isn't a viable frame in many cases.
Netflix got visibility with investors and potential employees. Netflix's recommendation engine became famous, even though it doesn't seem impressive as a user. The exercise created a structured way of thinking about their recommendation algorithm. They cemented its importance. Even though they didn't implement the winning solution, they did get a useful benchmark. This was potentially very useful in further decisions in R&Ding the recommendation engine in-house.
> are open algorithmic contests useful and valuable?
Kaggle has been around for a long time now. If it works, I would expect them to be pumping out tons of interesting results from winners but I don't think I've heard many stories like that. It seems to be mostly useful for recruiting purposes?
Kaggle competitions rarely produce interesting algorithmic results.
But I highly encourage you to read the winners' solutions. They are full of clever data insight, augmentations, regularizations, feature engineering, and preprocessing and postprocessing tricks.
But above all, compared to the academic literature, it's shocking how much time and creativity they spend on validation. Maybe I'm reading the wrong papers, but the flashy new neural architectures rarely even mention their validation setup; Kaggle winners sometimes devote half of their explanation to it. It's part of their secret sauce.
(2) https://www.kaggle.com/c/ieee-fraud-detection/discussion/111.... Particularly how they reduced overfitting with adversarial validation. They trained a separate model to distinguish between train and test sets, and then dropped features that ranked highly in feature importance on that model. That's probably a well-known technique in some circles, but I had never seen anything like it before.
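To make the idea concrete, here is a rough, dependency-free Python sketch. It is not the winners' code: instead of training one model and reading its feature importances, it swaps in a simpler per-feature check, scoring each feature by how well it alone separates train rows from test rows (an AUC computed by pairwise comparison). The names `separation_auc` and `drifting_features` and the threshold value are hypothetical.

```python
def separation_auc(train_vals, test_vals):
    """AUC of 'is this row from the test set?' using one feature as the score.

    0.5 means the feature cannot tell train from test; values near 0 or 1
    mean the feature's distribution shifted between the two sets.
    """
    wins = 0.0
    for t in test_vals:
        for r in train_vals:
            if t > r:
                wins += 1.0
            elif t == r:
                wins += 0.5  # ties count as half a win
    return wins / (len(train_vals) * len(test_vals))


def drifting_features(train_rows, test_rows, threshold=0.2):
    """Return {feature: auc} for features whose AUC is far from 0.5.

    train_rows / test_rows: lists of dicts mapping feature name to a number.
    The flagged features are the candidates to drop before training.
    """
    flagged = {}
    for name in train_rows[0]:
        auc = separation_auc([row[name] for row in train_rows],
                             [row[name] for row in test_rows])
        if abs(auc - 0.5) > threshold:
            flagged[name] = auc
    return flagged
```

A time-like feature (for example a day counter that only increases) would typically be flagged, since test rows come after train rows. The pairwise loop is quadratic, so this is only meant for illustration on small samples.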
> But I highly encourage you to read the winners' solutions. They are full of clever data insight, augmentations, regularizations, feature engineering, and preprocessing and postprocessing tricks.
> But above all, compared to the academic literature, it's shocking how much time and creativity they spend on validation. Maybe I'm reading the wrong papers, but the flashy new neural architectures rarely even mention their validation setup; Kaggle winners sometimes devote half of their explanation to it
I agree, but in the end it is a competition, and the solution that scores the most is not always the solution that is "the most interesting" (or practical, or best in real world cases)
Though the details you mention are interesting, and can definitely apply to real-life solutions.
The author is still underselling the significance of the progress made during the first years IMO. The simple idea that is still behind most practical recommender systems (using gradient descent to do SVD to complete the rating matrix) was first described in 2006 by Simon Funk [1]. Koren, who ended up taking home a big part of the prize, recently wrote another paper about how that basic idea still outperforms most “AI” (deep neural) recommenders today [2].
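For readers who have not seen it, a minimal Python sketch in the spirit of that idea: learn a small vector per user and per movie by gradient descent on the known ratings only, then use dot products to fill in the missing cells. This is not Funk's actual code; as a simplification it updates all factors together, and the function names and hyperparameters (`n_factors`, `lr`, `reg`, `epochs`) are arbitrary assumptions.

```python
import math
import random


def train_mf(ratings, n_factors=2, lr=0.02, reg=0.02, epochs=500, seed=0):
    """Factorize a sparse rating matrix with stochastic gradient descent.

    ratings: list of (user, item, rating) tuples (only the known cells).
    Returns predict(user, item), which also works for unrated pairs.
    """
    rng = random.Random(seed)
    mean = sum(r for _, _, r in ratings) / len(ratings)
    # Small random starting vectors for every user (P) and item (Q).
    P = {u: [rng.uniform(-0.1, 0.1) for _ in range(n_factors)]
         for u in sorted({u for u, _, _ in ratings})}
    Q = {i: [rng.uniform(-0.1, 0.1) for _ in range(n_factors)]
         for i in sorted({i for _, i, _ in ratings})}

    def predict(u, i):
        if u not in P or i not in Q:
            return mean  # assumption: unknown users/items get the global mean
        return mean + sum(p * q for p, q in zip(P[u], Q[i]))

    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(u, i)
            for k in range(n_factors):
                pu, qi = P[u][k], Q[i][k]
                # Gradient step on squared error with L2 regularization.
                P[u][k] += lr * (err * qi - reg * pu)
                Q[i][k] += lr * (err * pu - reg * qi)
    return predict


def rmse(predict, ratings):
    """Root mean squared error of predict over the given ratings."""
    return math.sqrt(
        sum((r - predict(u, i)) ** 2 for u, i, r in ratings) / len(ratings)
    )
```

Comparing `rmse` on held-out ratings against a constant global-mean predictor is the quickest way to see whether the factors are learning anything.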
Back in 2010 my wife was doing her master's and Netflix recruited MBA students to help them with their business model for streaming. The task was to understand if DVD made sense and if streaming was too risky. It panned out pretty well and the rest is history
I think the discussion misses the most important part:
The goal of the Netflix prize wasn't to come up with the best algorithm - it was to make the Netflix brand exciting and legitimate to engineers. At the time, Netflix wasn't super high-tech and I'm sure it was hard for them to get the top talent they needed. It seems silly in retrospect now, but I'm certain the reason this was approved was because they wanted the free advertising this would provide within graduate classes and academia in general.
> make the Netflix brand exciting and legitimate to engineers
As a serious question, why do people include Netflix in the acronym FAANG, which I see on HN all the time? Is there something special about Netflix? Netflix is around the #14 tech company, so it's strange to see Netflix in there instead of Microsoft. Or is the use of FAANG divorced from its literal meaning?
Jim Cramer and Bob Lang coined FANG back in 2013 based on these criteria:
"Put money to work in the companies that represent the future," he said. "Put money to work in companies that are totally dominant in their markets, and put money to work in stocks that have serious momentum."
It's probably included because of the sky-high salaries they offer since FAANG is typically an acronym used to refer to top software companies to work for. From what I've heard from friends, Microsoft typically pays the least out of all the companies that make up the acronym and their technologies are also seen as less trendy than the other companies listed.
(Note this article is from Jan of 2008):
> In today's trading, all four stocks are down steeply: Apple: Down $26.65, or 17.1%, to $128.99. Amazon: Down $7.56, or 9.6%, to $70.92. Google: Down $48.15, or 8.2%, to $536.20. Research In Motion: Down $8.47, or 9.4%, to $81.61.
Even writing off RIMM to zero would give you a healthy return through 2021.
Up until the beginning of 2020, NFLX was the highest growing stock of the decade. (Dethroned by TSLA) In terms of percentage growth I believe it still outperforms every other component of FAANG.
it was originally just FANG with one A. and it was just partially coined because it’s a catchy term. it’s also pretty dated at this point. and if you took the N out it would not be appropriate
I will admit that it was interesting to see what algorithms were poised to be cutting edge in media recommendation. The result was rather disappointing to me.
Netflix STILL isn't that exciting from anything but a compensation standpoint. The problems at netflix are about programming, while the technical challenges are droll at best.
IMO the recommendations are no good because they fundamentally take the wrong approach — rather than ask the user what they like, they try to guess what you like based on usage (which really doesn't correlate well — I watch a lot of garbage because I can’t find things I like, and I don’t have anything better to do.)
And they don’t ask because users don’t provide useful answers.
But users don’t provide useful answers, because rating things doesn’t do anyone any good.
I’m of the belief that if you can make ratings useful (catalogue all movies, including not on Netflix; give useful ways to view/update your lists; have direct relationships to recommendations), you would have dramatically better recommendations for dramatically less effort/complexity.
I don’t think you’ll ever get to “good” recommendations based on usage. The data is fundamentally garbage.
Of course, the other side is that Netflix isn’t interested in recommending things I like; their goal is to recommend things I’ll put up with. They just need 1 show worth watching and subscribing for every now and then, and N shows to keep me mildly amused to stop me from dropping it between good ones
The recommendation system, historically (i.e., in the long-long ago of spinning disks), was insanely good. But then Netflix moved to streaming and, as a consequence, its own--and generally less good--content.
By analogy, Netflix went from being a sci-fi future of having and being able to recommend on the basis of _everything_, to having a handful of good offerings and a huge amount of b-movie-level offerings.
My gut sense is management tried to paper over this "content loss problem" by making changes:
1) to the recommendation system to push Netflix content[1]; and
2) making changes to the UI to force users to be more reliant on the recommendation system.
I suspect these changes have, generally speaking, made user-consumption metrics look decent--in my mind the core of almost all Netflix's post-streaming decisions. But, as you suggest, it is all papering over a problem of user dissatisfaction: Netflix recommends you mediocre content, and you eventually give up and watch it--and then feel meh.
[1] I can imagine Netflix executives being unwilling to report that the content Netflix had paid mightily for scored low on Netflix's own recommendation algorithm. Philosophically, Netflix went from being, essentially, content agnostic (e.g., it just bought more of X DVD), to having incentives to see particular content (e.g., its own) rank highly.
In Mark Randolph's book he talks about how Netflix would recommend content (DVDs at the time) to strategically fit Netflix's needs. For example, if they didn't have a copy of a movie ready to send out, Netflix wouldn't recommend it.
Nowadays, I'm certain Netflix recommends content to feature either "no cost" (owned) or the content with the lowest licensing fee. I don't believe for a second they don't have the data to suggest the best movie. They simply don't want to suggest the best movie. As you said, their goal (now) isn't to suggest the content the user is likely to enjoy most, it's to suggest content the user will tolerate. And that's exactly why they shifted away from a 5 star rating system, to a thumbs up/down approach... even if you didn't love a movie or show, you're still likely to give it a thumbs up unless it was totally awful.
If you have an Audible subscription, you may have noticed the same behaviour there.
Large numbers of books labelled as 'free with your membership', which likely only cost Amazon the price of delivering the files. Which makes sense, because once I have paid for my credit the worst outcome financially is that I use it.
Who's more likely to keep renewing their subscription? The person who uses Netflix to watch a ton of trash that they think is "just ok," or the person who merely watches 1 or 2 things per month that they actually enjoy?
I'm certain Netflix ran the numbers, and determined that a high-usage customer is the most valuable.
It's interesting how many corporations don't actually "run the numbers" on what we think are important issues. Basically, internal focus and what the rest of the world cares about are disjointed and corps are often blind to obvious aspects. This can be improved by strong internal diversity, but Netflix doesn't look like a bastion of that (yet?)
On "just ok" vs stuff that is actually enjoyable: "just ok" is fine as long as there is no better competitor for attention (e.g. a new smartphone game that takes over the world). If they manage to land on the "actually enjoyable" end of the scale instead, there is a better chance that people keep their subscription, sometimes even if they end up not viewing anything that month for whatever reason.
Former Netflix employee (2010-2013) here and, if there's one thing Netflix does as well or better than anyone in the industry, it's running the numbers. In particular, in those days, we had two key metrics that were strongly correlated and we would attempt to drive up: Streaming hours and account retention. Higher usage was strongly correlated with account retention to the point that they were the core of nearly every experiment we did.
I think these are reasonable numbers to focus on, but other relevant variables could be just too hard to quantify or set as goals...
For instance, how attractive Netflix's catalog is to new users/markets can be checked in regular polls, but it would be way more difficult to follow with fine granularity, far less precise, and ultimately a harder-to-handle number than just retention or number of new accounts.
This means Netflix could see decent growth on its numbers, good retention and a steady flow of new accounts created, while struggling to reach new markets where competitors are doing great.
This is an extreme example, but Blackberry typically had very good user retention and users loved their devices. Looking only at these numbers, they were doing fine for a long time (which is nothing to sneeze at)
Wouldn't this be a short term vs long term optimization thing? In the short term "just ok" wins. In the long term, users might get bored and new users are less likely to join. Or at least that's my gut feeling, I have nothing to back it up.
Not necessarily, the power of habit can be very strong.
The users who only watch a couple of things are the ones who are more likely to “get bored”, because in any given month there is a higher chance that there won't be any single thing they'd want to watch. Whereas someone who just does it regularly (say every day after work while eating dinner or w/e) is more likely to keep that habit.
Netflix doesn't run ads for anything but their own content, right? It would seem to me their best customer is one who pays their monthly sub and then never uses the service.
On the other hand, the only platform that provides me with good recommendations to watch things seems to be TikTok. They are not asking me to rate individual videos, and so on. Clearly, there is a way to do recommendations without "ratings".
I think you've hit the nail on the head here. There's no good way to express how I feel when I'm watching a show or at the end of the show. There's no "Holy shit, this is amazing" vs "This is decent" etc - where sentiment is clearly attached to the rating. A 5 star or 3 star rating scale alone isn't quite good enough..
I don’t think that’s correct. 1-5 stars is sufficient. The problem is that you need reason to continuously update the values as your preferences update over time (what was once a 5-star is now a 4-star, because that last movie I saw was phenomenal)
What you need is sufficient reason to do so — the values need to actually be useful to you to make updating an act of sanity (unlike now, where it’s purely an act of futility). Feeding the algorithm is not itself sufficient (though necessary, and currently ineffective). The ideal recommendation system would encourage rating entry as a ritual act, and more importantly, rating updates an act that derives real value.
Only then will you have good data, and from good data, a dumb algorithm will suffice.
The problem is data entry for the recommendation algorithm is insufficient incentive to constantly use it (thereby providing “truthful”, or highly-correlated, user ratings). The ratings themselves must be directly beneficial to the user, so that the user provides truthful data for their own benefit, and secondarily for the recommendation algorithm.
That is, I’d like to catalog my own list of watched movies, and their relative ratings, so that I can have a useful system (or a direct relationship to recommendations — eg More Like This), from which Netflix can scrape for their algorithms.
That is, if I’m not honest to myself, the ratings themselves will not be honest, and not properly reflect my taste.
Specifically, there must be reason to provide negative ratings in addition to positive, to capture user taste.
Or, like, maybe just let me turn off the auto recommendations if I want to? It makes me actively uncomfortable to think that everything I watch on Netflix is going to change what I see in the future!
MAL was actually my source for thinking on this subject. In combination with the book Otaku: Database Animals[0] (anime fans catalogue the hell out of things, and this extends to tracking their anime and ratings) I realized you should be able to put together some very strong recommendations by scraping the MAL dataset — because the data should be fairly honest.
And then the realization that really the best recommendation isn’t to forge a new customized list altogether — it’s to simply find the most similar users and recommend items from their list. (MAL has/had a cosine similarity function for this, but no way to search because it’s basically an n^2 algorithm on 4M users; apparently they offered it at some point, and quickly found it untenable. That was what really kicked me off)
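A minimal Python sketch of that "find the most similar users and borrow from their lists" idea. It says nothing about how MAL actually implemented it; the profile layout (a dict of title to score per user), treating unrated titles as 0, and the function names are all assumptions for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity of two {title: score} dicts (unrated counts as 0)."""
    shared = set(a) & set(b)
    dot = sum(a[t] * b[t] for t in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(target, profiles, k=5):
    """Return the k users most similar to target as (user, similarity) pairs.

    This is one linear scan; doing it for every user is the n^2 cost
    mentioned above.
    """
    scores = [(other, cosine(profiles[target], profiles[other]))
              for other in profiles if other != target]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:k]


def recommend_from_neighbors(target, profiles, k=5):
    """List titles the nearest users rated that target has not, best first."""
    seen = set(profiles[target])
    candidates = {}
    for other, _sim in most_similar(target, profiles, k):
        for title, score in profiles[other].items():
            if title not in seen:
                candidates[title] = max(score, candidates.get(title, 0))
    return sorted(candidates, key=candidates.get, reverse=True)
```

At the scale of millions of users, the brute-force scan would need to be replaced by some approximate nearest-neighbor index, which is outside the scope of this sketch.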
And then the realization that if I found users with similar taste, then shouldn’t they be friends? So then it becomes a MAL friendship algorithm..
Did a bunch of research on recommendation algorithms and weighting strategies, scraped most of the MAL users, stored it in a database, and then promptly procrastinated on actually implementing the algorithms. Been sitting on that for like 3 years now :|
You may watch garbage (revealed preferences) but that is more important to them in terms of keeping your attention than your wish list (stated preference).
That’s my point. The algorithm goal is not to find what I’d like, but rather what I’d put up with. They only need to find what I’d like every so often, to keep me on the platform.
It’s correct from Netflix’s perspective, but not from mine.
Yeah, I'm sure being responsible for 10% of the internet's traffic is trivial on a technical level, together with all the codec and video encoding engineering they are doing.
> I'm sure being responsible for 10% of the internet's traffic is trivial on a technical level
> the codec and video encoding engineering they are doing
I am confident that is not what the vast majority are working on. Those aren't constrained problems that a single developer would be responsible for. Making unrealistic statements is more than a little disingenuous.
It was my understanding that there was significant business value in improving the accuracy of personalized movie recommendations. Recall that this was at a time where the majority of the business was DVDs sent via mail. A poor choice of movie created significant risk to customer satisfaction and hence retention.
IIRC they announced pretty quickly that they wouldn't be using the winning solution due to complexity for little gain. A common thought at the time was that it was worth it because it showed how far they would have to go to make their algorithm better and that it wasn't worth it.
This prize was the biggest deal in the tech world back in ~2006-2010. They made way more than $1 million back in advertising and engineering recruitment alone, even if they didn't use the winning algorithm at all.
I’d enjoy seeing a more creative approach to recommendations for something as big as Netflix.
A few suggestions:
1. Review channels by genre
2. Trailer TV - let me leave a “comedy” trailer channel running that shows the trailer and movie rating and details at the bottom, let me easily skip to the next trailer (or let it play out)
I think Netflix doesn’t have enough content to do this. They have their own originals and just a thin layer of other stuff.
So I think any truly personalized content channel would get exhausted quickly.
What I’d like to have is just a channel of curated or semi curated movie content that I can leave running or forward through to watch.
I recently stayed in a hotel with 6 channels of HBO. It’s kind of refreshing to have “HBO Comedy” with random stuff like Beverly Hills Cop and Billy Madison on at 2pm.
Netflix doesn’t have enough content to do this, so they keep recommending the same crap originals to me over and over, knowing that I don’t watch them.
Anything can beat their current algorithm of "push people towards our originals, preferences be damned".
EDIT: I should add in terms of customer satisfaction, not revenue. I am sure forcing their originals down people's throats is great for their revenues.
Well yeah, that's what is in Netflix's interest, but there's a reason many companies succeeded by focusing on improving the user's experience. I guess Netflix has a different approach.
That’s some great guerrilla marketing (or a business hack). At best, they actually get a better (production) algorithm. At worst, they get thought leadership among the ML/AI community and insane publicity still paying dividends today. Reed Hastings is a business savant.
I haven't been a Netflix customer in some years, but at the time their algorithm was pretty obvious: recommend only Netflix-produced content. It's the elephant in the room here.
Recommendation algorithms are a massive sales driver if you depend on the long tail. This was true back in the day, when long-term subscriber retention depended almost entirely on great recommendations. Unfortunately, Netflix figured out that they need not depend on the long tail to retain their market value. Instead, the new idea is the same as the HBO model. Now the recommendations are basically garbage.
Netflix suggestions used to be amazing, but sometime after their contest they totally changed how it works and they became useless. This article maybe hints at why with a short “predicting consumption was more important than ratings anyway”
why? what does he mean? netflix had a killer advantage with the old rating system and then dumped it why?
There's a subtle but valuable (for Netflix) difference between "you watched x so you'll probably like y" vs "if we can get you to start watching this show you'll probably pay another month's subscription".
Only Netflix content is worse than their algorithms. I've completely stopped surfing the platform and stick to my own watchlist or search for specific movies based on the recommendation lists out there.
The article doesn’t emphasize the impact made by new technologies invented in pursuit of this prize. For example, this contest was a motivating factor for the creators of Apache Spark.
What’s odd is that the person responding in Quora who’s speaking as if he has authority on this matter started at Netflix in 2011, 2 years after the contest ended.
The contest started in 2006 and was awarded in 2009.
If he started two years later and there was not a trace of the Prize work at the company, that would be an indicator that the competition was not important. If he started and could still see knock-on effects from the competition, that's an indicator that it was important.
Plus, he didn't just start at Netflix. He "took over the small team that was working and maintaining the rating prediction algorithm that included the first year Progress Prize solution."
Yeah, that sounds like he has some authority on the matter.
Could it not be something more simple like, Netflix didn’t originally have profiles. So my child watching kid shows and me watching action shows were all feeding into the same recommendation system resulting in subpar results. That and they originally had a 5 star rating system which they then dropped.
It's very possible Netflix realized they needed to course correct the UX and as a result the winner's algorithm was solving for a problem that no longer applied because it was using assumptions (rating system & no existence of different profiles) that were no longer relevant.
The 2007 winner was still in production in 2011. I'm not sure that qualifies as "productionizing" as that word hints of 1st pass work, but he certainly has production experience with it.
"Research/Engineering Director at Netflix (2011-2014)" would probably know a thing or two about how useful the contest was. (He also might have worked there before that time in a different role.)
Quora's culture is odd in general. Writing "authoritatively" (even if they're just LARPing as an expert) is basically why people go there in the first place. It's not uncommon to read something egregiously inaccurate written as indisputable fact. Good examples can be found in anything to do with history.
TLDR: Very useful. It's true they didn't use the winning algorithm, because it was only a small improvement and they were already moving to streaming, where predicting consumption matters more than predicting ratings. However, they did put an earlier submission into production. Moreover, the competition was a powerful recruiting tool.
- If I watched a movie and gave it a thumbs down, don't recommend it
- If I watched a movie < 1 month ago, don't recommend it
- If I browsed over a movie 50 times, read the info, and still didn't play it, stop recommending it.
- If I watched the last episode, remove the "new episodes" banner.
WTF
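The rules in that list are simple enough to write down as a post-filter on whatever the ranking model returns. A minimal Python sketch, where the `TitleHistory` fields, the 30-day window, and the function names are hypothetical and say nothing about how Netflix actually stores viewing history:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class TitleHistory:
    """What one profile has done with one title (hypothetical fields)."""
    thumbs_down: bool = False
    last_watched: Optional[datetime] = None  # None means never played
    browse_count: int = 0  # times the info page was opened


def should_recommend(history, now, recent=timedelta(days=30), max_browses=50):
    """Apply the first three rules from the list above."""
    if history.thumbs_down:
        return False  # rated thumbs down
    if history.last_watched is not None and now - history.last_watched < recent:
        return False  # watched less than about a month ago
    if history.last_watched is None and history.browse_count >= max_browses:
        return False  # browsed many times but never played
    return True


def show_new_episodes_banner(last_watched_episode, latest_episode):
    """Fourth rule: no banner once the latest episode has been watched.

    Episodes are assumed to be numbered sequentially across seasons.
    """
    return last_watched_episode < latest_episode
```

Usage would be something like `[t for t in ranked if should_recommend(history[t], datetime.now())]`, with `ranked` and `history` standing in for whatever the real system provides.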