Hacker News

I used the same dataset for a small project in my CS master's. It was a really fun challenge, and it taught me a lot.

Most notably, it taught me that it was incredibly hard to make significant progress past the simplest, most naive approach. That approach was "Take the average rating a user gives, take the average rating a movie gets, multiply". (Ratings normalized to be between 0 and 1.)
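
For anyone who wants to try it, here is a rough sketch of that baseline in Python. It is not our original project code. The 1-5 star scale, the function names, and the fallback to the global mean for users or movies with no ratings are all my assumptions for illustration.

```python
from collections import defaultdict


def normalize(stars, lo=1, hi=5):
    """Map a star rating in [lo, hi] onto [0, 1] (the 1-5 scale is an assumption)."""
    return (stars - lo) / (hi - lo)


def fit_means(ratings):
    """Compute per-user, per-movie, and global mean ratings.

    ratings: iterable of (user, movie, rating) with rating already in [0, 1].
    """
    user_sum, user_n = defaultdict(float), defaultdict(int)
    movie_sum, movie_n = defaultdict(float), defaultdict(int)
    total, count = 0.0, 0
    for user, movie, r in ratings:
        user_sum[user] += r
        user_n[user] += 1
        movie_sum[movie] += r
        movie_n[movie] += 1
        total += r
        count += 1
    user_mean = {u: user_sum[u] / user_n[u] for u in user_sum}
    movie_mean = {m: movie_sum[m] / movie_n[m] for m in movie_sum}
    global_mean = total / count if count else 0.0
    return user_mean, movie_mean, global_mean


def predict(user, movie, user_mean, movie_mean, global_mean):
    """Naive prediction: user's average times movie's average.

    Unseen users or movies fall back to the global mean (my assumption,
    the approach described above does not say how to handle them).
    """
    return user_mean.get(user, global_mean) * movie_mean.get(movie, global_mean)
```

Usage would be something like `u, m, g = fit_means(data)` followed by `predict("alice", "some_movie", u, m, g)`. The prediction comes out in the 0 to 1 range, so it has to be mapped back to stars before comparing against raw ratings.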

Just using this method would give us 95% of the accuracy of our final method. I think I also calculated that, compared to the prize-winning result, our method was ~90% as accurate.




This is an important point about a lot of sophisticated models: you're really fighting for a few percent improvement over simple approaches. Sometimes a basic linear regression will get you 70% of the way there, while a trained neural net will bring that up to... 75%.

A few percent can make a difference, especially in competitive areas, but the biggest win is just getting something in place where there was nothing before. It's a bit like optimizing code.



