I used the same dataset for a small project during my CS master's. It was a really fun challenge, and it taught me a lot.
Most notably, it taught me that it was incredibly hard to make significant progress past the simplest, most naive approach. That approach was "take the average rating a user gives, take the average rating a movie gets, multiply" (with ratings normalized to be between 0 and 1).
This method alone gave us 95% of the accuracy of our final method. I think when I calculated it, compared to the prize-winning result, our method was about 90% as accurate.
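For the curious, here's a minimal sketch of that kind of baseline (not our actual project code; the data layout and a 5-point rating scale are assumptions):

    from collections import defaultdict

    def baseline_predictor(ratings, max_rating=5.0):
        """ratings: iterable of (user_id, movie_id, rating) tuples."""
        user_sum, user_cnt = defaultdict(float), defaultdict(int)
        movie_sum, movie_cnt = defaultdict(float), defaultdict(int)
        for user, movie, r in ratings:
            r /= max_rating                    # normalize to [0, 1]
            user_sum[user] += r;   user_cnt[user] += 1
            movie_sum[movie] += r; movie_cnt[movie] += 1

        user_avg = {u: user_sum[u] / user_cnt[u] for u in user_sum}
        movie_avg = {m: movie_sum[m] / movie_cnt[m] for m in movie_sum}

        def predict(user, movie):
            # product of the two averages, scaled back to the original range
            return user_avg[user] * movie_avg[movie] * max_rating

        return predict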
This is an important point about a lot of sophisticated models: you're really fighting for a few percent improvement over simple approaches. Sometimes a basic linear regression will get you 70% of the way there, while a trained neural net brings that up to... 75%.
A few percent can make a difference, especially in competitive areas, but the biggest win is just getting something in where there was nothing before. It's a bit like optimizing code.