Data for 2009 + 2010 March Madness. Can your algorithm predict the tourney? (smellthedata.com)
20 points by danger on March 10, 2010 | 7 comments



We had a (small) Hacker News fantasy league for March Madness last year -- the only rule was that your picks had to be by some algorithm which you shared after everyone made their picks. I'd be happy to set up one for this year if there's enough interest.


That sounds fun. I think the rule should be a bit more hardcore, though: the predictions have to come from raw data. That is, no meta-algorithms that use information about seeding or expert predictions, but if somebody wanted to gather, say, play-by-play data and use that, it'd be OK.


Ok, I've created a group called "HN" on Yahoo! Sports.

Here's the link: http://y.ahoo.it/mVPMVA8X


How is one supposed to run any sort of machine learning algorithm with only two seasons of data? I could understand throwing the stats from the last 15-20 seasons into Weka and seeing what it said about 2010, but seriously, how useful is only two seasons' worth of data going to be?


The data there has the scores from ~5000 games played over the course of each season, and the model he links to also seems quite reasonable to me: http://blog.smellthedata.com/2009/03/data-driven-march-madne...

Don't think of it as two data points. Think of it as two data sets.
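To make that concrete, here is a minimal sketch of fitting team ratings from nothing but game scores. It is not the model from the linked post; it is a plain squared-error fit on score margins, and the tuple format, function names, and toy scores are all made up for illustration.

```python
from collections import defaultdict

def fit_ratings(games, epochs=200, lr=0.01):
    """Fit one rating per team so rating differences approximate score margins.

    games: list of (team_a, team_b, score_a, score_b) tuples.
    This tuple layout is an assumption, not the format of the posted data.
    """
    ratings = defaultdict(float)  # every team starts at 0.0
    for _ in range(epochs):
        for team_a, team_b, score_a, score_b in games:
            margin = score_a - score_b
            error = margin - (ratings[team_a] - ratings[team_b])
            # Gradient step on squared error: move both ratings
            # so their difference gets closer to the observed margin.
            ratings[team_a] += lr * error
            ratings[team_b] -= lr * error
    return dict(ratings)

def predict_winner(ratings, team_a, team_b):
    """Pick whichever team has the higher fitted rating (unseen teams count as 0)."""
    if ratings.get(team_a, 0.0) >= ratings.get(team_b, 0.0):
        return team_a
    return team_b

# Toy games, invented for illustration only.
games = [
    ("A", "B", 80, 70),
    ("B", "C", 75, 65),
    ("A", "C", 90, 68),
]
ratings = fit_ratings(games)
print(predict_winner(ratings, "A", "C"))
```

With roughly 5000 games in a season, every game is one training example for a fit like this, which is why a single season is a data set rather than a data point.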


And being college teams, their ratings can change drastically over the course of more than a year or two...


Great idea! Looking forward to seeing your predictions.



