Some Datasets Available on the Web (datawrangling.com)
45 points by Anon84 on Feb 12, 2009 | 6 comments


Thanks to whoever submitted this. It is an old post, but I just updated it this week with 230 new dataset links I collected in the last year. The page has about 400 datasets listed now. It also includes a tagged JSON version for groups like infochimps who are working to categorize open data or scrape it from the web.


Peter, any chance you could organize them, or tag them in some way? Otherwise, there is no way to know what any data set is until you click the link.


Yes, I'll take a pass at sorting these into groups this afternoon or at least add more descriptive information. Some of the original link titles are a bit mysterious, like this one: "Voter registration data; or, HERE IS YOUR HOPE, YOU FOOLS! « The Edge of the American West"

I'm going to toss the ball into infochimps' court for any deeper organization of the data, since they are already set up to handle it.


I've added static tags to each link, which should help a bit. If I can find a way to quickly cluster them together into logical groups, I'll do that as well.


this is awesome. thank you.


Here are some other dataset aggregators:

    http://theinfo.org/
    http://infochimps.org/datasets
    http://ckan.org [Comprehensive Knowledge Archive Network]
    http://www.trustlet.org/wiki/Repositories_of_datasets
    http://www.daniel-lemire.com/blog/data-for-data-mining/
    http://www.quantlet.org/mdbase/
    http://datamob.org/
    http://freebase.com/
    http://infochimp.info/ics/data/ripd/www-personal.umich.edu/~mejn/netdata/
    http://infochimp.info/ics/data/fixd/
    http://infochimp.info/ics/data/pkgd/
    http://infochimp.info/ics/data/rawd/
    http://www.archive-it.org/public/all_collections



