
This is certainly an interesting problem, but without access to the data I don't think I could really approach it.

Perhaps I am alone in saying this, but I think data mining is interesting while web crawling is boring. Could somebody make the data available so that we don't have to write a crawler? Or is this part of the challenge?

I think this is a classic example of unsupervised learning, for which I would generally use a system like Fuzzy ART. I think that might perform better than a naive Bayesian text classifier, though I can't be sure until I try it out.
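
For anyone unfamiliar with Fuzzy ART, here is a rough sketch of the clustering loop in Python, not a tuned implementation. It assumes each document has already been turned into a feature vector with values in [0, 1] (for example, normalized term frequencies). The class name `FuzzyART`, the parameter defaults, and the toy vectors are illustrative choices of mine and have nothing to do with the actual challenge data.

```python
def complement_code(a):
    """Append 1 - x for every feature so the input norm stays constant."""
    return list(a) + [1.0 - x for x in a]


def fuzzy_and(u, v):
    """Component-wise minimum, the fuzzy AND used by Fuzzy ART."""
    return [min(x, y) for x, y in zip(u, v)]


class FuzzyART:
    """Hypothetical minimal Fuzzy ART clusterer (illustrative only)."""

    def __init__(self, rho=0.75, alpha=0.001, beta=1.0):
        self.rho = rho      # vigilance: higher means more, tighter clusters
        self.alpha = alpha  # choice parameter, a small positive constant
        self.beta = beta    # learning rate; 1.0 is "fast learning"
        self.weights = []   # one weight vector per committed category

    def train_one(self, a):
        """Present one feature vector and return its category index."""
        i = complement_code(a)
        norm_i = sum(i)

        def choice(j):
            w = self.weights[j]
            return sum(fuzzy_and(i, w)) / (self.alpha + sum(w))

        # Try existing categories from highest to lowest choice value.
        for j in sorted(range(len(self.weights)), key=choice, reverse=True):
            w = self.weights[j]
            match = fuzzy_and(i, w)
            # Vigilance test: is this category a close enough match?
            if sum(match) / norm_i >= self.rho:
                self.weights[j] = [
                    self.beta * m + (1.0 - self.beta) * wk
                    for m, wk in zip(match, w)
                ]
                return j

        # No category passed the vigilance test, so commit a new one.
        self.weights.append(i)
        return len(self.weights) - 1


# Toy data: two obvious groups of two-dimensional vectors.
vectors = [[0.1, 0.1], [0.12, 0.09], [0.9, 0.9], [0.88, 0.92]]
model = FuzzyART(rho=0.75)
labels = [model.train_one(v) for v in vectors]
```

The vigilance parameter `rho` controls how many clusters you end up with, so on real text it would need tuning, and no labels are required at any point, which is the main difference from a naive Bayes classifier.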




If anyone wants to use 80legs for this challenge, just drop us a line at http://www.80legs.com/contact.html. We might be able to set up some custom free plans.



