Hacker News
Ask HN: Grad Students: How much of your time is spent building datasets?
4 points by agibsonccc on Sept 11, 2013 | 1 comment
I'm running a pilot project with faculty at my former university as the client. I've been told that building datasets is a huge drain on man-hours.

Disregarding data that comes from sensors and the like, how much time is spent collecting data from other sources?

My current test case is patent applications. I want to get out of the echo chamber a bit and get some external feedback.

I know how to automate data collection for cases like that, and I'm wondering what other pain points might be out there.

Thanks!



I'd appreciate any feedback. I know dataset building can be a pain that would be best left to software (accuracy is obviously a huge factor here, but done right, this could be huge).



