
You can always get around this by throttling your web crawler. It will take a much longer time, but at least you'll be able to read HN in the meantime.
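For what it's worth, throttling can be as simple as enforcing a minimum gap between requests. A minimal sketch, assuming a single-threaded crawler; the `Throttle` class and its parameters are made up for illustration and the right interval depends entirely on the site:

```python
import time


class Throttle:
    """Enforce a minimum delay between consecutive requests (hypothetical helper)."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval  # seconds between requests (assumed value)
        self._clock = clock               # injectable so the logic can be tested
        self._sleep = sleep
        self._last = None                 # time of the previous request, if any

    def wait(self):
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Call `throttle.wait()` right before each fetch. The first call returns immediately; later calls sleep only for whatever part of the interval has not already elapsed.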



The tricky part is knowing what rate to settle on without getting permanently banned. I built an Android Market crawler two summers ago, and luckily Google only temp-bans (in my experience), so that might be an easier project to try this on without that risk.
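One way to feel out a safe rate is to adapt the delay to what the server tells you. A rough sketch, assuming the site signals rate limiting with HTTP 429 or 503 (many do, but that is an assumption, not something known about any particular site); `next_delay`, `base`, and `cap` are hypothetical names and values:

```python
def next_delay(current, status, base=1.0, cap=300.0):
    """Return the delay (seconds) to use before the next request.

    Doubles the delay when the server appears to be rate limiting,
    and eases back toward `base` after a normal response.
    """
    if status in (429, 503):          # assumed "slow down" signals
        return min(cap, max(base, current) * 2)
    return max(base, current * 0.9)   # recover slowly after success
```

Backing off fast and recovering slowly keeps the crawler from bouncing right back into the limit. It does nothing for sites that ban silently, so it is a starting point at best.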


Respecting robots.txt is probably the best plan.
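Python's standard library can do the robots.txt check for you. A small sketch using `urllib.robotparser`; the rules and the `example-crawler` user agent below are invented sample data, not any real site's file:

```python
from urllib.robotparser import RobotFileParser

# Made-up robots.txt contents, for illustration only.
robots_txt = """\
User-agent: *
Disallow: /private/
Crawl-delay: 30
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# can_fetch() answers whether a given user agent may request a URL.
allowed = parser.can_fetch("example-crawler", "https://example.com/item?id=1")
blocked = parser.can_fetch("example-crawler", "https://example.com/private/x")

# crawl_delay() returns the Crawl-delay value in seconds, or None if absent.
delay = parser.crawl_delay("example-crawler")
```

In a real crawler you would point the parser at the site's own file with `set_url()` and `read()` instead of parsing a string, and use the reported crawl delay (when there is one) as your request interval.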


Use disposable IPs.



