
For the specific use-case of "badly written scrapers", this might be reasonable, but usually by the time engineering needs to care about scrapers, other people at the company are involved and view it as a service-theft issue, i.e., "Why waste time and money forcing people to scrape fairly when we can just ban all scrapers?"

Not to mention, actually malicious traffic will find whatever non-Sybil criterion you use to enforce rate limits and work around it. "Enforce rate limits per User-Agent?" I'm now 10,000 different applications. "Enforce rate limits per IP address?" I'm now 10,000 different compromised residential IP addresses. At some point, distinguishing between well-behaved, buggy-but-legitimate, and outright malicious automated traffic becomes either impossible or too time-consuming, at which point you throw up your hands and say, "Screw it, everyone but Google or a browser is banned."
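
To make the Sybil point concrete, here's a rough sketch (Python, names made up) of a keyed token-bucket limiter. Whatever you pick as the key (User-Agent, client IP, API key) is exactly the thing an attacker multiplies:

```python
import time
from collections import defaultdict

class TokenBucketLimiter:
    """Rate-limit requests per key, where the key is whatever criterion
    you enforce on (User-Agent, client IP, API key, ...)."""

    def __init__(self, rate_per_sec: float, burst: float):
        self.rate = rate_per_sec  # tokens refilled per second
        self.burst = burst        # maximum tokens a key can accumulate
        # key -> (tokens remaining, timestamp of last update)
        self.buckets = defaultdict(lambda: (burst, time.monotonic()))

    def allow(self, key: str) -> bool:
        tokens, last = self.buckets[key]
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        allowed = tokens >= 1
        self.buckets[key] = (tokens - 1 if allowed else tokens, now)
        return allowed


limiter = TokenBucketLimiter(rate_per_sec=1.0, burst=5)

# Keyed on User-Agent: spoofing 10,000 UAs yields 10,000 fresh buckets.
print(limiter.allow("MyScraper/1.0"))

# Keyed on client IP: 10,000 compromised residential IPs, same story.
print(limiter.allow("203.0.113.7"))
```

The limiter mechanics are the easy part; the problem is that every new spoofed key starts with a full bucket, so the limit only constrains clients honest enough to reuse the same key.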




> "Screw it, everyone but Google or a browser is banned."

Thanks! Why don't malicious actors just spoof browsers?

More generally, I would think that any defense that prevented malicious actors would prevent badly written scrapers, simply because malicious actors can do anything a badly written scraper could do, but can also take more active steps to evade defenses.

(These are honest questions; I have very little knowledge about this.)



