
Thousands of requests per hour? So, something like 1-3 per second?
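(For scale, a quick back-of-the-envelope conversion; the hourly counts below are assumed purely for illustration.)

    # "Thousands of requests per hour", expressed per second.
    # The sample counts are illustrative assumptions, not measured traffic.
    for requests_per_hour in (3_600, 5_000, 10_000):
        print(f"{requests_per_hour}/hour ~= {requests_per_hour / 3600:.1f}/second")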

If this is actually impacting perceived QoS, then I think a Gitea bug report would be justified. Clearly there's been some kind of performance regression.

Just looking at the logs seems to be an infohazard for many people. I don't see why you'd want to inspect the septic tanks of the internet unless absolutely necessary.



Depending on what they're actually pulling down this can get pretty expensive. Bandwidth isn't free.
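A rough sketch of how that adds up; every figure here (request rate, average response size, per-GiB egress price) is an assumption you'd replace with your own numbers:

    # Back-of-the-envelope bandwidth cost for crawler traffic.
    # All inputs are illustrative assumptions, not real measurements or prices.
    requests_per_hour = 5_000          # assumed crawler request rate
    avg_response_bytes = 2 * 1024**2   # assumed 2 MiB per response (e.g. repo archives)
    price_per_gib_usd = 0.09           # assumed egress price per GiB

    gib_per_month = requests_per_hour * 24 * 30 * avg_response_bytes / 1024**3
    cost = gib_per_month * price_per_gib_usd
    print(f"~{gib_per_month:,.0f} GiB/month -> ~${cost:,.0f}/month in egress")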


I love the snark here. I work at a hosting company and the only customers who have issues with crawlers are those who have stupidly slow webpages. It’s hard to have any sympathy for them.


Isn't it part of your job to help them fix that?


How? They're a hosting company, not a web shop.


We were only getting 60% of our traffic from bots at my last place because we throttled a bunch of sketchy bots to around 50 simultaneous requests, which was on the order of 100/s. Our customers were paying for SEO, so the bot traffic was a substantial cost of doing business. But as someone tasked with decreasing cluster size, I was forever jealous of the large share of the cluster that wasn't being seen by humans.
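A minimal sketch of that kind of throttle, assuming an asyncio front end. The user-agent patterns are assumed examples; the 50-slot cap comes from the comment above.

    import asyncio
    import re

    # Assumed user-agent patterns for "sketchy" bots.
    SKETCHY_BOT = re.compile(r"SemrushBot|AhrefsBot|MJ12bot", re.IGNORECASE)

    async def handle_request(user_agent, serve, bot_slots):
        if SKETCHY_BOT.search(user_agent or ""):
            async with bot_slots:        # bots queue here once 50 are in flight
                return await serve()
        return await serve()             # human traffic is not throttled

    async def demo():
        bot_slots = asyncio.Semaphore(50)   # ~50 simultaneous bot requests
        async def serve():
            await asyncio.sleep(0.01)       # stand-in for real work
            return "200 OK"
        print(await handle_request("Mozilla/5.0 (compatible; AhrefsBot/7.0)", serve, bot_slots))

    asyncio.run(demo())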


One of the most common issues we helped customers solve when I worked in web hosting was low disk alerts, usually because the log rotation had failed. Often the content of those logs was exactly this sort of nonsense and had spiked recently due to a scraper. The sheer size of the logs can absolutely be a problem on a smaller server, which is more and more common now that the inexpensive server is often a VM or a container.
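The check involved is simple enough to sketch; the log directory and size threshold below are assumptions.

    # Flag oversized logs, the usual symptom when rotation has silently failed.
    from pathlib import Path

    LOG_DIR = Path("/var/log")            # assumed log location
    THRESHOLD = 500 * 1024**2             # assumed: flag anything over 500 MiB

    for log_file in sorted(LOG_DIR.glob("**/*.log")):
        try:
            size = log_file.stat().st_size
        except OSError:
            continue                      # unreadable or vanished; skip it
        if size > THRESHOLD:
            print(f"{log_file}: {size / 1024**2:,.0f} MiB (rotation may have failed)")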


I usually get 10 a second hitting the same content pages 10 times an hour. Is that not what you guys are getting from Googlebot?



