A large fraction of our HTTP requests come from Yahoo crawlers:
almost 20% yesterday. Their crawler seems significantly stupider
than Google's. Yesterday we got 12,423 requests from the Google
crawler, of which 4,148 were for x (= mostly useless) URLs, and
43,087 requests from Yahoo crawlers, of which 30,652 were for x
URLs.
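To put those counts on the same scale, here is a quick sketch of the arithmetic. The function name `x_share` is made up for illustration; only the four request counts come from the figures above.

```python
def x_share(x_requests: int, total_requests: int) -> float:
    """Fraction of a crawler's requests that were for x URLs."""
    return x_requests / total_requests


# Request counts quoted above for one day of traffic.
google = x_share(4_148, 12_423)
yahoo = x_share(30_652, 43_087)

print(f"Google: {google:.1%} of requests were for x URLs")
print(f"Yahoo:  {yahoo:.1%} of requests were for x URLs")
```

That works out to roughly a third of Google's requests versus about 71% of Yahoo's.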
Unlike most site operators, I'm looking for ways to constrain our growth. News.YC is deliberately not meant to become a
massively popular site.
So the thought occurred to me: why not just ban Yahoo crawlers?
And MSN too, while we're at it. I don't know anyone who uses either
of them for search. I'd just as soon have the site be invisible
to them. But what does the community think? Do any hackers use
Yahoo or MSN search?
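One common way to do this kind of ban is a robots.txt file, which well-behaved crawlers are expected to honor. The sketch below is only an illustration, not how the site actually does it: it assumes the crawlers identify themselves with the tokens `Slurp` (Yahoo) and `msnbot` (MSN), and it uses Python's standard `urllib.robotparser` to check that such rules would shut those two out while leaving Google alone. The helper `is_allowed` is a hypothetical name.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents. "Slurp" and "msnbot" are assumed to be
# the user-agent tokens the Yahoo and MSN crawlers respond to.
ROBOTS_TXT = """\
User-agent: Slurp
Disallow: /

User-agent: msnbot
Disallow: /
"""


def is_allowed(user_agent: str, path: str = "/") -> bool:
    """Return True if the rules above let user_agent fetch path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


for agent in ("Slurp", "msnbot", "Googlebot"):
    print(agent, "allowed" if is_allowed(agent) else "blocked")
```

robots.txt is advisory, so this only works against crawlers that choose to obey it; a crawler that ignores the file would have to be blocked at the server instead.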
As "evil" as blocking sites and crawlers may sound, I think these types of measures will be necessary to preserve the quality of content here. Whatever actions further that objective have my vote.