No proxy yet, but I am considering one, as many sites are redirecting my crawler based on its IP, which is causing indexing issues.
The hardest part BY FAR is the crawler: initially I was using Apache Nutch, but it got slower and slower as the index grew, so I replaced it with my own crawler written in PHP (a language I'm comfortable with) and parallelized it using Supervisor.
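For anyone curious, Supervisor gets you the parallelism by running several copies of the same worker process rather than actual threads. A minimal `[program:x]` section looks roughly like this (paths, worker count, and log locations are illustrative, not my real config):

```ini
[program:crawler]
command=php /var/www/crawler/worker.php
; run 4 identical workers, named crawler_00 .. crawler_03
process_name=%(program_name)s_%(process_num)02d
numprocs=4
autostart=true
autorestart=true
stdout_logfile=/var/log/crawler/%(program_name)s_%(process_num)02d.log
```

Each worker just pulls the next URL from a shared queue, so adding capacity is mostly a matter of bumping `numprocs`.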
The second hardest part was the amount of security I had to build in to prevent bots from running spam searches and hogging my infrastructure.
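The core of that protection is just per-IP rate limiting on the search endpoint. A minimal sliding-window version (sketched in Python here rather than my PHP, with made-up limits) looks like:

```python
import time
from collections import defaultdict, deque

WINDOW = 60.0  # seconds (illustrative)
LIMIT = 30     # max searches per IP per window (illustrative)

# per-IP deque of recent request timestamps
hits = defaultdict(deque)

def allow_search(ip, now=None):
    """Return True if this IP is under LIMIT requests in the last WINDOW seconds."""
    now = time.monotonic() if now is None else now
    q = hits[ip]
    # drop timestamps that have aged out of the window
    while q and now - q[0] > WINDOW:
        q.popleft()
    if len(q) >= LIMIT:
        return False  # over the limit: reject the search
    q.append(now)
    return True
```

In production you would back this with something shared across workers (Redis, memcached, etc.), since each Supervisor worker has its own memory.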
Do you have multiple IPs? I am trying to build something that needs just the published-at and updated-at date fields for thousands of links, and I am afraid my IP will get blocked quickly.
Just one IP for now. You are right to worry about being blocked from crawling, though: it has already happened to me on a few sites. The key things that help mitigate this are:
1. Always identify your crawler via a consistent user-agent string that explains it's a web search crawler and not a generic web browser.
2. Always obey the directives in robots.txt.
3. Make sure your crawler is not too aggressive (low frequency of requests).
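All three points fit in a few lines. Here is a sketch in Python (my crawler is PHP, but the idea is identical); the user-agent string is a made-up example, and the robots.txt is inlined so the snippet is self-contained:

```python
import time
from urllib.robotparser import RobotFileParser

# 1. Consistent, honest user-agent (hypothetical name and info URL)
USER_AGENT = "MySearchCrawler/1.0 (+https://example.com/bot-info)"

# 2. Parse the site's robots.txt (inlined here; normally fetched per host)
rp = RobotFileParser()
rp.parse("""
User-agent: *
Disallow: /private/
Crawl-delay: 5
""".splitlines())

def allowed(url):
    """Check robots.txt before every request (rule 2)."""
    return rp.can_fetch(USER_AGENT, url)

def throttle():
    """3. Honor Crawl-delay, falling back to a conservative default."""
    time.sleep(rp.crawl_delay(USER_AGENT) or 5)

allowed("https://example.com/page")       # True
allowed("https://example.com/private/x")  # False
```

Send `USER_AGENT` as the `User-Agent` header on every fetch, and call `throttle()` between requests to the same host.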
I'll try to write a blog post soon and share it here.