
> SerpAPI: Scrape Google and other search engines from our fast, easy, and complete API.

Does anyone know how this works behind the scenes at scale? How do they get around Google trying to stop their scraping attempts? Is it just lots of proxies and some way around CAPTCHA challenges?




Probably a combination of realistic-seeming desktop browsers (e.g. headless Chrome with stealth patches) and residential IP providers (e.g. Luminati).


Their website says

>In addition, each API request runs in a full browser, and we'll even solve all CAPTCHAs. Mimicking completely what a human will do.

Wow, how would they do that?


Is this not against Google's terms of service? Couldn't Google make legal threats, and ban the company from using all Google products like Gmail and ads?

I know there are companies doing similar things and I'm not saying they should get in trouble, but it feels so risky basing a business around it, unless I'm missing something. Lots of companies seem to do similar scraping, for example to get SEO data, which Google probably has an interest in preventing.


Not sure, but maybe they use a CAPTCHA-solving API like https://anti-captcha.com


Yep. Has to be using a large net of residential addresses in each country to not get banned by Google.
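
For anyone wondering what "a large net of addresses" means mechanically, here is a minimal sketch of the rotation part only. Nobody in this thread knows what SerpAPI actually runs, so treat it as a guess at the general shape: the `ProxyPool` class, the cooldown value, and the proxy hostnames are all made up for illustration. It hands out proxies round-robin and skips any that were recently marked as blocked.

```python
import itertools
import time


class ProxyPool:
    """Round-robin proxy pool that skips proxies in a cooldown window.

    Hypothetical illustration; not any vendor's actual implementation.
    """

    def __init__(self, proxies, cooldown_seconds=300.0):
        if not proxies:
            raise ValueError("need at least one proxy")
        self._proxies = list(proxies)
        self._cycle = itertools.cycle(self._proxies)
        self._blocked_until = {}  # proxy -> time at which it is usable again
        self._cooldown = cooldown_seconds

    def mark_blocked(self, proxy, now=None):
        """Record that a proxy was blocked so it is rested for a while."""
        now = time.monotonic() if now is None else now
        self._blocked_until[proxy] = now + self._cooldown

    def next_proxy(self, now=None):
        """Return the next usable proxy, or None if all are cooling down."""
        now = time.monotonic() if now is None else now
        # Look at each proxy at most once per call.
        for _ in range(len(self._proxies)):
            candidate = next(self._cycle)
            if self._blocked_until.get(candidate, 0.0) <= now:
                return candidate
        return None


# Placeholder hostnames, one pool per country (assumed layout).
pools = {
    "us": ProxyPool(["us-1.example:8080", "us-2.example:8080"]),
    "de": ProxyPool(["de-1.example:8080"]),
}
```

A caller would pick `pools["us"].next_proxy()` for each request and call `mark_blocked` on whatever address gets a CAPTCHA or an error page. The `now` parameter exists only so the rotation can be tested without waiting on a real clock.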



