Based on a quick Google search (e.g., https://stackoverflow.com/questions/46673751/nutch-vs-heritr...), their existing product appears to be a hosted solution for crawling the web.

This new product sounds like it is just a query language that can be used on top of what you yourself have paid them to crawl. I don't believe they've actually crawled the whole web and are providing an interface to that. Their website says things like "the entire web" and "trillions of rows", but I'm guessing that's only true if you pay them a few million dollars to do that.




I guess they are using Common Crawl as a base. Not sure whether they are doing actual crawling along with it.



