
Would it be a good fit if I need to scrape just the home page of a million sites, running on a hundred servers? No analysis of the pages is needed at fetch time; that is done later.



The fetcher already fits that use case...


You are running

   phantomjs phantomjs_fetcher.js
and using it as a proxy? The setup instructions are a bit unclear on this.


I wanted to make it an HTTP proxy in the beginning, but I found that hard to do. So now I POST everything to it instead, but I haven't changed the name.

It still works like a proxy, though: any request with `fetch_type == 'js'` is fetched through phantomjs, and the response is passed back to tornado_fetcher.
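
To make the routing concrete, here is a rough sketch of the idea, not the project's actual code. The `dispatch` function, the two stub fetchers, and the endpoint address are all made up for illustration; the only thing taken from the description above is that tasks with `fetch_type == 'js'` are POSTed to the phantomjs process and everything else is fetched directly.

```python
import json
from typing import Callable

# Hypothetical address where `phantomjs phantomjs_fetcher.js` is listening.
PHANTOMJS_ENDPOINT = "http://localhost:12345"


def dispatch(task: dict, http_fetch: Callable, js_fetch: Callable) -> dict:
    """Route a task by its fetch_type; 'js' tasks go to the phantomjs process."""
    if task.get("fetch_type") == "js":
        # The whole task is serialized and POSTed as the request body,
        # instead of being tunnelled through a real HTTP proxy.
        return js_fetch(PHANTOMJS_ENDPOINT, json.dumps(task))
    # Anything else is fetched directly by the normal HTTP fetcher.
    return http_fetch(task["url"])


# Stub fetchers so the sketch runs without a network or phantomjs.
def stub_http_fetch(url: str) -> dict:
    return {"via": "http", "url": url}


def stub_js_fetch(endpoint: str, body: str) -> dict:
    return {"via": "phantomjs", "endpoint": endpoint, "url": json.loads(body)["url"]}


if __name__ == "__main__":
    print(dispatch({"url": "http://example.com/"}, stub_http_fetch, stub_js_fetch))
    print(dispatch({"url": "http://example.com/", "fetch_type": "js"},
                   stub_http_fetch, stub_js_fetch))
```

In a real setup the stubs would be replaced by actual HTTP calls; the point is only that the caller sees the same kind of response either way, which is why it behaves like a proxy.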



