Yes, Scrapy is quite a good scraper technology for some features, especially caching, but for some websites it's like doing things the hard way...

The easiest scraper with a proxy rotator I've found is in my current fave web-automator, scraper scripter and scheduler: Rtila [1]

Created by an indie/solo developer-on-fire cranking out user-requested features quite quickly... check the releases page [2]

I have used (or at least trialled) the vast majority of scraper tech and written hundreds of scrapers since my first one in the '90s, which used VB5 to control IE and dumped to SQL Server. I then moved on to various PHP and Python libs/frameworks and a handful of Windows apps like UBot and iMacros (both of which were useful to me at some point, but I never use them nowadays).

A recent release of Rtila allows creating standalone bots you can run using its built-in local Node.js server (which also has its own locally hosted server API you can program anything else against using any language you like).

[1] https://www.rtila.net

[2] https://github.com/IKAJIAN/rtila-releases/releases




I'm sure Rtila is fantastic at what it does, but I gotta say it's hilarious to see a landing page done in the Corporate Memphis art style but worded in euphemism: https://www.rtila.net/#h.d30as4n2092u

"‘Cause if the web server said no, then the answer obviously is no. The thing is that it’s not gonna say no—it’d never say no, because of the innovation."


That is a modified quote from It's Always Sunny in Philadelphia BTW, if you didn't recognize it.



