This is awesome. It would've been super useful when we were building our mobile interface to Philadelphia's regional rail schedules, where we scraped SEPTA's site for data. http://www.septa.org/service/sched/r2s_wk.html is an example of the markup we had to deal with.
I think I'll have to go back and crosscheck your selectors with what I came up with to see if we can't improve anything. Thanks!
Wow, this made my day. I've been using FireQuark (a custom firebug that provides CSS selectors - http://www.quarkruby.com/2007/9/5/firequark-quick-html-scree...) and while it was very helpful, SelectorGadget will remove sooo much more of the pain of scraping. Major kudos.
As an aside, I'll mention the nokogiri gem (almost identical to hpricot, but under much more active development). That's what I've been using, and it's excellent. http://github.com/tenderlove/nokogiri/tree/master
Just so you know, there was, and may still be, an off-by-one error in Hpricot's implementation of nth-child that may confuse you when using the selectors generated by SelectorGadget.
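For anyone unsure what that looks like in practice: CSS `:nth-child(n)` counts from 1, while Ruby arrays index from 0, so mixing the two shifts every match by one position. Here is a toy sketch of the difference. The helper names are hypothetical, and this is not Hpricot's actual code; I'm only assuming the bug is of this general shape.

```ruby
# Toy illustration only: hypothetical helpers, not Hpricot's or Nokogiri's code.

# CSS :nth-child(n) is 1-based, so n = 1 selects the first child.
def nth_child(children, n)
  return nil if n < 1
  children[n - 1]
end

# An off-by-one variant that wrongly treats n as a 0-based array index.
def nth_child_off_by_one(children, n)
  children[n]
end

cells = ["Station", "Departs", "Arrives"]

correct = nth_child(cells, 2)            # the second child, as CSS defines it
shifted = nth_child_off_by_one(cells, 2) # lands one position too far
```

If a generated selector seems to grab the neighbouring column, nudging the index by one is a quick way to check whether this is the cause.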
Great point. I haven't looked, but I think nokogiri fixes that bug. I think that was one of tenderlove's motivations for creating nokogiri. He talks about it more on his blog:
"I just want to tell you that you shouldn't worry about that old legacy code that uses Hpricot. Nokogiri can be used as a drop in replacement! Really! Nokogiri doesn't reproduce the bugs that are in Hpricot, but should work in most cases. Just use "Nokogiri::Hpricot()" to parse your HTML. Of course, I've tried to keep the syntax of Hpricot that I like."
This is the kind of stuff that blows my mind. Between all of the people who will ever use it, you've probably saved a lifetime's worth of fiddling with CSS selectors. Kudos to you, sir!
Very nice tool. Reminds me of the Dapper interface. I could see this being used to build some sort of notifier service. For example, "notify me when any of these change."
Edit: After using SelectorGadget for a couple of minutes I found a case where it works much better than dapper.net. SelectorGadget is great!
Amazing timing on this release. I just learned the basics of CSS the other day, and this is going to be a really helpful tool to keep the learning process going. Well done and thanks!
OK, I know this is off topic, but... I read this article an hour or so ago, and I have had the Inspector Gadget theme song stuck in my head since then.
Awesome. Seems to work in Chrome just fine. I can't see how this is useful for scraping, though. Is there scraping software that can take CSS selectors as criteria?
I've previously just used mechanize/hpricot and copied the XPath from Firebug to get the areas I want to scrape.