
<div class="hidden">Actual name: Batman</div>

Most explicit CSS rules let you spot this; implicit rules won't, and possibly can't.
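As a rough illustration of the explicit case, here is a minimal Python sketch of a text extractor that drops elements hidden by rules visible in the markup itself: the `hidden` attribute, an inline `display:none` or `visibility:hidden`, or a class name on a blocklist. The blocklist (`HIDDEN_CLASSES`) and every function name here are assumptions made for illustration, not taken from any real scraper. It assumes reasonably well-formed HTML (every non-void tag is closed).

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}

# Elements whose content is never page text.
NON_TEXT_TAGS = {"script", "style"}

# Assumption: these class names mean "hidden". A real site can use any name.
HIDDEN_CLASSES = {"hidden"}


def is_explicitly_hidden(attrs):
    """True if the tag hides itself through attributes we can read directly."""
    attr_map = {name: (value or "") for name, value in attrs}
    if "hidden" in attr_map:
        return True
    if HIDDEN_CLASSES & set(attr_map.get("class", "").split()):
        return True
    style = attr_map.get("style", "").replace(" ", "").lower()
    return "display:none" in style or "visibility:hidden" in style


class VisibleTextExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.hidden_depth = 0  # > 0 while inside a hidden subtree
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if (self.hidden_depth or tag in NON_TEXT_TAGS
                or is_explicitly_hidden(attrs)):
            self.hidden_depth += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.hidden_depth:
            self.hidden_depth -= 1

    def handle_data(self, data):
        if not self.hidden_depth and data.strip():
            self.chunks.append(data.strip())


def visible_text(html):
    """Return the text of `html` with explicitly hidden subtrees removed."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)
```

Fed the snippet above, `visible_text` returns an empty string. Fed `<style>.x{display:none}</style><span class="x">secret</span>`, it returns `secret`, because the hiding rule lives in a stylesheet rather than on the element. That is the implicit case, and catching it means resolving the cascade the way a browser does.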




:) Agree, but the scraping arms race is way beyond that. If someone doesn't want their page scraped, this isn't a threat to them.


Has it? Can you give me an example of a site that is hard to scrape by a motivated attacker?

I'm curious, because I've seen stuff like the above, but of course it only fools a few off-the-shelf tools; it does nothing if the attacker is willing to write a few lines of Node.js.


Try Facebook. I spent some time trying to make it work, but figured out I could do what I needed by using the Bing API instead and get structured data...


I guess the lazy, foolproof way to prevent this is to add OCR somewhere in the pipeline and use actual images rendered from the websites. Although maybe then you'll get #010101 text on a #000000 background.
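To put a number on that last worry, here is a small Python sketch that computes the contrast ratio between two colors using the WCAG relative luminance formula. The function names and the 4.5 cutoff (the WCAG AA figure for normal-size text) are choices made for this illustration; an OCR pipeline would pick its own threshold.

```python
def relative_luminance(hex_color):
    """Relative luminance of an sRGB color given as '#rrggbb'."""
    hex_color = hex_color.lstrip("#")
    channels = [int(hex_color[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    # Undo the sRGB gamma curve to get linear-light channel values.
    linear = [
        c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
        for c in channels
    ]
    r, g, b = linear
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(color_a, color_b):
    """WCAG contrast ratio, from 1.0 (identical) to 21.0 (black on white)."""
    lighter, darker = sorted(
        (relative_luminance(color_a), relative_luminance(color_b)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


def probably_readable(text_color, background_color, minimum_ratio=4.5):
    """Assumed heuristic: treat anything below `minimum_ratio` as hidden text."""
    return contrast_ratio(text_color, background_color) >= minimum_ratio
```

For `#010101` on `#000000` the ratio comes out at roughly 1.006, so `probably_readable("#010101", "#000000")` is `False`: the text is technically rendered, but neither a person nor most OCR setups would see it.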



