
"In an attempt to learn where the solvers live, Savage et al. sent out specially fabricated CAPTCHAs with images of words in various languages. [..] But one organization showed exceptional linguistic versatility, even solving challenges in Klingon."

This is a rather surprising finding.




I combed through the paper and found this paragraph for you guys. The organization only answered two of the Klingon ones correctly:

---

Finally, the results for ImageToText are impressive. Relative to the other services, ImageToText has appreciable accuracy across a remarkable range of languages, including languages where none of the other services had few if any correct solutions (Dutch, Korean, Vietnamese, Greek, Arabic) and even two correct solutions of CAPTCHAs in Klingon. Either ImageToText recruits a truly international workforce, or the workers were able to identify the CAPTCHA construction and learn the correct answers. ImageToText is the most expensive service by a wide margin, but clearly has a dynamic and adaptive labor pool.


I came here to post the same. Has someone written an algorithm that can recognize an alphabet and solve any CAPTCHA based on an alphabet it knows (maybe all of Unicode)? Good god....


They were trying to infer the workers' native languages by assuming error rates would go down on CAPTCHAs in a language the workers knew. One provider had low error rates across the board. The easiest explanation for this is a different incentive system that promoted low error rates, likely at the expense of speed. I'd guess this was one of the expensive service providers, and this is how they try to differentiate themselves in the market.


Wouldn't they throw aside a Klingon sample rather than spend time trying to decipher it? CAPTCHAs are supposed to expire after a minute or two.

I'm assuming the Klingon was written in the Klingon alphabet, though, which (as I realized after reading Tycho's post) could be wrong. If it was written in the Latin alphabet, then a person could easily have solved it letter-by-letter in a few seconds.


Yeah. I was assuming it was in a Latin alphabet, but I was wrong. I just read the paper: it was in a Klingon alphabet, so essentially arbitrary symbols. So my guess was wrong, and it's difficult to see an explanation here that makes any financial sense. Apparently they concluded that the solvers were other researchers? I suppose you could imagine the results coming from some sort of auction system that ended up paying a lot (more than they were making) for problems no one else was willing to solve. But that doesn't seem too likely.


Wouldn't they be Klingon phrases written with normal letters? I mean, nobody has a Klingon keyboard, do they?


There's a pretty common transliteration system for writing Klingon using Latin characters. http://en.wikipedia.org/wiki/Klingon_writing_systems
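
To make the transliteration idea concrete, here is a rough sketch of splitting a transliterated Klingon word into its letters. The letter list is the usual Latin-character transliteration as I understand it; the function name `split_letters` and the greedy longest-match approach are my own assumptions for illustration, not anything from the paper. Greedy matching can mis-split a word where "n" is directly followed by "gh", so treat it as a toy.

```python
# Letters of the common Latin transliteration of Klingon (case-sensitive).
# Multi-character sequences such as "tlh", "ch", "gh", "ng" count as one letter.
# This list is an assumption based on the usual transliteration table.
KLINGON_LETTERS = [
    "a", "b", "ch", "D", "e", "gh", "H", "I", "j", "l", "m", "n", "ng",
    "o", "p", "q", "Q", "r", "S", "t", "tlh", "u", "v", "w", "y", "'",
]


def split_letters(word):
    """Split a transliterated word into letters using greedy longest match.

    Hypothetical helper: raises ValueError on any character sequence that
    is not in KLINGON_LETTERS.
    """
    # Try longer letters first so "tlh" wins over "t" and "ng" over "n".
    by_length = sorted(KLINGON_LETTERS, key=len, reverse=True)
    letters = []
    i = 0
    while i < len(word):
        for letter in by_length:
            if word.startswith(letter, i):
                letters.append(letter)
                i += len(letter)
                break
        else:
            raise ValueError(f"no Klingon letter at position {i} in {word!r}")
    return letters


print(split_letters("tlhIngan"))  # ['tlh', 'I', 'ng', 'a', 'n']
```

So an eight-character Latin string here is only five Klingon letters, which is why a solver typing it out letter-by-letter on a normal keyboard is plausible only if the CAPTCHA itself was rendered in Latin characters.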


The hard part is to insert more "mistakes" into the CAPTCHAs that are supposed to be tougher.



