
No, the submitters get neither security nor what a captcha is for.

codinghorror.com (iirc) was using a "captcha" that never changed; you just had to enter "orange". Or was it tbray.org? Anyway, Atwood claims that this naive approach was 99.9% effective: http://www.codinghorror.com/blog/2006/10/captcha-effectivene...

My position is that ANY form of captcha which requires some action by the visitor is broken by design and should not be used at all.

A simple captcha like this will stop most automatic, untargeted attacks. And if someone decides to write a CAPTCHA breaker specifically for this site, nothing can help: you either degrade to the level that even humans cannot tell what characters are on the screen, or it becomes cheaper to hire Mechanical Turk workers to do the job.




I think that's completely beside the point. Sony actually bothered to make a couple of plain-text letters look like a captcha image just for the sake of it. That's just a clear misunderstanding of what these things are for and why they are the way they are.

As to your point, captchas have a completely legitimate use case that you can sometimes see on Google and Amazon. If they start to have some doubts about whether you are a human being, they will present you with a captcha image to solve. Otherwise, you should be regarded as innocent until proven guilty.


Also, people are probably not going to go out of their way to tune their spambots to work on a personal blog that's just one of a billion tiny little targets, so something like a static "CAPTCHA" makes sense. But grandparent implies that this extends to the websites of large companies for which people have a lot to gain by creating specialized bots, which is pretty obviously not true.


There are two objectives for captchas:

1. Prevent prefab'd spam bots from being able to spam your site. Almost anything, including trivial questions and Sony's "captcha", works for this. Some bots will randomly try values for unknown fields, and note if a value apparently succeeds where others failed; against such bots, questions like "2 + 2 = " or "type 'orange':" are bad. A decade ago, people were writing irc bots that could answer most trivial captcha questions. A question like "Who was the U.S. President in 1993-2001" has a better chance of going uncracked.

2. Prevent a targeted spam bot from being able to submit arbitrarily many forms on your site. "orange", "3 + 5 =", 31337e^(-pii), and Sony's captcha will not work for this. If there's a question pool, someone will solve most or all of the questions and program the bot to respond correctly most or all of the time.

The idea that someone would only ever want to do (1) is flawed. If someone's bot is flexible enough, and your question pool is shallow enough, it may be worth their time to code in responses to your custom questions even if your site caters to a relatively small audience.
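To make objective (2) concrete, here is a minimal sketch of why a shallow question pool fails against a targeted bot. Everything in it is hypothetical: the pool contents, the `TargetedBot` class, and the `simulate` helper are illustrative names, not anything a real site or bot is known to use. The "human oracle" stands in for the bot's operator answering each new question once by hand.

```python
import random

# Hypothetical question pool; a real site would have its own questions.
QUESTION_POOL = {
    "2 + 2 =": "4",
    "type 'orange':": "orange",
    "3 + 5 =": "8",
}


def ask(rng):
    """Server side: pick a random question from the pool."""
    return rng.choice(list(QUESTION_POOL))


def check(question, answer):
    """Server side: case-insensitive match against the stored answer."""
    return QUESTION_POOL[question].lower() == answer.strip().lower()


class TargetedBot:
    """A bot whose operator solves each unseen question once by hand."""

    def __init__(self):
        self.known = {}          # question text -> remembered answer
        self.manual_solves = 0   # how often a human had to step in

    def answer(self, question, human_oracle):
        if question not in self.known:
            self.known[question] = human_oracle(question)
            self.manual_solves += 1
        return self.known[question]


def simulate(submissions, seed=0):
    """Return (forms passed, manual solves) for a run of submissions."""
    rng = random.Random(seed)
    bot = TargetedBot()
    passed = 0
    for _ in range(submissions):
        question = ask(rng)
        # The oracle here simply looks the answer up, standing in for a human.
        if check(question, bot.answer(question, QUESTION_POOL.__getitem__)):
            passed += 1
    return passed, bot.manual_solves
```

Under these assumptions, `simulate(1000)` passes every submission while the operator never solves more than `len(QUESTION_POOL)` questions by hand, so the cost of the attack is bounded by the pool size rather than by the number of forms submitted.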


> A question like "Who was the U.S. President in 1993-2001" has a better chance of going uncracked.

http://www.wolframalpha.com/input/?i=Who+was+the+U.S.+Presid...


That's a pretty dishonest link there. The question/answer you linked isn't the same question posed.

When the question as originally posed is asked to Alpha, it barfs with no answer:

http://www.wolframalpha.com/input/?i=Who+was+the+U.S.+Presid...

One could argue that the leap from form one to form two of the question is trivial (and it is for a thinking human), but if such a thing were so trivial for a machine then why doesn't Alpha do it? And if Alpha doesn't do it, what makes you think a random spambot will be capable of doing it?


It seems you don't understand what CAPTCHAs are for on a large site. The purpose of a CAPTCHA is to make it expensive for automated systems to make fake accounts on your site. If the CAPTCHA requires them to hire people via Mechanical Turk to solve it, then it has done its job. If it requires a one-line jQuery statement, then it is a complete #fail.


My blog simply fills in a hidden value using JavaScript when you press the submit button; it has stopped 99% of all spam for me.

You're right, captchas aren't supposed to be difficult, they're just supposed to prevent automation.
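A rough sketch of how a hidden-value check like this could look on the server side. The comment does not say what value is filled in or how it is verified, so the details are assumptions: the field name `js_token`, the `data-token` attribute, the HMAC scheme, and all function names are made up for illustration.

```python
import hashlib
import hmac
import secrets

# Hypothetical per-deployment secret; a real site would load it from config.
SECRET_KEY = secrets.token_bytes(32)


def issue_token(form_id: str) -> str:
    """Derive the value the page's JavaScript is expected to submit."""
    return hmac.new(SECRET_KEY, form_id.encode(), hashlib.sha256).hexdigest()


def render_form(form_id: str) -> str:
    """Render a form whose hidden field stays empty until JavaScript runs."""
    token = issue_token(form_id)
    return (
        f'<form id="{form_id}" method="post" data-token="{token}"\n'
        '      onsubmit="this.elements.js_token.value = this.dataset.token">\n'
        '  <textarea name="comment"></textarea>\n'
        '  <input type="hidden" name="js_token" value="">\n'
        '  <button type="submit">Post</button>\n'
        "</form>"
    )


def accept_submission(form_id: str, fields: dict) -> bool:
    """Accept the post only if the hidden field carries the expected token."""
    submitted = str(fields.get("js_token", ""))
    expected = issue_token(form_id)
    # Compare as bytes in constant time; bytes also tolerate non-ASCII input.
    return hmac.compare_digest(submitted.encode(), expected.encode())
```

A client that posts the form without executing the `onsubmit` handler sends an empty `js_token` and is rejected. A bot written for this specific site could read `data-token` straight out of the markup, so this only addresses the fly-by case.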


Captchas have 2 purposes. First, prevent simple fly-by bots from spamming sites. Case in point: blog software is dominated by a few big players (e.g. WordPress) and many blogs allow anonymous comments, so without something like a captcha it's a simple matter to create a spam bot that runs through a list of sites and spams ads or what-have-you in comments.

Second, captchas are designed to prevent automation entirely, including custom made automation targeted to a specific site. This sort of thing is less important for, say, blog comments since the value of a typical blog comment is extremely low. But there are lots of free accounts out there, for example, and if you use automation to set up new accounts you may be able to game certain systems to your advantage, corrupting the normal process of the market.


(And on a related note, this comment form is damn near unusable on Android in portrait mode. I had to make this a separate comment because the box wouldn't scroll to show what I was typing...)


Actually, with reCAPTCHA you can now get a prefab solution that can't be cracked.


reCAPTCHA, after extensive deployment in production, now becomes extremely difficult to solve from time to time. It sometimes renders scanned images that are impossible to recognize. User frustration is a serious concern. If you care about conversion rate, DO NOT USE RECAPTCHA.


Yeah, one time I got an image of a sum. So I just put in the LaTeX equivalent, something like \Sigma_{n=1}^{100}, and it accepted it.


Since OCR was only able to parse one of the words, they have no way of checking the other (besides taking votes from a ton of submissions). Because the summation was not the word that was checked, you could enter anything.

It's an interesting problem. If they make it too clear that they don't know the right answer, quality of submissions will plummet. If they keep the current form, then usability suffers.


There's actually a game of this on 4chan.org/b/. There's a pattern to knowing which word is known and which isn't, because the known word is put through a recognizable filter and the unknown one isn't (most of the time). They then put in a specific racial slur for the unknown word, hoping that it'll slip through often enough to make it into the final OCR'd text.


If you succeed at the control (the one it already thinks it knows) it takes your other answer as input for future viewers.


Only if multiple people give the same response, though.
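The control-word-plus-voting scheme described in this subthread can be sketched roughly as follows. This is not reCAPTCHA's actual implementation: the class name, the `VOTES_NEEDED` threshold, and the normalization are assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical threshold: how many matching guesses count as agreement.
VOTES_NEEDED = 3


class TwoWordChallenge:
    """Check the known (control) word; collect votes on the unknown word."""

    def __init__(self):
        self.votes = defaultdict(Counter)  # unknown image id -> guess counts
        self.accepted = {}                 # unknown image id -> agreed text

    def submit(self, control_answer, control_truth, unknown_id, unknown_guess):
        """Return True if the user passes. Only the control word is verified."""
        if control_answer.strip().lower() != control_truth.lower():
            return False  # failed the control, so the guess is discarded
        guess = unknown_guess.strip().lower()
        self.votes[unknown_id][guess] += 1
        # A transcription is accepted only once enough users agree on it.
        if self.votes[unknown_id][guess] >= VOTES_NEEDED:
            self.accepted.setdefault(unknown_id, guess)
        return True
```

In this sketch a single user typing LaTeX or gibberish for the unknown word still passes, but their guess only becomes the accepted transcription if `VOTES_NEEDED` users who also passed the control submit exactly the same text.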


If the captcha system isn't reliable, a new system for deterring spam registration is in order. Maybe the system could use images to direct the user to move the mouse in a certain pattern, and then use mouse gesture detection to ascertain whether it's a human.


This is a BRILLIANT idea!!


What? reCAPTCHA is very predictable. There is one word that always looks the same. It's the word, and then the word blurred and slightly skewed and placed on top of itself. The other is a scanned word (or sometimes symbols or a mathematical equation). That's the bit where they're crowdsourcing book scanning.

You don't have to get it right, and in fact you can type in gibberish if you like, as long as you get the known one right, and the known one is very easy for a human to read.


I know, I know. I like the idea of reCAPTCHA to crowdsource a very hard problem for the common good. I know how it works in and out. It is awesome.

The only problem: things that sound nice in theory do not always work well in practice. On multiple occasions reCAPTCHA has presented me with scans that were impossible to recognize. One extreme example was a printed texture pattern with not a single letter on it.

Your average, non-tech-savvy customers will never know how the system works. They shouldn't either, otherwise the whole idea of reCAPTCHA would have been defeated.

Judging by the volume of complaints about it on various support forums, it is really a bad idea to deploy it in production when you care even slightly about usability, user experience, or simply conversion rate.

Unless your application is a must-have.


Interesting, I'd never really given it a second thought. I appreciated the drop-in ease bit and hadn't considered its effect on conversion.



It looks like they've artificially weakened it:

> reCAPTCHA uses a “one-off” system. That means a letter in a word can be incorrect, and it will still be accepted by the system.
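As a rough illustration of what such a tolerance means for an attacker, here is a small sketch. It assumes "one letter can be incorrect" means a single substituted character in a word of the same length, and it assumes an OCR breaker gets each character right independently with probability `p`. Neither assumption comes from the quoted text, and both function names are made up.

```python
def within_one_substitution(expected: str, typed: str) -> bool:
    """Accept answers of the same length that differ in at most one position."""
    a, b = expected.lower(), typed.strip().lower()
    if len(a) != len(b):
        return False
    mismatches = sum(1 for x, y in zip(a, b) if x != y)
    return mismatches <= 1


def pass_probability(p: float, n: int, tolerate_one: bool) -> float:
    """Chance that a breaker with per-character accuracy p passes an n-letter word."""
    exact = p ** n
    if not tolerate_one:
        return exact
    # Add the n ways of getting exactly one character wrong.
    return exact + n * p ** (n - 1) * (1 - p)
```

Under those assumptions, a breaker that reads 90% of characters correctly passes a six-letter word about 53% of the time with strict matching and about 89% of the time with the one-letter tolerance.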


It is also only verifying 1 word out of the pair; usually the more garbled one.


<!-- Layer contains table with 5 cols (width is divide by no of chars) contains Captcha Chars -->

It was, however, nice of them to label the spot where my breaker should scrape the characters. (Bonus points for that label being pseudo-code for my scraper.)


There is a big difference between using a CAPTCHA against bots / spam and using a CAPTCHA to stop something like a brute-force attack (see the nice implementation by Gmail, which shows a CAPTCHA after several incorrect password attempts; a nice compromise).
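A minimal sketch of that "CAPTCHA only after several failures" compromise. This is not how Gmail implements it: the threshold, class name, and return values are hypothetical, and passwords are compared in plain text only to keep the example short.

```python
# Hypothetical threshold: failed attempts allowed before a CAPTCHA is demanded.
MAX_FREE_FAILURES = 3


class LoginGate:
    """Require a solved CAPTCHA once an account has too many failed logins."""

    def __init__(self, passwords):
        # Plain-text passwords for brevity; a real system stores hashes.
        self.passwords = passwords
        self.failures = {}  # user -> consecutive failed attempts

    def captcha_required(self, user):
        return self.failures.get(user, 0) >= MAX_FREE_FAILURES

    def attempt(self, user, password, captcha_ok=False):
        """Return 'ok', 'bad_password', or 'captcha_required'."""
        if self.captcha_required(user) and not captcha_ok:
            return "captcha_required"  # the password is not even checked
        if self.passwords.get(user) == password:
            self.failures[user] = 0  # a successful login resets the counter
            return "ok"
        self.failures[user] = self.failures.get(user, 0) + 1
        return "bad_password"
```

Ordinary users who type the right password never see a challenge, while a brute-force script gets only `MAX_FREE_FAILURES` free guesses per account before every further guess costs a solved CAPTCHA.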


No, clownery like this should be exposed.


"orange" was codinghorror's.



