Hacker News
Sony's Captcha. View source to see why they don't "get" security (sony.com)
211 points by bluesmoon on July 23, 2011 | 82 comments



Why are they disabling right click? I mean, captcha is there to stop bots. Bots have no 'right click'.

My guess is that the programmer didn't really know the reason for using a captcha. He has seen captchas in other places (and how irritating they are) and has tried to copy them.

Or am I missing a certain point?


I was about to complain that Chrome wasn't popping up the context menu on right-click.

So the .js file is disabling right click? How exactly do I look at the source without it? Hmm.

Aha: "Wrench icon" -> Tools -> View Source


Or Ctrl+U :)


Or wget (url)


Or curl (url)

Or telnet to port 80

Are we really going to list every possible method?


Well no, the point is that there are many; it's trivial to get the source of a web page. "Disabling" the context menu on a web page is nothing more than a giant billboard shouting to the world that you are childish, ignorant, and unsophisticated.


No, the submitter gets neither security nor what a captcha is for.

codinghorror.com (iirc) was using a "captcha" that never changed: you just had to enter "orange". Or was it tbray.org? Anyway, Atwood claims that this naive approach was 99.9% effective: http://www.codinghorror.com/blog/2006/10/captcha-effectivene...

My position is that ANY form of captcha which requires some action by the visitor is broken by design and should not be used at all.

A simple captcha like this will stop most automated, untargeted attacks. And if someone decides to write a CAPTCHA breaker specifically for this site, nothing can help: you either degrade it to the point that even humans cannot tell what characters are on the screen, or it becomes cheaper to hire Mechanical Turk to do the job.


I think that's completely beside the point. Sony actually bothered to make a couple of plain-text letters look like a captcha image just for the sake of it. That's just a clear misunderstanding of what these things are for and why they are the way they are.

As to your point, Captchas have a completely legitimate use-case that you can sometimes see on Google and Amazon. If they start to have some doubts about whether you are a human being they will present you with a Captcha image to solve. Otherwise, you should be regarded innocent until proven guilty.


Also, people are probably not going to go out of their way to tune their spambots to work on a personal blog that's just one of a billion tiny little targets, so something like a static "CAPTCHA" makes sense. But grandparent implies that this extends to the websites of large companies for which people have a lot to gain by creating specialized bots, which is pretty obviously not true.


There are two objectives for captchas:

1. Prevent prefab'd spam bots from being able to spam your site. Almost anything, including trivial questions and Sony's "captcha", works for this. Some bots will randomly try values for unknown fields, and note if a value apparently succeeds where others failed; against such bots, questions like "2 + 2 = " or "type 'orange':" are bad. A decade ago, people were writing irc bots that could answer most trivial captcha questions. A question like "Who was the U.S. President in 1993-2001" has a better chance of going uncracked.

2. Prevent a targeted spam bot from being able to submit arbitrarily many forms on your site. "orange", "3 + 5 =", "31337e^(-pi*i)", and Sony's captcha will not work for this. If there's a question pool, someone will solve most or all of them and program the bot to respond correctly most or all of the time.

The idea that someone would only ever want to do (1) is flawed. If someone's bot is flexible enough, and your question pool is shallow enough, it may be worth their time to code in responses to your custom questions even if your site caters to a relatively small audience.


> A question like "Who was the U.S. President in 1993-2001" has a better chance of going uncracked.

http://www.wolframalpha.com/input/?i=Who+was+the+U.S.+Presid...


That's a pretty dishonest link there. The question/answer you linked isn't the same question posed.

When the question as originally posed is asked to Alpha, it barfs with no answer:

http://www.wolframalpha.com/input/?i=Who+was+the+U.S.+Presid...

One could argue that the leap from form one to form two of the question is trivial (and it is for a thinking human), but if such a thing were so trivial for a machine then why doesn't Alpha do it? And if Alpha doesn't do it, what makes you think a random spambot will be capable of doing it?


It seems you don't understand what CAPTCHAs are for on a large site. The purpose of a CAPTCHA is to make it expensive for automated systems to make fake accounts on your site. If the CAPTCHA requires them to hire people via Mechanical Turk to solve it, then it has done its job. If it requires a one-line jQuery statement, then it is a complete #fail.


My blog simply fills in a hidden value using JavaScript when you press the submit button; it has stopped 99% of all spam for me.

You're right, captchas aren't supposed to be difficult, they're just supposed to prevent automation.
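For what it's worth, here is a minimal sketch of the hidden-field trick described above. The field name `js_token` and the value `EXPECTED_TOKEN` are made up for illustration and are not taken from the commenter's blog; a plain object stands in for the form so the sketch runs outside a browser.

```javascript
// Hypothetical names: js_token and EXPECTED_TOKEN are illustrative only.
const EXPECTED_TOKEN = "filled-by-script";

// Client side: called from the submit handler, so only clients that
// actually execute JavaScript ever send the token.
function fillHiddenField(formFields) {
  return { ...formFields, js_token: EXPECTED_TOKEN };
}

// Server side: flag any submission whose hidden field is missing or wrong.
function looksLikeBot(submittedFields) {
  return submittedFields.js_token !== EXPECTED_TOKEN;
}

const fromBrowser = fillHiddenField({ comment: "Nice post" });
const fromDumbBot = { comment: "Buy pills", js_token: "" };

console.log(looksLikeBot(fromBrowser)); // false
console.log(looksLikeBot(fromDumbBot)); // true
```

This only filters clients that never run the page's script; a bot written for the specific site can read the value from the source and send it directly.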


Captchas have 2 purposes. First, they prevent simple fly-by bots from spamming sites. Case in point: blog software is dominated by a few big players (e.g. WordPress) and many blogs allow anonymous comments. Without something like a captcha, it's a simple matter to create a spam bot that runs through a list of sites and spams ads or what-have-you in comments.

Second, captchas are designed to prevent automation entirely, including custom made automation targeted to a specific site. This sort of thing is less important for, say, blog comments since the value of a typical blog comment is extremely low. But there are lots of free accounts out there, for example, and if you use automation to set up new accounts you may be able to game certain systems to your advantage, corrupting the normal process of the market.


(And on a related note, this comment form is damn near unusable on Android in portrait mode. I had to make this a separate comment because the box wouldn't scroll to show what I was typing...)


Actually, with reCAPTCHA you can now get a prefab solution that can't be cracked.


reCAPTCHA, after extensive deployment in production, now becomes extremely difficult to solve from time to time. It sometimes renders scan images that are impossible to recognize. User frustration is a serious concern. If you care about conversion rate, DO NOT USE RECAPTCHA.


Yeah, one time I got an image of a sum. So I just put in the LaTeX equivalent, something like \Sigma_{n=1}^{100}, and it accepted it.


Since OCR was only able to parse one of the words, they have no way of checking the other (besides taking votes from a ton of submissions). Because the summation was not the word that was checked, you could enter anything.

It's an interesting problem. If they make it too clear that they don't know the right answer, quality of submissions will plummet. If they keep the current form, then usability suffers.


There's actually a game of this on 4chan.org/b/. There's a pattern to knowing which word is known and which isn't, because the known word is put through a recognizable filter and the unknown one isn't (most of the time). They then put in a specific racial slur for the unknown word, hoping that it'll slip through often enough to make it into the final OCR'd text.


If you succeed at the control (the one it already thinks it knows) it takes your other answer as input for future viewers.


Only if multiple people give the same response, though.
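A rough sketch of the two-word scheme as this subthread describes it: the control word gates the submission, and the guess for the unknown word only counts once several people agree. This is not reCAPTCHA's actual code; the function names and the `MIN_AGREEING_VOTES` threshold are assumptions for illustration.

```javascript
// Assumed threshold: how many matching guesses make a transcription "accepted".
const MIN_AGREEING_VOTES = 3;

function createChallenge(controlAnswer) {
  return { controlAnswer, votes: new Map() };
}

// Returns true if the control word matches; only then is the guess for
// the unknown word recorded as a vote.
function submit(challenge, controlGuess, unknownGuess) {
  if (controlGuess.trim().toLowerCase() !== challenge.controlAnswer) {
    return false;
  }
  const key = unknownGuess.trim().toLowerCase();
  challenge.votes.set(key, (challenge.votes.get(key) || 0) + 1);
  return true;
}

// The unknown word is accepted only once enough people agree on it.
function acceptedTranscription(challenge) {
  for (const [word, count] of challenge.votes) {
    if (count >= MIN_AGREEING_VOTES) return word;
  }
  return null;
}

const challenge = createChallenge("morning");
submit(challenge, "morning", "gibberish"); // passes, one stray vote
submit(challenge, "wrong", "upon");        // fails, vote ignored
submit(challenge, "morning", "upon");
submit(challenge, "Morning", "upon");
submit(challenge, "morning", "Upon");
console.log(acceptedTranscription(challenge)); // "upon"
```

A single gibberish answer passes the check but never reaches the threshold, which is why the coordinated "same wrong word" game mentioned above is the only way to poison the result.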


If the captcha system isn't reliable, a new system for deterring spam registration is in order. Maybe the system could use images to direct the user to move the mouse in a certain pattern and then use mouse gesture detection to ascertain whether it's a human.


This is a BRILLIANT idea!!


What? reCAPTCHA is very predictable. There is one word that always looks the same. It's the word, and then the word blurred and slightly skewed and placed on top of itself. The other is a scanned word (or sometimes symbols or a mathematical equation). That's the bit where they're crowdsourcing book scanning.

You don't have to get it right, and in fact you can type in gibberish if you like, as long as you get the known one right, and the known one is very easy for a human to read.


I know, I know. I like the idea of reCAPTCHA to crowdsource a very hard problem for the common good. I know how it works in and out. It is awesome.

The only problem: things that sound nice in theory do not always work well in practice. I ran into multiple occasions where reCAPTCHA presented me with impossible-to-recognize scans. One extreme example was a printed texture pattern with not a single letter on it.

Your average, non-tech-savvy customers will never know how the system works. They shouldn't either, otherwise the whole idea of reCAPTCHA would have been defeated.

Judging by the volume of complaints about it on various support forums, it is really a bad idea to deploy it in production when you care even slightly about usability, user experience, or simply conversion rate.

Unless your application is a must-have.


Interesting, I'd never really given it a second thought. I appreciated the drop-in ease bit and hadn't considered its effect on conversion.



It looks like they've artificially weakened it:

> reCAPTCHA uses a “one-off” system. That means a letter in a word can be incorrect, and it will still be accepted by the system.


It is also only verifying 1 word out of the pair; usually the more garbled one.
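To make the "one-off" rule concrete, here is a small sketch. It assumes the quoted sentence means "same length, at most one wrong letter"; the real service may define the tolerance differently, and `withinOneLetter` is a name invented here.

```javascript
// Assumed reading of "one-off": same length, at most one differing letter.
function withinOneLetter(guess, answer) {
  if (guess.length !== answer.length) return false;
  let mismatches = 0;
  for (let i = 0; i < answer.length; i++) {
    if (guess[i] !== answer[i]) mismatches++;
  }
  return mismatches <= 1;
}

console.log(withinOneLetter("captcha", "captcha")); // true
console.log(withinOneLetter("capteha", "captcha")); // true, one letter wrong
console.log(withinOneLetter("copteha", "captcha")); // false, two letters wrong
```

Under this reading, a 7-letter lowercase word has 1 + 7 × 25 = 176 accepted spellings instead of one, which is the sense in which the check is weakened.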


<!-- Layer contains table with 5 cols (width is divide by no of chars) contains Captcha Chars -->

It was, however, nice of them to label the spot where my breaker should scrape the characters. (Bonus points for that label being pseudo-code for my scraper)


There is a big difference between using a CAPTCHA against bots / spam and using a CAPTCHA to stop something like a brute-force attack (see the nice implementation by Gmail, which shows a CAPTCHA after several incorrect password attempts. Nice compromise).
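A minimal sketch of that compromise, assuming a per-account failure counter. `MAX_FREE_ATTEMPTS` and the function names are invented for illustration and say nothing about how Gmail actually implements it.

```javascript
// MAX_FREE_ATTEMPTS is an assumed threshold, not Gmail's actual value.
const MAX_FREE_ATTEMPTS = 3;
const failedAttempts = new Map(); // account name -> consecutive failures

// Count consecutive failures; a successful login resets the counter.
function recordLogin(account, passwordOk) {
  if (passwordOk) {
    failedAttempts.delete(account);
  } else {
    failedAttempts.set(account, (failedAttempts.get(account) || 0) + 1);
  }
}

// Only accounts with too many recent failures are shown a CAPTCHA.
function captchaRequired(account) {
  return (failedAttempts.get(account) || 0) >= MAX_FREE_ATTEMPTS;
}

recordLogin("alice", false);
recordLogin("alice", false);
console.log(captchaRequired("alice")); // false
recordLogin("alice", false);
console.log(captchaRequired("alice")); // true
recordLogin("alice", true);
console.log(captchaRequired("alice")); // false
```

Ordinary users who type their password correctly never see the challenge; only the repeated-guessing pattern pays the cost.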


No, clownery like this should be exposed.


"orange" was codinghorror's.


Just for fun, I decided to see how hard it would be to solve this captcha with jQuery. It's worse than I thought.

answer = $("#captchdiv span b").text();

... OMG.


You don't even need jQuery: document.getElementsByTagName("tbody")[1].innerText.replace(/\s/gi,"")


For

tl;dr - http://i52.tinypic.com/2hpjg5v.jpg

;)

Edit: Disable right click ? Surely not .... http://i54.tinypic.com/2mzjcl3.jpg


I was thinking these were individual images, which would be stupid enough, but this....


Heh. I took the advice, went in and checked source expecting to see the captcha string in a hidden form element.

I use NoScript and Sony.com is not on my whitelist (read: I could just select/copy the string).

This was worse than I imagined.


Someone must have caught wind. No need for right click with Firebug. :)



Pure fucking gold:

<td width="34" align="center" valign="top"><span style="font-family:cursive; FONT-SIZE:13.2 pt; color:#FFFFFF; text-decoration:none;"> <b>P</b></span></td>
<td width="34" align="center" valign="bottom"><span style="font-family: cursive; FONT-SIZE:13.2 pt; color: #FFFFFF; text-decoration: none;"> <b>U</b></span></td>
<td width="34" align="center" valign="top"><span style="font-family: cursive; FONT-SIZE:13.2 pt; color: #FFFFFF; text-decoration: none;"> <b>W</b></span></td>
<td width="34" align="center" valign="bottom"><span style="font-family: cursive; FONT-SIZE:13.2 pt; color: #FFFFFF; text-decoration: none;"> <b>W</b></span></td>
<td width="34" align="center"><span style="font-family: cursive; FONT-SIZE:13.2 pt; color: #FFFFFF; text-decoration: none;"> <b>Q</b></span></td>
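To show how little a scraper needs, here is a sketch that reads the answer out of markup like the above without a browser. The `markup` string is a trimmed copy of the quoted cells (attributes shortened, letters unchanged), and `readCaptcha` is a name invented here.

```javascript
// Trimmed copy of the quoted markup: attributes shortened, letters unchanged.
const markup = `
<td valign="top"><span> <b>P</b></span></td>
<td valign="bottom"><span> <b>U</b></span></td>
<td valign="top"><span> <b>W</b></span></td>
<td valign="bottom"><span> <b>W</b></span></td>
<td><span> <b>Q</b></span></td>`;

// Collect the contents of every <b> element, in document order.
function readCaptcha(html) {
  return Array.from(html.matchAll(/<b>([^<]*)<\/b>/g), (m) => m[1].trim()).join("");
}

console.log(readCaptcha(markup)); // "PUWWQ"
```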


This is an excellent example of cargo cult thinking. If we build the airports and control towers that look right, the cargo will flow!


On the bright side, users with terminal browsers, like Lynx, can enter the captcha text :)


I hope Sony is building a centralized IT division after the PSN outage and all these embarrassing websites will be gone soon. If they don't then their CEO should be fired for ignorance.

I would be interested in the internal structure that leads to these incredibly unprofessional results; there has to be something fundamentally wrong.

I'll make a guess, because when you think about it, this whole thing is generated with CSS and HTML. It is so funny that with all these horizontal lines it looks like a real captcha, but it isn't. This was just built to satisfy some manager who didn't accept or know that their team isn't trained in image generation techniques. There was some deadline that needed to be met and they were forced to do this.


I seem to recall a comment somewhere from a Sony employee about how they already have one, it's just that many websites are built by the individual product teams (often marketing).


Or just cut and paste:

D F T L F

And they use a goofy font that is hard for humans to read but would be trivial for a machine to recognize, if it actually had to. Too funny.


It's funny because this captcha probably required more work (disable right clicks, render tables, etc.) in contrast to a proper one, i.e. assuming they were allowed to use an image generation library.


They just don't get the right people to do this kind of thing. I used to work for a Japanese company, and to me it seems like they don't consider skilled IT workers important.


All CAPTCHAs are nothing more than a cat and mouse game where the mice are /always/ faster than the cats.

The point CAPTCHA developers always seem to miss is that you must render the visual to the screen, or the audio to the speakers, so that the end user can process the information and pass the test.

These tests, by design, are finite in the case of human generation or algorithmic in the case of random text / pictures.

No matter the case, automation developers have more available options to pass the tests than the CAPTCHA developers have to generate. Why?

Because an automation developer who deems an application's data valuable enough to acquire will not rely on technology alone to solve the problem. If they are unable to develop an efficient and reliable technology to pass the test for the intended application, they will employ a semi-automated approach that involves real live humans.

If the automation developer outsources this semi-automated component, there exist services that employ hundreds of individuals per shift to do nothing but solve CAPTCHA challenges through an API with the automation developer's system. These services cost less than $2 per 1,000 /solved/ CAPTCHAs. If the automation developer is of any stature, they will have their own facility and the price comes down to $1.50 or less per 1,000 /solved/ CAPTCHAs.

The feedback loop for the CAPTCHA developer is plain broken. The security mechanism is designed to validate that the request is from a real human, and the automation developer, when presented with a technical challenge not worth technically innovating around, will just employ low-cost real humans.
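To put the quoted prices in per-solve terms, a quick back-of-the-envelope sketch. The two rates are the upper bounds given above; the one-million figure is just an example volume chosen here.

```javascript
// Prices per 1,000 solved CAPTCHAs, as quoted in the comment above.
const OUTSOURCED_PER_1000 = 2.0;
const IN_HOUSE_PER_1000 = 1.5;

// Total cost in dollars for a given number of solved CAPTCHAs.
function solvingCost(captchaCount, pricePer1000) {
  return (captchaCount / 1000) * pricePer1000;
}

console.log(solvingCost(1, OUTSOURCED_PER_1000));       // 0.002 dollars per solve
console.log(solvingCost(1000000, OUTSOURCED_PER_1000)); // 2000
console.log(solvingCost(1000000, IN_HOUSE_PER_1000));   // 1500
```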


Human labour is thousands of times slower and thousands of times more expensive (even at $2.00 per 1000) than automated solving. The fact that spammers have been forced to use such inefficient solution techniques is proof of the captcha technique's efficacy.

Consider if these captchas were not in place. By the above logic, there would be thousands of times more spam on sites like Google and Facebook.


While I agree that human labor is thousands of times slower and thousands of times more expensive than an all-computational solution, that is not the point most of the time for automated systems.

In the case of messages and wall posts on Google or Facebook, once a CAPTCHA is solved you get to send N messages before the next one is popped up. Without the CAPTCHA you could only go so fast anyway due to general rate limits, so the impact on spam, for either service, is minimal.

In the case of account creation in general, the limit for automated systems depends on diversified proxy access more than anything else. Additionally, you can only effectively manage so many accounts at one time depending on your needs, and CAPTCHAs do not add more than a few % time delay to the overall account creation process.


Though the semi-automated systems using humans can break CAPTCHAs, there are still scenarios where using it makes sense. For example, to prevent DoS attacks, a CAPTCHA might be employed to route visitors from one part of the site to another. Even if humans are involved in this loop, the delay introduced in solving the CAPTCHA can slow down the attack.


In certain circumstances CAPTCHAs can be effective for preventing DoS attacks. I will concede this point.

However, I will stand by my opinion that the majority of CAPTCHA implementations, no matter how sophisticated, are generally ineffective at preventing what they were implemented to prevent.


Does anyone have a link to a page where this is actually used? It seems too ignorant (even for Sony) to contain the actual captcha value within the source.


It's used on this page here:

http://pro.sony.com/bbsc/ssr/mkt-security/support.form.bbscc...

Which redirects you to a "down for maintenance" notice.


Very ironic that this is on a site for "Sony Security"...


Yeah, lame... now I get why that one guy could hack the PS3's "unhackable" lock.


Here's the problem with Sony's captcha. If you are dealing with a small site that nobody gives a shit about, asking users to answer "what is one plus two" and ALWAYS accepting an answer of 3 is enough. If you are Sony, you need better, because there are many visitors and thus it's an issue. I implemented a honeypot/time analysis/JS assembly tool in a few hours; I'm sure Sony engineers can do better in a day. But I guess not, since Sony is not exactly known for its good web developers.
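A sketch of what the honeypot and time-analysis parts of such a tool might look like (the JS assembly part is not shown). The field name `website` and the 3-second threshold are assumptions for illustration, not details from the commenter's implementation.

```javascript
// Field name and threshold are assumptions for illustration.
const HONEYPOT_FIELD = "website"; // hidden via CSS; humans leave it empty
const MIN_FILL_TIME_MS = 3000;    // faster than this is treated as a bot

// Flags a submission if the hidden field was filled in or the form came
// back sooner than a human could plausibly have typed it.
function isSuspicious(fields, renderedAtMs, submittedAtMs) {
  const honeypotFilled = (fields[HONEYPOT_FIELD] || "").length > 0;
  const tooFast = submittedAtMs - renderedAtMs < MIN_FILL_TIME_MS;
  return honeypotFilled || tooFast;
}

console.log(isSuspicious({ comment: "hi", website: "" }, 0, 12000));            // false
console.log(isSuspicious({ comment: "hi", website: "http://spam" }, 0, 12000)); // true
console.log(isSuspicious({ comment: "hi", website: "" }, 0, 200));              // true
```

Neither check asks the visitor to do anything, which is the appeal over a visible challenge.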


I wonder if this is the result of divvying up the work and outsourcing to various groups? i.e. no individual group knows what is going on, they just have certain objectives they must do in order to get paid. Such as, "Create a web page with 5 random letters that a person needs to type in order to get to the next page..." Perhaps "CAPTCHA" wasn't even in the design description?


No reason for this to be downvoted


I use captchas that set a hidden field value through JavaScript (the user does not enter anything). It takes a second to view source and figure out what you should enter. Despite this, the captcha works 100% and the spam stops. Unless spammers target your own site manually, such a solution is good enough.


How about a version that asks users to enter highlighted text out of a short paragraph?

The portion of text that would be perceived as highlighted would be subjective enough to avoid non-specific, casual attacks, but wouldn't sacrifice readability or ctrl-c-ctrl-v.

e.g. red adjacent magenta vs. magenta adjacent blue


I like the 13.2pt font - 12 or 16 would just be too obvious. But 13.2? Now that is security!


I'm horrified every time I see this.


The point of the captcha is to block fraud and bots. If it is good enough to do this, then who cares how it is implemented?

Obviously it is useless against a bot written to spam the specific sites this is on, but I assume it will stop a generic bot (e.g. a forum-spamming one).


"Come on, that's good enough" is probably the root of most security issues ;)

"Sorry. Security is not optional". -- A KDE developer whose name I can't remember

Edit: added the quote.


Spam prevention is not a matter of security. It is completely orthogonal.


If it's supposed to stop bots why did they make it harder to read for humans than bots?


That cannot be for real? Anyone know where this is used in context?


This is not really a security issue, it's an anti-spam issue.


Right click works in chrome 12.0.742.122 / os x and when you right click on letter x it comes up with an option "Search Google for 'x'"...


That's the funniest thing I've seen all week


I once encountered a

<img src="captcha.cgi?v=ABCDEF">

I laughed a lot.
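Assuming, as the comment implies, that the `v` parameter carried the very text drawn in the image, reading the answer takes one call. The base URL below is a placeholder needed only to parse the relative `src`, and `answerFromSrc` is a name invented here.

```javascript
// The base URL is a placeholder so the relative src can be parsed.
function answerFromSrc(src) {
  return new URL(src, "http://example.invalid/").searchParams.get("v");
}

console.log(answerFromSrc("captcha.cgi?v=ABCDEF")); // "ABCDEF"
```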


They only disable right click and selection and they think people will never know?


"Browser security. It does nothing!"


IBM Software Development Platform? What development tool do they use?


Didn't someone already share this like 2 weeks ago?


So this is why PSN was hacked? Were the user passwords also served to users in plain text and evaluated in JavaScript?


wow... that is really bad


Unbelievable!



