There is no evil like reCAPTCHA (thestoic.me)
557 points by eitland on Aug 5, 2019 | 284 comments



Google did not create reCAPTCHA. They bought it; it was started by Luis von Ahn, who went on to create Duolingo.

When reCAPTCHA was created, the alternative was CAPTCHA, which tried to impede bots but did not generate any social benefit. This was the genius of the original reCAPTCHA concept: the time taken to 'confirm humanity' could be channeled into the socially-useful endeavor of digitizing books. Capture some of the heat emissions of impeding bots for a useful purpose, rather than letting it all go to waste.

Now, yes, Google is using it to train their self-driving car AI, and there's a bunch else happening in it to connect to Google's surveillance apparatus. There's much to legitimately criticize there. I personally don't view training Google's proprietary AI as the same kind of intrinsically altruistic purpose as digitizing the world's pre-digital books.

But putting the entire concept on blast with erroneous history that can be corrected with about 60 seconds on Wikipedia doesn't help the argument at all.


I’m hereby inventing a new rule called Chesterton’s Wild Boar fence, the essence of which is people who don’t have gardens, or don’t hang out at night, would always complain about wild boar fences, as they lack any awareness of the beast and its damage, or they downright believe its mere existence is a myth.


I second this notion. I've been using sharks as an example: Sharks aren't dangerous when swimming, because we don't swim in deep water, because sharks are dangerous. Shark attacks in deep water beaches are fatal at roughly the same rate as riding a bike without a helmet. [1]

The risk of sharks is tempered by our experience with them. Few people swim in deep water beaches (because they have signs saying "Danger! Sharks!") And those that do typically take appropriate precautions and maintain awareness.

Sharks don't want to eat you and do quickly let go of swimmers they attack, but that's irrelevant because the damage has already been done. When I was young, a large amount of education was put into stating that shark attacks were rare, and it's true both in absolute numbers and by comparison with how feared they are. Jaws and knockoffs spread irrational fear in the 70s/80s, and my early 90s childhood came with the counterpressure there, but that counterpressure caused many in my peer group to misunderstand the risks. Shark attacks drop off hard to zero if you're swimming in shallow water. Even at 10 meters, which is not uncommon for surfers, they are a real risk. But surfers spend little time at 10 meters out. All of this forms a balance.

[1] https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3941575/


I think this can be summed up with a single word: incomprehension. Or in the latter example: ignorance


I've been toying with this idea under the general rubric of "manifestation", that is, the distinction in understanding of things which one has experienced manifestly -- directly -- and those one has not.

This covers any number of circumstances -- why travel is broadening, why rare / degenerative / mental health conditions are so frustrating to explain to healthcare providers / family / others, trying to communicate specialised knowledge, historical bias, what the old know that the young do not (and rarely, vice versa). Tacit vs. explicit knowledge. Theory vs. experience.

There's probably far better existing terminology than what I've come up with (Hume, Kant, and Berkeley address this, as does Plato, within philosophy). But it's also a major concern in a highly diverse yet tightly interconnected world.


Saying it "impedes bots" is a little generous; it impedes humans as well. Or rather: it works on a spectrum where bots are at one end (fully obstructed), easily tracked humans at the other (free entry), and humans who disable tracking devices and/or eschew Google services somewhere in the middle (allowed to pass after much hassle).


It definitely impedes Firefox users as opposed to Chrome users.


Yes, it's the only reason I can't use Firefox mobile without having to fire up Google Chrome from time to time, as a lot of sites block Firefox but not Chrome thanks to the stupid reCAPTCHA.


To be fair, having a tracked history does make it easier to prove that you are human.


reCAPTCHA generally only works for me after I do it 3-4 times. Purely because I use a VPN. reCAPTCHA v3, curiously, works just fine when I'm using a VPN (if I allow it to run in the first place).


reCAPTCHA's audio option for the visually impaired (the headphones icon) is very fast compared to clicking storefronts. reCAPTCHA will sometimes deny me the audio option for unspecified reasons, which I venture to guess could be challenged in US courts under the ADA (Americans with Disabilities Act).


Yep, it was originally run by Carnegie Mellon (as you mentioned, by its creator Luis von Ahn and others).

This article also doesn't seem to touch on the newer reCaptcha that tracks you everywhere on a website (you'll notice a little blue box on the bottom right with the logo where this happens), not just on login or user input pages.

There is a lot to criticize about reCaptcha, including privacy concerns for sure, and there were some other posts about it on HN before.


So basically it has been perverted and acts in ways that harm people now, like most things Google touches.


I wonder if in some jurisdiction Google shouldn't pay money for forcing people to train their AI. I imagine it could be possible to do that in Germany, or under some EU laws.


Perhaps in Germany Google can charge users for their services, but waive the fee if the users solve captchas.


Am I the only person here who always entered absolute nonsense for the scanned word? The original reCaptcha had two words, one which was clearly generated and another which was clearly scanned - to "solve" the captcha all you needed to do was to enter the generated word correctly, the other could be literally anything. So I always entered banana or something similar for literally everything.


You're not the only person -- I have a friend who did this, also generally inserting silly words in the side he guessed was scanned from a book.

I have a hunch that von Ahn knew this would happen and the same scan is shown to multiple users before a word is chosen.
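
For illustration, here is a minimal sketch of how such a scheme could tolerate "banana" answers: pass/fail depends only on the control word, and a scanned word is accepted only once enough independent users agree on it. This is not the actual reCAPTCHA implementation; the function names and the `min_votes` and `min_share` thresholds are assumptions made up for the example.

```python
from collections import Counter


def verify(control_answer: str, control_truth: str) -> bool:
    """Pass/fail depends only on the word whose answer is already known."""
    return control_answer.strip().lower() == control_truth.strip().lower()


def accepted_transcription(guesses, min_votes=3, min_share=0.6):
    """Return the agreed reading of a scanned word, or None if there is no consensus yet.

    min_votes and min_share are hypothetical thresholds, not known reCAPTCHA values.
    """
    normalized = [g.strip().lower() for g in guesses if g.strip()]
    if not normalized:
        return None
    word, votes = Counter(normalized).most_common(1)[0]
    if votes >= min_votes and votes / len(normalized) >= min_share:
        return word
    return None


# One joker typing "banana" is simply outvoted by the other users.
guesses = ["banana", "morning", "Morning", "morning ", "morning"]
print(accepted_transcription(guesses))
```

Under these assumptions a single nonsense answer costs nothing; it only matters if a large share of the users shown the same scan all enter the same wrong word.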


reCAPTCHA v2 blocks [1] people with disabilities from accessing basic services on the web, such as registering to vote, paying utilities, filing taxes, or accessing medical services. This practice is likely illegal, and the sites which facilitate it may be legally liable.

reCAPTCHA v3 has no user interface, it only returns a score upon which the site operator can act, often delaying or blocking access [2] to services. In this case the responsibility falls entirely on the site, while Google is no longer at risk of being found liable for the damage caused by its discriminatory service.

reCAPTCHA v3 works best when it is embedded on every page of a site [3]. The service collects detailed interaction data on every website you visit which has implemented it. The extent of tracking is similar to Google Analytics, but you cannot block it, otherwise you lose access to large portions of the web.

The collected data is highly sensitive, it not only contains your browsing history, but a detailed snapshot of your actions on sites. Mouse movements can reveal health issues which affect your motor functions, and your interests and desires are laid bare based on how you interact with content.

Google must be compelled to disclose in the reCAPTCHA privacy policy what data is collected and how that data is used. Journalists have asked Google for years to clarify how the data collected by the reCAPTCHA service is being used, and their answer is always the same: we only use your data to provide the reCAPTCHA service, and it is not used to personalize ads.

The problem is, those are just words from their PR department; the legally binding documents are the privacy policy and the terms of service. reCAPTCHA uses the same privacy policy as the rest of the Google services, which gives them the right to use your data for ad personalization.

You must resist adding reCAPTCHA v2 and v3 to your sites. There are alternatives [4] which could offer the same level of protection for your services, when used the right way. Their implementation may not be as convenient as reCAPTCHA is, but that is the price you must pay to prevent Google from mining our personal data and our every interaction on the web.

People are forced to hand over their personal data to Google at all times, otherwise they face losing access to services, and being excluded from societal processes that are increasingly happening exclusively online.

This is where privacy rights and human rights are violated, and it is upon all of us to make our voices heard, so that existing legislation is enforced, and new laws are put in place to prevent companies from abusing and exploiting us.

Handing over our data to Google must not be a condition to fully participate in society.

[1] https://github.com/w3c/apa/issues/25

[2] https://news.ycombinator.com/item?id=20295333

[3] https://developers.google.com/recaptcha/docs/v3

[4] https://www.w3.org/TR/turingtest/


Couldn't one use U2F as a captcha alternative? Obviously without using information about the stick itself, only the batch attestation, and then throwing the registration away. After all, it needs an interaction in meatspace. Sure, a bot could be engineered to trigger it, but you can't just relay the challenge somewhere and have someone else clear it for you, and even if you build a Lego contraption or whatever to clear your captcha, it's far slower than having many people on a solving service help you.


exactly. google should be banned from all online public services of any kind, since they can't be avoided. it's unreasonable to expect people to shop around for a town to live in that doesn't, and never will, use privacy-invading google services like recaptcha.

i'd even support a ban for other core services like utilities and banking that may not be public entities.


This brings a thought to mind: what if someone created a service that serves Amazon Mechanical Turk tasks as CAPTCHAs, so that you(r website) could make a buck off those people solving captchas?


They did the same thing to collect walking data using the game Ingress, and Goog-411 to collect voice data.

Well specifically, Niantic, which was a google internal thing.


I could swear that close to the launch of Ingress some Niantic employee said something along the lines of, "We're not actually collecting much data. It's all secretly an evil plan to get nerds to exercise," but I can't find a source.


I believe you, but I'm also confident that your quotation was said in jest.


Could an organization do the same thing, but as a non-profit with open datasets? This way everyone benefits.


> but putting the entire concept on blast with erroneous history that can be corrected with about 60 seconds on Wikipedia doesn’t help the article at all.

Nor does an entirely fallacious premise. reCAPTCHA v3 is entirely transparent and non invasive to users. In fact it’s retroactive to help the site admin figure out what to do with the score:

https://developers.google.com/recaptcha/docs/v3


>ReCAPTCAH v3 is entirely transparent and non invasive to users

Except when you don't opt into google tracking you by blocking third party scripts, in which case your life still gets to be hell.


> non invasive to users

... who are opted into fully and completely to all Google tracking and have previously participated in Google's ecosystem.

Pretty odd definition of "entirely fallacious".


That's what they say, but it rarely works out that way for me. And yes, it's probably because I always block all tracking, and use VPNs and/or Tor.


> non invasive to users.

Except for the invasion of my time and attention, used to train Google's AI to get better at recognizing traffic signals. I took that as the main point of the article.


That's v2.

v3 is "invisible" and is supposed to be deployed to every page on the site, and the site is the one who decides how to punish you for not matching their normal audience.


Not just invisible: unlike "invisible reCAPTCHA", which sat somewhere between v2 and v3 and does spawn a challenge on its own, v3 is entirely non-interactive and, as you said, the site admin decides the punishment.


Isn't the author talking about recaptcha v2? My understanding of recaptcha v3 is, that it just gives you a score for how well google can track you and then leaves it up to the site operator to block users who aren't transparent enough.

What I really hate about recaptcha v2 are those artificial delays before loading the next image (which it can happen to several times on a single card). And then in the end you frequently fail, despite answering everything correctly.


> Isn't the author talking about recaptcha v2?

Yes.

> My understanding of recaptcha v3 is, that it just gives you a score for how well google can track you and then leaves it up to the site operator to block users who aren't transparent enough.

Yeah, that's the one where you're supposed to put it on every page of your website so that Google can collect more information on your users. If they can't, they'll return a low score that you'll use to mark users as "bots".

> What I really hate about recaptcha v2 are those artificial delays before loading the next image (which it can happen to several times on a single card). And then in the end you frequently fail, despite answering everything correctly.

I think this is just what Google does when it thinks you're a "bot": i.e. they don't know who you are.


If you have strict privacy features enabled in firefox they block recaptcha and force you to do hard puzzles. It really sucks, but I'm not disabling the privacy features.


> I think this is just what Google does when it thinks you're a "bot": i.e. they don't know who you are.

I do wonder why, though. For a long time, I assumed it was a rate limiter, but then another HN commenter pointed out to me that time is more valuable to humans than bots. Bots can work on multiple captchas in parallel.


My thoughts exactly. A bot doesn't care if it takes 2 seconds to fade the images out and in for another challenge round, but a human viewing it perceives it as a frustrating delay.

I've become accustomed to just closing any page that presents me with a v2 reCAPTCHA.


Unfortunately one of those pages was the Equifax settlement. And other similar “important” sites. I never seem to come across them when it’s a service I could easily quit or avoid.


This might not be so sinister. It so happens that "important" operations, like a class action lawsuit settlement, are also the type of thing you'd particularly want to protect from bots.


Why would anyone protect an "unimportant" site with a captcha? The value of the information, to someone, in bulk, is exactly why some throttling is needed.


Good CAPTCHAs are solved by farming them out to low-paid workers, not bots.


Why? Maybe Google hopes that if reCAPTCHA is sufficiently annoying and prevalent, users will disable any and all tracking extensions they've enabled?


Or maybe they notice that they don't get the same problems on Chrome, so they move to that...


Probably neither explicitly as a human judgment, but both constructively - some A/B experiments noticed favorable numbers went up by doing the crappy thing, and so it has been decided.

I'm waiting for the day when it pops up "find the humans" and there's someone clearly wearing CV-camouflage. For anyone that doesn't get it: you let that human hide, because it could be you in twenty years.


A lot of bots use humans to solve captchas, so the human time is important.


The bot is delegating what it presents to the human, though. So just present the human with images that have already faded in (and if no image has finished fading in, switch to the images from the other captcha).

Edit: Unless, it occurs to me now, Google is monitoring how the user responds to the fade in. When in the fade process do they click the square, for instance.

Still seems like it wouldn't be all that helpful, though.


> I think this is just what Google does when it thinks you're a "bot": i.e. they don't know who you are.

Maybe this is to prevent adversarial learning? If the images reload immediately, then the bot can learn (via a neural network) whether its solution was good or not. If there's a delay, it's the same, but the learning is slowed down by the same factor.

No idea if this is true, it just popped out of my head.


As I said below, this is what I thought too, but another HN'er pointed out that a bot could load up a thousand captchas in parallel and just switch between them while waiting for the fade in.


Sure, they can, but putting a small delay forces an arms race on how big a cluster you need to operate those parallel bots and how cheaply you can offer a cracking service.


Google's either pretty well done with the reCaptcha challenges or counting on a lot of sites to keep using v2 for a while.

There's more saleable data to be collected tracking people's interactions in a website, under the guise of predicting who's a bot.


That's correct. ReCAPTCHA v2 is the one where if your opaque, invisible ML humanity score is below a threshold it makes you squint at weird little photo squares and try to figure out whether the part of a bumper at one edge is enough to count as a car. ReCAPTCHA v3 is the one where if your opaque, invisible ML humanity score is below a threshold the website just breaks in unpredictable ways until you either give up or (for highly technical users) notice that ReCAPTCHA is at fault and interact with enough Google services to raise your score before trying again.


> or (for highly technical users) notice that ReCAPTCHA is at fault and interact with enough Google services to raise your score before trying again

Or, for absolutely everyone: pretend you're blind and click on the audio captcha option. It takes seconds to solve and is trivial 90% of the time :)


ReCAPTCHA v3, the version I was referring to in what you quoted, doesn't have an audio CAPTCHA or any kind of fallback option at all, which is part of what makes it so much worse than v2. (The other part is the privacy implications of adding thorough behavioral tracking to every part of a website.)


V3 actually doesn't have ANY interaction in the first place.

It just scores you and the dev can punish if needed

https://developers.google.com/recaptcha/docs/v3


Curious. Do you have any example of a site using it? I noticed that reCaptcha has become much, much worse recently and I assumed that this must be this v3, but I've never encountered one that would actually have no fallback and prevent me from using the site.


Adidas uses it on yeezy releases when you stand in queue on each page refresh.

Can't think of a "normal" website using it.


Won't work if Google doesn't trust you enough. Which is, by the way, also a major problem.


I delete all cookies outside of a few whitelisted domains every time I close a tab, which is why reCaptcha is so annoying to me in the first place. Still, at least for me, the audio fallback is always there and much less annoying.


Tried it a few times, it just told me something to the tune of "this option is not available", with no button to return to the picture-solving option.


How does that help when Google has decided you’re a bot and you fail no matter how many challenges you solve?


Never happened to me. I've been getting "point out the hydrants" over 5 times in a row sometimes because of the amount of anti-tracking I have set up, but the audio option was always there letting me on the first try.


So, you’re saying that everyone should be happy because you’re ok?


No – I'm saying that I don't have an answer for you if I never encountered what you're describing.


No. It gave me like 17 attempts before it realized that I wasn't a bot.


But if you fail on v3, don't you get v2? For example, I constantly get v2 when using a VPN. It's terrible.


Not directly. The dev decides on your punishment, but sure, most likely by throwing in a v2.


It's such a shame that this poor article is getting attention, given that reCAPTCHA v3 is actually a terrible privacy violation.

Sadly, I don't think the author knows the difference between v2 and v3. The article is definitely not talking about v3.


Yes, I thought the article would be about privacy concerns. Instead it was a rant about v2, not mentioning v3 at all (except for mistaking v2 for v3). Ultimately there was little substance here.


The article really should have focused on this, it's the only valid concern. There's no proof or references behind the later claims like "THE AVERAGE TIME IS OVER 30 SECONDS!" and " It doesn't matter if you're logged into your Google account and allowing all manner of cookies.", not even anecdotal or personal videos of trying to solve a recaptcha like this post[0] had from two months ago.

> I don't think the author knows the difference between v2 and v3

The author doesn't seem to have ever visited google.com/recaptcha to know what Google refers to as different versions of its service. The author instead is talking about "versions of bot detection", i.e. "detect human mouse movement" was v2, "scan your Google history to see how human-like your activities are" is v3, and the author envisions "v4" as doing the silly things from the latter half of the article.

0: https://news.ycombinator.com/item?id=20158386


A lot of articles posted on hacker news are by people who don't really understand what they're talking about, but use a lot of emotionally charged language. A lot of times this is fine, because it's articulating something that bothers a lot of people and wastes their time. But in this case it's absurd. I don't think he understands just how much crap bots were/are putting out there, especially before we had good services like cloudflare and the whole web hosting cloud to filter it. It's not Google's fault every small/medium sized company decided to put important customer services on a web portal which could be easily abused, they just capitalized on it. Sys admins can't stop random executives from making crap decisions like having an inherently insecure system, but at least they can pitch on reCaptcha to fix it a tiny bit. But they wrote an article where they used a lot of CAPS LOCK, so they must be an authority.


> A lot of articles posted on hacker news are by people who don't really understand what they're talking about, but use a lot of emotionally charged language.

This seems to be the best strategy for getting to the top of the front page these days. Even better if the target of the emotionally charged language is Facebook, Google, or Amazon.


Here is my earlier comment when somebody was bashing reCAPTCHA

https://news.ycombinator.com/item?id=20297764

- Running a high-profile, or even low-profile, service which attracts automated or spearhead attacks makes you appreciate reCAPTCHA

- Web services and users are often low value and reCAPTCHA offers a free medicine

- Cleaning up after attacks as a devops/webmaster is a pain in the ass - getting all those alerts at 11:00pm on a Saturday in a bar - and you do not want to cover them from your $100 budget

- reCAPTCHA makes many problems go away for a service provider

- People complaining about reCAPTCHA are often low value users (they do not buy anything) - though I have only subjective point to confirm this

The only long-term solution is moving away from CAPTCHAs to humans strongly authenticated by a trusted party

- Strong human authentication on every service controlled by Apple/Google/Facebook who has vast data to keep bots in the check

- Start paying for the services - though you still need to do CAPTCHA at least once in the card authorization to prevent cardsters

Alternative to reCAPTCHA - though I do not vouch for the quality yet: hcaptcha.com/

Bonus: Micropayments instead of ads or make botting too expensive - welcome to cryptocurrency land


The California DMV website uses google recaptcha for online services.

That's a government function that's hard to apologize for.


We had an alternative to reCAPTCHA for a moment: proof-of-work coin miners. When properly set up, it is also very privacy-friendly, since you can fully self-host it. Yes, it had its downsides, but IMHO fewer downsides than doing work for Google. Same goes for coin miners instead of AdSense.


No you didn’t. A few seconds of proof of work is trivial for a bot, any amount that makes it hard for a bot is unusable to a human. This is why nobody uses hashcash.
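
To make the trade-off concrete, here is a minimal hashcash-style sketch. It is not the real hashcash stamp format and not what any coin miner ran; SHA-256, the `challenge:nonce` layout, and the function names are assumptions chosen for the example. The client has to find a nonce whose hash starts with `difficulty` zero bits, which takes about 2**difficulty hashes on average, while the server verifies with a single hash.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def solve(challenge: str, difficulty: int) -> int:
    """Client side: brute-force a nonce (about 2**difficulty hashes on average)."""
    for nonce in count():
        digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
        if leading_zero_bits(digest) >= difficulty:
            return nonce


def check(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: a single hash verifies the work."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty


nonce = solve("comment-form-1234", 12)  # hypothetical per-request challenge string
print(nonce, check("comment-form-1234", nonce, 12))
```

The objection above is about the one knob this scheme has: `difficulty` costs a phone and a rented server the same number of hashes, so a value a human will wait through is cheap for a bot operator.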


The same way you can buy captcha solves, it's trivial for a bot.


v.3 allows you to dial down the threshold if you start blocking too many humans.

Bots usually fail in the 0.0-0.3 range, so you can run it with a threshold around 0.7-0.8 and most people won't even notice it. Shame about the gross privacy invasion but it's probably not much worse than running Analytics?

It will kill your site speed score on Lighthouse for mobile.
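
As a rough sketch of what "the site decides" means in practice: the server gets back a verification result with a score between 0.0 (likely a bot) and 1.0 (likely human), and has to map it to an outcome itself. The `success`, `score`, and `action` field names follow the v3 docs linked elsewhere in this thread; the `decide` function, its return values, and the 0.5 default are assumptions for illustration only.

```python
def decide(verification: dict, expected_action: str, threshold: float = 0.5) -> str:
    """Map an already-parsed v3 verification response to 'allow', 'challenge', or 'block'."""
    if not verification.get("success"):
        return "block"  # token missing, expired, or invalid
    if verification.get("action") != expected_action:
        return "block"  # token was issued for a different page or action
    score = float(verification.get("score", 0.0))
    if score >= threshold:
        return "allow"
    return "challenge"  # e.g. fall back to a v2 widget or an email confirmation


response = {"success": True, "score": 0.4, "action": "login"}
print(decide(response, "login"))                 # default threshold
print(decide(response, "login", threshold=0.3))  # dialed down
```

Dialing the threshold down lets more low-scoring visitors through, which is the knob described above; what happens to everyone below it is entirely the site's choice.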


With some of these I'm left debating what is a traffic light. Is the pole itself a traffic light? I really don't know.

Also when it asks to select all the cars. Is a bus a car? Is a truck a car? I really don't know what it is expecting and I must pick wrong as I often fail.


Perhaps that's Google's intention. They are using your human intelligence to make educated guesses, rather than using AI. Your choices will be added to their dataset.


Yes, but my guess would be much more educated if they would tell me the question in sufficient detail.


The worst part is the self doubt afterwards when you get presented with ANOTHER reCAPTCHA, leaving you questioning whether you got something wrong.


It's very frustrating that there's no "your AI is an idiot, there isn't one" button. I saw an example of one that said "click the tiles containing the bus" where it was clear what the AI thought was a bus, but it definitely wasn't.


This reminds me of Janelle C. Shane's work on AI[1]. Computer vision's been learning based on the photos that people take, not the visions that people see, and this is giving AI a preference for photogenics instead of reality.

People generally don't take pictures of rolling green hillsides. But they very often take pictures of rolling green hillsides with sheep on them. So if you ask the robot to draw a picture of rolling green hillsides, it will include sheep. Or, if you ask it to draw a picture of the savanna, it will want to include giraffes.

Now you're being asked to find a bus in a photo without a bus because it's a street scene, and every street scene has a bus in it.

I haven't read her book, yet, but her Twitter[2] is often full of amusing anecdotes like this.

[1] https://aiweirdness.com/

[2] https://twitter.com/JanelleCShane


> People generally don't take pictures of rolling green hillsides. But they very often take pictures of rolling green hillsides with sheep on them. So if you ask the robot to draw a picture of rolling green hillsides, it will include sheep. Or, if you ask it to draw a picture of the savanna, it will want to include giraffes.

Don't humans do this too, though? If I asked someone to draw a picture of rolling green hills, they may well add sheep as an additional detail.


Well, personally, I had the "bliss" image from the Windows wallpaper collection in my head while I was writing this. But I'm cognizant that most of the photos of hillsides I took in Ireland had sheep in them.


I wonder if that would still be true, though, if "bliss" wasn't a default Windows wallpaper. In other words, you're still referring to common pictures, you're just particularly biased to one in particular that you've seen a lot.

I haven't done the experiment, but I'd posit that if you walked up to a group of 8-year-old children, gave them crayons, and asked them to draw pictures of "rolling hills", a significant portion would add sheep, cows, flowers, or some other details—even though a majority of rolling hills in the world don't have any of these features.


Thank you, I ordered her book on the strength of your comment!


I wonder if you ran into something I deliberately mislabeled as a bus.

Your goal shouldn't be to answer the question earnestly, but to confirm the machine's biases. Going with the flow is expedient, as well as giving Google less support.


In this case, it was obvious what it wanted someone to do.

It gets more frustrating when it's less clear. Is "click the hills" with a picture of a mountain a mistake, or a trick? Should I click all the tiles that contain a bus, even if it's only one pixel, or should I only click the ones that mostly contain a bus? etc.


The most frustrating is “select all pictures with bridges” when not one picture contains more than one bridge.


are you joking?


No, it's pretty regular. It will even tell you to "try again" if you correctly don't select anything.


They probably assume that the vast majority of the population are not massive pedants and will do what they expect.


They probably don't assume anything, and it's just a corner case in their automated system.


Right, but it's why fixing it is not a priority.


That is, they assume the vast majority of the population is indistinguishable from their AI. I believe that view also informs their customer service procedures.


I think it's pretty clear what it's asking for and you are misinterpreting the request.


I actually thought I experienced that this morning. But then I looked closely, and there were actually photos containing sections of what I was left to assume were actually bridges. So I think you're technically wrong, but the spirit of your comment is correct!


I always see complaints about recaptcha like this, but I've never experienced this struggle, certainly not to such a degree as to be outright frustrating. It's always seemed pretty obvious to me what the right answers are. My personal take on your examples is that a traffic light's pole is not a traffic light, but any part of the box that contains the lamps/LEDs is (even if facing away from the camera), and that neither large trucks nor buses are cars, but vans, pickups, and SUVs are.

Hopefully these rules of thumb will help someone reading this find these captchas less frustrating.


I’ve had a blog for about a decade now - I wanted to host it myself rather than use some blogging infrastructure, since I was afraid that blogging infrastructures would become evil in the future (see: Medium). I also wanted a comment section, but as soon as I turned it on, I got hundreds of spam messages every. single. day. I shut it down and started working on adding captcha support. My custom captcha solution did pretty much nothing to slow down the spammers, and I realized how much time and effort I was going to have to spend developing and maintaining it… I decided to hand the reins over to the fine folks at Google, instead. If they want to recoup their costs by having visitors on my site spend a few seconds training self-driving cars, that’s a trade-off that’s worth it. It’s also worth noting that the article doesn’t actually point out anything that’s particularly evil - just points out the potential for some evil in the unspecified future. Although I agree that the best time to tackle evil is now rather than later, I also don’t see any alternative for recaptcha, so I’m going to continue to use it and hope for the best.


I have a small website. The spammers found it quickly. I added a simple form field, e.g. "name", that was hidden with CSS. If it was filled in, the request was rejected, but with a 200 response. Spam disappeared instantly.

I did find out later that some legitimate users were getting rejected due to autofill or something.
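
For anyone who wants to try the hidden-field trick, here is a minimal sketch of the server-side check. The names HONEYPOT_FIELD and handle_comment are made up for illustration, and the form is assumed to arrive as a plain dict; adapt it to whatever framework you use.

```python
# Hypothetical names: HONEYPOT_FIELD and handle_comment are illustrative only.
HONEYPOT_FIELD = "name"  # rendered in the form but hidden with CSS


def handle_comment(form: dict) -> tuple:
    """Return (http_status, stored) for a submitted comment form."""
    if (form.get(HONEYPOT_FIELD) or "").strip():
        # The hidden field was filled in, which a human normally cannot do.
        # Drop the comment silently but still answer 200, so the sender
        # gets no signal that the post was rejected.
        return 200, False
    return 200, True


print(handle_comment({"name": "Buy pills", "body": "spam"}))  # rejected quietly
print(handle_comment({"name": "", "body": "Nice post"}))      # stored
```

Given the autofill problem mentioned above, a field name that browsers are less likely to autofill than "name" is probably a safer choice.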


Perhaps the wordpress method of comments would work for you without a captcha - you allow comments but hold for review any comment that has a potential link in it. Most spammers skip posting links in WP comment sections unless they specifically are trying to market to the site admins.
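
A rough sketch of that hold-for-review rule, in case it helps. This is not WordPress's actual code; the regex is only an assumed, deliberately loose definition of "potential link", and moderation_status is a made-up name.

```python
import re

# Assumed, loose definition of a "potential link": a URL scheme, a "www."
# prefix, or a bare token ending in a few common top-level domains.
LINK_PATTERN = re.compile(
    r"(https?://|www\.|\b[\w-]+\.(com|net|org|io)\b)", re.IGNORECASE
)


def moderation_status(comment: str) -> str:
    """Publish link-free comments, hold anything link-like for manual review."""
    return "pending" if LINK_PATTERN.search(comment) else "published"


print(moderation_status("Great post, thanks!"))
print(moderation_status("visit cheap-pills.com now"))
```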


No they don't, WP sites left unattended this way will amass millions of posts pending approval


> human right of mental comfort

Hahahahahah hahhaha ahhah hah. Lol.

I'm sorry, I just can't help myself. This cultural tendency toward naming everything you prefer as a right is just... It's hilarious to the extent it's not just sad. I'm willing to give the author the benefit of the doubt that it was just hyperbole. In which case, bravo.


That was the exact point when I decided this article was a waste of time. What a ridiculous statement. Even if it was meant to be comedic hyperbole, the effect is that it's impossible to take seriously anything else the author has to say.


Well I suppose "Hahahahahah hahhaha ahhah hah. Lol" is a better example of the kind of thought-provoking serious writing one should aspire to consume.


To be fair, the comment included more significant content than that (unnecessary) outburst.


Yeah I'm not sure whether to flag the comment as an unconstructive personal attack on the author or if it's just within the lines of what we like to see in this community.


If it's marginal enough to give you pause for more than a few seconds, down-vote it. If HN becomes too dry and technical there's always Reddit...


For those of you asking for alternatives, I've tried a few of the things posted in https://kevv.net/you-probably-dont-need-recaptcha/ to some success, but obviously it depends on your use case.


Thanks for your overview!

Does anyone know of a reCAPTCHA v2-like service or software that would help solve such tasks (OCR, object recognition, ...) on a custom-provided dataset?

It could provide an alternative to recaptcha, and a "Mechanical Turk"/crowdsourcing for universities, institutions or companies to help solve some of their repetitive tasks.


My two main frustrations with reCaptcha:

1. No (or at least piss poor) localisation. It asks me to locate English words, sure, but the images are of things familiar only from American films - sorry, movies. ReCaptcha is how I know (and my only use for knowing) what a 'crosswalk' is.

2. Sometimes it's just wrong. But I have to select the images that it incorrectly thinks are a bridge or whatever anyway, otherwise I'm not allowed to log in.

Everyone's one of N top complaints:

- How much of the damn structure counts as a traffic light?!


Yep, they've trained my neuronet that they always show the images in a 3-2-1 or a 3-1-1 sequence, so now even if it shows me something weird, I know I need to select the number of most likely things they want me to. This way I almost always finish it on first try.


Would appreciate thoughts on this alternative that we are developing:

https://github.com/librecaptcha/lc-core

The idea is to develop a framework for Captcha generators. A few sample generators are provided out of the box, but new ones can be written easily. The framework takes care of storing entries in the database, serving them as challenges through an HTTP API, and checking the responses.

From the README, why libreCaptcha:

  * Eliminate dependency on a third-party
  * Respecting user privacy
  * More variety of CAPTCHAs, tailored to your audience

The implementation has a long way to go (it was written by students trying to learn Scala), but would appreciate thoughts on the concept.


If you give me a captcha generator I can give you a captcha solver. You have no idea


Agreed; a determined programmer could solve almost any Captcha given sufficient time and resources.

But we are not trying to create an unsolvable Captcha. For those websites that need something good enough to deter generic bots while not compromising privacy of their users, this might be a good enough alternative to reCaptcha.

Imagine a Docker image which just works with out-of-the-box generators. Those who need more variety could create a custom generator with JavaScript and drop it into a Docker volume.


e.g. the difference between a general intelligence and the AI we have today. Generators that vary what is being asked from site to site make generic solutions much more difficult.


I love competition, but I hate Google.

So I wish them luck while hoping that they’ll go die in a fire.

Manifest v3, tracking everything, this... Only validates my decision to scrap all Google services 2 years ago.


I can’t get behind this google hate. I can think of many companies that aren’t providing any sort of public good.


Recaptcha is a free mechanical turk for Google. Right now they're training waymo's cars, but soon they'll need to train other networks. Prepare for "Click on all images showing intraductal papillary biliary neoplasm". C'mon, don't be lazy


This conspiracy theory has been debunked many times.


What conspiracy theory? That's what Google say it is for:

> Hundreds of millions of CAPTCHAs are solved by people every day. reCAPTCHA makes positive use of this human effort by channeling the time spent solving CAPTCHAs into annotating images and building machine learning datasets. This in turn helps improve maps and solve hard AI problems.

https://www.google.com/recaptcha/intro/v3.html (under "Creation of Value - Help everyone, everywhere - One CAPTCHA at a time.")


The conspiracy theory of "google is doing what they promised to do"? Half the point of recaptcha was classifying images.


Just wondering, but does this explain why the images always seem to be driving related, like traffic lights, crosswalks, bridges, etc.?


The title mentions reCAPTCHA v3 but then the article goes on to rant about the challenges you have to solve. reCAPTCHA v3 is completely automatic, there are no challenges at all. Of course, it's not perfect either, because occasionally actual humans fail the challenge too.


> reCAPTCHA v3 is completely automatic, there are no challenges at all.

Yeah, if you're signed into Chrome and Google has enough information to know who you are (yes, I know the new one is score-based…this just means that anyone who isn't the above is going to get a poor score and be blocked.)


You can actually set the minimum score requirements to whatever you want. The default is 0.5, but there's nothing stopping you setting it to 0.0 and letting everyone through.
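
To make that concrete: as far as I know, v3 only hands your backend a score, and the comparison is your own code. A minimal sketch, assuming the verification response has already been fetched and parsed into a dict with "success" and "score" keys (allow_request is a made-up helper name, and no network call is made here):

```python
def allow_request(verify_response: dict, min_score: float = 0.5) -> bool:
    """Decide whether to let a request through based on a reCAPTCHA v3 score.

    `verify_response` is assumed to be the already-parsed verification
    result containing "success" (bool) and "score" (0.0 to 1.0).
    """
    if not verify_response.get("success", False):
        return False  # token invalid or expired, regardless of score
    return verify_response.get("score", 0.0) >= min_score


sample = {"success": True, "score": 0.3}
print(allow_request(sample))                 # blocked at the 0.5 threshold
print(allow_request(sample, min_score=0.0))  # a threshold of 0.0 lets it through
```

Because the threshold lives in your own code, nothing stops a site from using a different comparison or treating low scores as "needs manual review" instead of a hard block.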


Can you set it to only allow people with score < 0.5?

No, I'm not joking.


The challenge is looking at your privacy and deciding it's not worth a few seconds clicking on store fronts


reCAPTCHA v2 has prevented me from doing so many things. A recent example: I attempted to make a LinkedIn account. When I got to the captcha, it gave me one of the really nasty ones where you wait for each new image square to slowly appear, and almost every new square also contains a bus or whatever it wants me to click. Then after hitting submit, it still thinks I’m a bot, so I get sent back to another captcha, infinitely. I never ended up creating the account. I attribute this to me using ublock and not using google products, but I can’t say for certain, and I’m not changing that just so I can pass a captcha. One of the stated purposes of reCAPTCHA v2 is to prevent actions from taking place altogether by wasting a user/bot’s time, so this was probably one of those cases. Somehow it felt certain I was a bot, so to prevent the signup from being possible it instead sent me on an infinite captcha solving tangent. It would have been nice to know before solving tons of them that this would never work and I was just wasting my time.


Same here. I got a gift card, went to the site to check its value, didn't get a cookie banner but they still hired this third party from the other side of the planet to try and identify which human I am, allegedly the only way to tell whether I'm human (y'know, a 21 digit account number isn't proof enough). The third party decided I'm not. Wanted to contact customer support, guess what? Form broken. Turned off tracking protection, ad blocker, etc. No dice. Tried a second browser, same issue. Called the next day and finally got to the amount... Of course they couldn't reproduce the ReCAPTCHA issue and the form didn't give them an Internal Server Error either. Must be me.

If this were about not being able to see cat pictures, alright, but this is about accessing money that I'm supposed to own. This is so backwards to me.


Which brings us back to the old way of using cash instead of gift cards, it might have a number of other issues, still it remains simpler.


Would downvote this if I could.

1) Google bought reCAPTCHA in the first place. 2) Google's latest captcha isn't even a captcha, you just click a button saying you are human and it analyzes your mouse movement and probably a fingerprint of sorts and you are in.

The article just isn't accurate and seems unnecessarily hateful about things that are not exactly true.


Google's latest reCAPTCHA is v3, which works behind the scenes to generate a 'trustworthiness' score for each user. There is no longer any button for the user to click. You're thinking of v2, I think.


I am talking about V3 it appears, just checked here https://www.google.com/recaptcha/intro/v3.html, at 0:11 on the splash video is the image I am talking about (https://youtu.be/tbvxFW4UJdU?t=11).

I guess I didn't mean a button, but a checkbox.


reCAPTCHA is turning into a punishment for those who don't want to consent to full tracking by Google.

Apple is working on their SSO project (Sign in with Apple); I hope they will also consider the use-case of being able to tell a site that you're a human without sending any information.


That article aside, what are everyone's experiences with reCAPTCHA? We are using it to secure one of our contact forms and get a lot of spam through it. Upon research (googling) I found that reCAPTCHA is "broken", e.g. https://threatpost.com/uncaptcha-googles-recaptchas/140593/


We use it to stop bots from opening thousands of trial accounts. We don't have bots anymore, but instead we have humans who open hundreds of trial accounts hourly. I guess it's an improvement.


What's the service for? I know game miners and some other niche services that attract these kinds of actors. Do you know what they want from your site?


An email provider. Interestingly enough they don't even send anything. They just... sit there. And who knows why. Not really doing anything harmful afaik, but obviously fraudulent so our support still kills them in bulk just in case they were a proper threat waiting for the activation.


Huh, interesting, thanks for sharing. Maybe they're just using it as an address to open up accounts for other stuff?


Perhaps – but I'm pretty sure there's some that don't even receive emails either. It's as if someone was just creating a "standing army" of accounts in preparation to unleash something. Spooky :)


Companies should just use bogofilter and not submission blocking on contact forms if they're annoyed about receiving spam all the time. That and then just go through the detected spam once in a while (once per week perhaps).

Basically the same they'd do with e-mail.
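
To illustrate the kind of statistical filtering meant here, below is a toy word-frequency classifier (naive Bayes with add-one smoothing). It is not bogofilter's algorithm or interface, just a sketch of the same family of technique; the class name and the training messages are invented.

```python
import math
import re
from collections import Counter


class TinyBayesFilter:
    """Toy spam filter: multinomial naive Bayes with add-one smoothing."""

    def __init__(self) -> None:
        self.words = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    @staticmethod
    def tokens(text: str) -> list:
        return re.findall(r"[a-z0-9']+", text.lower())

    def train(self, text: str, label: str) -> None:
        self.words[label].update(self.tokens(text))
        self.messages[label] += 1

    def log_score(self, text: str, label: str) -> float:
        vocab = len(set(self.words["spam"]) | set(self.words["ham"]))
        total = sum(self.words[label].values())
        # Smoothed class prior, then one smoothed likelihood term per word.
        score = math.log(
            (self.messages[label] + 1) / (sum(self.messages.values()) + 2)
        )
        for word in self.tokens(text):
            score += math.log((self.words[label][word] + 1) / (total + vocab + 1))
        return score

    def is_spam(self, text: str) -> bool:
        return self.log_score(text, "spam") > self.log_score(text, "ham")


f = TinyBayesFilter()
f.train("cheap pills buy now", "spam")
f.train("win money buy cheap", "spam")
f.train("my order arrived damaged please help", "ham")
f.train("question about my invoice", "ham")
print(f.is_spam("buy cheap pills"), f.is_spam("help with my order"))
```

The contact form would accept every message, file the ones flagged here into a spam folder, and someone would skim that folder now and then to catch false positives and retrain.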

Several times I was prevented from complaining about something to a company that didn't have a public e-mail address, but only a contact form with captcha, just because that day, google decided they'll fail all my recaptcha attempts, and tell me "sorry we think your computer is sending automated queries" and wouldn't even allow me to try any further.

It really doesn't help if I'm in a negative mindset about the company already. I've completely stopped using shipping companies that put tracking info behind recaptcha, for the same reason, although that's a different thing from contact forms and a bit harder to manage. But contact forms should not be blocked with recaptcha.

Outright blocking communication is in poor taste. Accept all communication and use automated mechanisms to filter through it on your side.

BTW, this is even worse for users that block re-captcha via adblocker or something, because you often lose the entire text you were about to send. So the next attempt you're even angrier.


My experience is pretty much the same as the article.

And because I value my time you risk simply not having me contact you if you use any kind of captcha.


I can’t reliably pass them without altering browser settings (and even then, answering correctly doesn’t seem to matter one way or another shrug).

I switch services whenever it is possible and I see a reCAPTCHA prompt.

(I am a human.)


Absolutely, trying to avoid any website with reCAPTCHA.

If you're a non-Chrome user, don't even try playing with images - Google forces you to click 3-5 times more than Chrome users, it's just stupid.


This! I use FF with uBlock Origin without being signed into Google and it is absurd how many tries it takes to solve the crosswalks, bikes, and storefront riddles. Sometimes I give up due to frustration and open the site in Chrome.


Seems like just another way for Google to funnel non-Chrome users into Chrome


Hate to say this, EU please help.


>Select all pictures of storefronts.

Hmm, if only I could read Bengali, I could tell if this were a storefront or not.


See also: “You (probably) don’t need ReCAPTCHA”[0]

(submitted to HN here: https://news.ycombinator.com/item?id=20158386)

[0]: https://kevv.net/you-probably-dont-need-recaptcha/


reCAPTCHA is the most annoying and presumptuous tracking spyware online. I've come to close websites almost immediately when I discover that they use it. "Discover" is the correct word here, as website developers are usually too careless to provide any noscript hint that reCAPTCHA is even used, and I find myself wondering why that damn website does not work until I check my uMatrix and see the evil that reCAPTCHA is. If the service is really needed, I need to allow stuff and reload pages up to 5 times, as script loads script loads script from other domain loads frame ...

This makes for the most shitty experience ever, when I try to use such a website and I give the middle finger to the person who decided they need to have a reCAPTCHA there and to the person who put it there and I tell them the F word in my mind. The disregard for people's privacy cannot get much worse than with reCAPTCHA.


My payment provider (Braintree) required me to implement reCAPTCHA v3, despite the fact that I had already implemented server-side fraud checks (MaxMind). I didn't have a choice if I wanted to continue taking credit card payments.


Do they specifically require reCAPTCHA v3, or just any sort of bot-prevention CAPTCHA?


They specifically told me to use reCAPTCHA v3. I don't know if they would have accepted another one. I didn't want to take the time to research them. Honestly, from a UX perspective I also preferred the "no user interaction" part of reCAPTCHA.


I'm glad someone is talking about this and we can have a healthy discussion about it. Things like PrivacyPass[0] are a step forward, but those who don't know about such a tool will continue to be 'tortured' and get repetitive strain injury from constantly having to solve recaptcha v2, at least if they browse heavily over Tor and have to pass recaptcha's test multiple times, even after proving countless times that they are not a bot.

[0] https://privacypass.github.io/


The problem with PrivacyPass is that it is a (not nearly as peer-reviewed as Tor) privacy-related cryptosystem that lets you bypass the reCAPTCHA that CloudFlare put in your way in the first place.

Using it with Tor is almost certainly not a good idea because it makes your behavior differ from that of other Tor users, thus compromising your anonymity (and the Tor folks are not in favour of PrivacyPass, because they think the solution is that CloudFlare shouldn't be putting the reCAPTCHA in the way in the first place). And that's assuming that the cryptography is actually solid and there is no way to distinguish between different PrivacyPass users. Tor has decades worth of research put into it -- what level of scrutiny does PrivacyPass have? How many people actually use it and how many have tried to break it?


I've said this before but:

> When 80% of traffic from an IP is malicious and the other 20% is regular traffic, but both sources look like the same traffic (impersonating browser headers, sometimes running headless chromium), what else can you do? Cookies and stateful cookie-like objects, such as privacy pass.


You will find the Google ecosystem is more joined up than you realise. If you use the DNS it will make the search engine results more personal, if you block everything to do with google at the firewall level, you will find reCAPTCHAs don't actually work, you can spend an hour before it comes up with a web page where you have reached the end of all the reCAPTCHAs you could possibly do. If you use Firefox, and write addons, you will find the addons don't work, their website APIs, analytics, absolutely everything to do with Google is joined up so the internet is one giant Google Panopticon. Don't believe me, try it for yourself, lock everything down at the firewall and see for yourself.


> If you use the DNS it will make the search engine results more personal

Google claim not to use DNS in this way:

> Is any of the information collected stored with my Google account?

> No.

> Does Google correlate or combine information from temporary or permanent logs with any personal information that I have provided Google for other services?

> No.

https://developers.google.com/speed/public-dns/faq


turn on comments on your blog without recaptcha, i dare you.


better yet, get rid of the comments altogether. having no comments saves you from having to moderate them and it's just not worth it. if you're writing anything tech related and worthwhile you can just proxy the discussion to HN ;)


Not really a great option. People only really see stuff on HN that's relatively fresh so anyone coming in after the initial wave of engagement dies down won't see any discussion.


Also you can't discuss on old content anyway. It's blocked.


Yeah, somewhere around 2 weeks after a post they completely seal the comments on HN.


There are other techniques that do a reasonable job at stemming the flood. The "hidden field" technique still reduces spam by quite a lot.


I wrote about another technique at the end of my own captcha rant: https://inimino.org/~inimino/blog/kill_captcha

The proposed solution would replace captcha entirely, but to my knowledge nobody has tried it.


How about different revocable secret keys to your mailbox given only to those who you wish to contact you


If people have to request to be able to contact you, it is not a public inbox anymore. This is the way most walled gardens work, you have a separate step before you can interact, so it works, but it lacks some of the affordances of the public inbox model of email or blog posts, such as allowing anonymity.


That technique hasn't worked for years. Try setting up a vanilla WordPress installation with the most popular forms plugins. The only anti-spam measure that works is reCaptcha v. 3.


From experience, Akismet works well, so reCaptcha is not the only anti-spam measure that works for WordPress.



For WordPress there are plugins like Akismet (a service), Antispam Bee (local), etc, that are pretty good at filtering spam without the need to display annoying captchas.


While sending comment bodies to these services :) what a privacy


that's just a tell that you value ease of moderation and selling your potential community members out to G more than inviting new members to your community. certainly telling of the person running the blog.

i've had success with bayesian filters and shadowbanning myself, but it does require some effort.


It's quite the rant, meanwhile I don't quite remember the last time I did something where I had to solve a captcha that wasn't the "click once" one.


Try using Tor for a day or two, you'll get reCAPTCHAs on almost every website and in many cases you need to fill multiple out with increasing levels of distorted images. The best one is CloudFlare which can not only ask you to solve a reCAPTCHA once -- but several times when trying to load a single page. And sometimes Google will even refuse to give you a challenge at all because your exit IP is "especially dangerous"!

If 10% of web developers used Tor on weekends, no website would use reCAPTCHA because they'd realise how painful it is to certain users. I run a Tor relay (non-exit) at home, and now I get more reCAPTCHA even though there's no possible reason to assume my home IP is "bad". I'm still going to run my Tor relay -- I just think it's interesting to note that users are being punished by a giant MITM-as-a-Service company for trying to help other people use the internet anonymously.


Thank Cloudflare


To offer a counter-experience, I don't remember the last time I actually HAD the "click once" captcha. It's always "click the buses/traffic lights/store fronts" etc.


Your experience will differ based on how much Google tracking you block. If you're not letting them surveil your every move, they're less convinced you're human by default (or perhaps spite).


I have a hard time assigning malice to recaptcha's "more blocking = lower score" behavior, because while it's a fun conspiratorial position, it's also true that the more you block, the less you look like the average user who doesn't. There is also less info they can pull from to determine how likely it is that you're an actual person, so of course people without a trail are going to be more suspicious.


Wow this makes sense now. I switched to Firefox as my primary browser 7-8 weeks ago and enabled strict blocking; and as of the last ~week, I've been greeted with reCAPTCHAs on a ton of websites which previously just let me in.


I have lots of blocking turned on.

I imagine it's partly because I don't block cookies (I whitelist sites that get to store them across sessions and everything else is then session only).


Same here. I'm pretty aggressive in trying to block trackers (ublock, Firefox containers, privacy badger), so perhaps it's due to that.


I have to solve captchas all the time these days; typically, several in a row. It's aggravating as get-out.


I get the puzzles quite often, pretty much every time I deal with a captcha. Private mode browsing, logging in from my pc, firefox vs chrome... I'm not even particularly tracking sensitive. I don't even have extensions more extreme than ublock origin and a strict popup blocker on firefox.


Firefox user, rolling with uBlock and blocking all Google cookies.

Picture captcha every time.


Try setting privacy.resistFingerprinting for the extra challenge.


I've found that breaks things in subtle ways. I have a Plex server connected to my terrestrial antenna to watch TV and the TV guide was showing everything an hour out and I could not for the life of me work out why. Turns out that "privacy.resistFingerprinting" makes your JS timezone UTC, whereas I'm BST.
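
The one-hour shift is easy to reproduce outside the browser. A small sketch, assuming the guide stores start times in UTC and formats them in whatever zone the client reports; the programme time is invented and BST is modelled as a fixed UTC+1 offset.

```python
from datetime import datetime, timedelta, timezone

UTC = timezone.utc
BST = timezone(timedelta(hours=1), "BST")  # British Summer Time, UTC+1

# Hypothetical programme start time, stored by the server in UTC.
start = datetime(2019, 8, 5, 19, 0, tzinfo=UTC)

# What the guide shows when the browser reports the real zone...
shown_with_real_zone = start.astimezone(BST).strftime("%H:%M")
# ...and what it shows when the browser claims to be in UTC.
shown_when_reporting_utc = start.astimezone(UTC).strftime("%H:%M")

print(shown_with_real_zone, shown_when_reporting_utc)  # 20:00 19:00
```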

I'm not sure why they bother, since I'd find it more suspicious if say someone is coming from a Russian IP address but has UTC set and not a Russian timezone...


You absolutely should open an issue about this on FF's issue tracker. Assuming you're not hiding your IP this indeed is actually privacy.helpFingerprinting and should be fixed.


But surely it's by design?


Then it's bad design.


if you aren't logged into a google account, you will always get escalated to puzzle challenges


Or if you are using tor, are blocking tracking, etc. I regularly have to solve 5 or 6 puzzle challenges in a row.

I would happily pay a monthly fee to get around these ridiculous captchas, even though it's absurd to have to do so.


Is that particularly surprising though? Coming from a TOR exit node where a bunch of spam also comes from and blocking tracking so it looks like you're appearing on the site from nowhere... Both of those are pretty suspicious and reasonable things to crater your humanness score.


Except solving reCAPTCHA shouldn't be necessary in order to read a website. I get (though still don't like) the justification for any modification actions, but GETs shouldn't ever trigger a reCAPTCHA check.

Of course, the Tor folks told the CloudFlare folks about this many years ago and CloudFlare still acts as a giant censorship machine and continues to block anonymous users from reading content on the internet. Not to worry though -- you can install their extension[+] to "protect your privacy" to bypass the reCAPTCHA that CloudFlare themselves erected in front of other people's websites! It's definitely not in any way comparable to an arsonist selling fire insurance as a side gig -- at least with fire insurance you actually got something out of the exchange!

[+] Which does have a paper that explains the security of the cryptosystem, but a single paper does not make a protocol secure by default. I'm not a cryptographer, but the Tor folks did raise some concerns in the issue where PrivacyPass was discussed, and there's no doubt that combining Tor with a system that is nowhere nearly as battle-tested should be a major point of concern.


Many site operators try to spend the absolute least amount of money on servers, so it's common for simple "GET request spam" DDOS attacks to take down a website (especially on dynamic, DB-driven sites). CF in this situation is helping these site owners take the easy road and recaptcha DDOS attacks instead of scaling up servers or having to implement smart caching strategies.


Let me present an alternative framing: A business doesn't have to allow you to access and use its content in any way you would prefer to access and use it. Part of the bargain of the web is you get a lot of content for free at your fingertips because someone else is paying for the servers. Right now that's mostly ad money because it's simple to add and doesn't require companies to change their content much. Ad blocking, tracking blocking, and anonymizers are all circumventing the funding model that makes a lot of the web possible.

Do I wish there was an alternative to aggressive ads and tracking? Hell yes. Do I want to pay every website individually for what I view? Nope, and companies don't either because it would massively hurt their growth for people coming in new.


I don't know why you're talking about online advertising. People use Tor for a wide variety of reasons, many of which are completely unrelated to whatever business model the target website has. In addition, my comment was about CloudFlare (a MITM) putting reCAPTCHA on other people's websites -- I don't see how advertising is even slightly related to that topic.

I would argue a "better" framing designed to emotionally manipulate would be "why are you trying to block people in oppressive regimes from being able to read about the outside world and organise themselves, putting them in danger of being murdered by their government"? But it would be dishonest to make the discussion about "why do you want to kill people", just as it is dishonest to make the discussion about website business models.


CloudFlare isn't just going around randomly putting their caching/filtering in front of websites they're choosing to have it there though. The sites are choosing it.

Advertising is related here because recaptcha uses tracking, which is primarily used for advertising, as a factor in determining its score for users, and also because blocking ads/tracking is part of the cause behind people's issues with recaptcha.


It is also problematic when you wish to use external download managers (which I always do).


I've had the picture ones once or twice this year and it only took a couple seconds. The pictures don't take much longer than the text did for me but I wonder if it gets worse on certain configurations (I've heard recaptcha is super bad if you use TOR).


Chrome user?


What's the next best alternative to reCaptcha? this is something I will be in need of soon.


People on the other side of the world that solve captchas for $1 an hour. Sadly, I’m not kidding.


Companies use recaptcha in inappropriate ways. Especially on contact forms. Imagine you'd have to solve a captcha before sending every mail (or possibly be rejected completely).

Contact forms should just send an e-mail and let the e-mail's content based statistical filter deal with spam.

Blocking people from communicating with your company based on Google's whims is really not smart. It's giving too much power to Google.


I got dragged into clickbait, read 15 paragraphs of fluff till I actually got to the point of the article, and now I hate myself.


Another technology that needs to be regulated and banned in many cases. Paying bills and handling finances online should be accessible to everyone. This prevents it from being so. It may already be a violation of the ADA. After all, how is a blind person supposed to pass? Impossible. For other websites, well they can do whatever. The internet is mostly a cesspool and anything that can't be visited with JavaScript turned off is not worth visiting. That includes "clever" SPA blogs that could just be static sites and all types of other garbage. Maybe one in a thousand or one in a million sites other than finance, shopping, etc might be an exception to that but no website with recaptcha is. I wonder if we can sue Google for ADA violations. I'm sure there are plenty of disabled people on the internet for a class action.


> After all, how is a blind person supposed to pass?

There's an audio captcha option.


I have a friend who is both blind and deaf: audio is also out. It is possible to use a braille terminal to access the web, so while I don't know if he uses the internet, it is possible.

Okay, he isn't really a friend, I only met him once. We have a friend in common who speaks sign language and was able to translate. Seeing him read sign language with his hand was interesting.


I haven't seen that option in the picture captchas, probably because it can now easily be defeated. So the question still stands.


So bots can request billing and finance services and take them down?


What's the alternative to the recaptchas? Is the alternative just to use a different captcha service?


CAPTCHA is a divisive topic on HN, so you'll get a lot of people suggesting it's vital, and a lot saying it just isn't, depending on their own experiences.

"You probably don't need ReCAPTCHA" [0] is an article discussing techniques that has had decent discussion [1] on HN before.

[0] https://kevv.net/you-probably-dont-need-recaptcha/

[1] https://news.ycombinator.com/item?id=20158386


Proof of Work is one that can help, but the HN crowd does not like those.


There's plenty to write about this, but this is just a rambling rant that struggles to stay coherent


This article seems to be confused about reCaptcha v2 vs. v3, as others have mentioned. That said, the broader point about the amount of time they take and the fact that Google is farming out labeling has some validity. It seems like the value to Google is high enough that they could easily remit some money or value to the people taking them.

There's an argument to be made that this would incentivize even more investment in bots that can pass them. My reply: Google should just create a bounty for anyone who successfully beats it worth more than the black hat value.


I wish they mentioned https://github.com/dessant/buster which I love using


I joke about signs of the robot apocalypse eventually showing-up in reCAPTCHA:

"Click on the photos of humans with weapons."

Sometimes I feel like I'm only half-joking, though.


life's too short to solve these things - i view it as the equivalent of a rude host at a restaurant (meaning I'll take my business elsewhere)


I just tweeted a video of me doing nonsensical "Rotate the ball" puzzles to log in to Hotels.com. I couldn't even tell what the pictures were! https://twitter.com/AustinZHenley/status/1158430726888407040


The post talks about reCAPTCHA v3, but shouldn't it be v2? Correct me if I'm wrong, but as far as I remember reCAPTCHA v3 does NOTHING with the user; it only tells the admin what it thinks about the user, and the admin can then spawn a normal reCAPTCHA v2 if needed.
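If that reading is right, the site's side of it boils down to a threshold check. A rough sketch, assuming the score arrives as a float between 0.0 and 1.0 with higher meaning "more likely human"; the 0.5 cut-off, the function name, and the return labels are made up for illustration:

```python
def next_step(score: float, threshold: float = 0.5) -> str:
    """Decide what to do with a visitor given a v3-style risk score.

    The threshold and the returned labels are illustrative assumptions.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0.0 and 1.0")
    # High score: let the request through. Low score: fall back to a visible challenge.
    return "allow" if score >= threshold else "show_v2_challenge"


if __name__ == "__main__":
    for s in (0.9, 0.5, 0.1):
        print(s, next_step(s))
```

The point is that v3 itself never interrupts anyone; whatever friction the user sees is a decision made by the site with that number.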


Google moved puzzles from interviews onto random Internet users ;-) There must be some sophisticated internal metrics registering increase of puzzle takers somewhere, with a hidden goal of increasing general intelligence levels, adversarial-style. It's the only scientific explanation! :D


I didn't realize how bad reCAPTCHA is until I started trying to protect my privacy.

The internet became practically unusable thanks to the constant, unsolvable CAPTCHAs. You can click the correct image tiles until your finger falls off but you still won't get through.


I agree completely. I've just stopped using any website which asks me to do reCAPTCHA. In fact, I've got uBlock to block it from even loading. If you use reCAPTCHA for any reason then you don't deserve my business.


One of my favorite takes on captchas is John Mulaney's SNL skit from the past year: https://youtu.be/en5_JrcSTcU?t=337


Recaptcha is now a rate limiting service for people not signed into their Google account. That's about all it does. The score on v3 falls through the floor if you block third-party Google cookies.


I agree that reCAPTCHA (all versions) is terrible. (At least, sometimes it is possible to avoid the problems by selecting the audio CAPTCHA, which tends to work better as far as I can tell.)

Instead, use a protocol-independent CAPTCHA. It is a SASL mechanism that sends a challenge of plain ASCII text (which may include line breaks), accepts a single response of plain ASCII text, and then lets the server decide whether or not the response is acceptable. A similar thing can be done with a simple HTML form, but using SASL would allow it to work with any protocol, and with a command-line interface just as well as an HTML interface.
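To make the shape of that exchange concrete, here is a toy sketch of the challenge and response step only. It is not a SASL implementation; the arithmetic question, the function names, and the whitespace handling are my own assumptions, chosen just to show "ASCII text out, one ASCII line back, server decides".

```python
import random


def make_challenge(rng: random.Random) -> tuple[str, str]:
    """Return (challenge_text, expected_answer); the challenge is plain ASCII and may span lines."""
    a, b = rng.randint(1, 9), rng.randint(1, 9)
    challenge = f"What is {a} plus {b}?\nAnswer with digits only."
    return challenge, str(a + b)


def check_response(expected: str, response: str) -> bool:
    """Server side: accept a single ASCII response, ignoring surrounding whitespace."""
    return response.isascii() and response.strip() == expected


if __name__ == "__main__":
    challenge, expected = make_challenge(random.Random())
    print(challenge)
    print(check_response(expected, input("> ")))
```

Because both sides are plain text, the same two functions could sit behind an HTML form, a terminal prompt, or any other protocol that can carry a string each way. A question this simple would obviously not stop a determined bot; the challenge generator is the part you would swap out.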


Didn't Facebook try "Is this your friend Bob?" (showing you a picture of some random people) as their forgot password captcha, in the past?

Now that was evil!


In India it is common to set your profile pic as flowers and little girls for women, supercars and superbikes for men.


Not that I would do it, but couldn't someone create a bot to purposely give incorrect results and screw up Google's AI learning machine?


I’m curious, what’s the best option for protecting web forms if not using reCAPTCHA? Specifically for things like account sign ups?


It depends what value there is in an account, but for a service I run, I just let the bots sign up accounts.

Accounts don't really cost me anything, and they get automatically deactivated if they didn't get any activity in the first 30 days anyway. Activity on my service costs money, so if someone wants to make a bot that pays me money, I have no problem with that.

The only time I'd consider implementing something like reCAPTCHA is if I was giving something away for free (e.g. a free trial) such that a signup actually had a cost for me.
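The cleanup rule described above is simple enough to sketch. This assumes you store a creation time and the time of first activity per account; the field names and the function are hypothetical, only the 30-day window comes from the comment.

```python
from datetime import datetime, timedelta
from typing import Optional

GRACE_PERIOD = timedelta(days=30)  # the "first 30 days" window


def should_deactivate(
    created_at: datetime,
    first_activity_at: Optional[datetime],
    now: datetime,
) -> bool:
    """True if the account saw no activity within its first 30 days and that window has passed."""
    deadline = created_at + GRACE_PERIOD
    had_early_activity = first_activity_at is not None and first_activity_at <= deadline
    return not had_early_activity and now > deadline


if __name__ == "__main__":
    created = datetime(2019, 1, 1)
    print(should_deactivate(created, None, datetime(2019, 3, 1)))
    print(should_deactivate(created, datetime(2019, 1, 10), datetime(2019, 3, 1)))
```

Run periodically over all accounts, a check like this makes bot signups self-cleaning without putting any challenge in front of real users.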


I was more concerned about triggering activation emails to people


If you collect email addresses, then yeah that's a concern. Then again, if you send a single activation email and never send another email unless the link is clicked, there's no value to the bot in signing up accounts, so it's unlikely to be a major problem.
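As a sketch of that policy (the class, field names, and email kinds are all invented for illustration): one activation email per signup, and nothing else until the link has been clicked.

```python
from dataclasses import dataclass


@dataclass
class Signup:
    email: str
    activation_sent: bool = False  # set once the single activation email goes out
    confirmed: bool = False        # set when the activation link is clicked


def may_send(signup: Signup, kind: str) -> bool:
    """Allow exactly one activation email; allow anything else only after confirmation."""
    if kind == "activation":
        return not signup.activation_sent
    return signup.confirmed


if __name__ == "__main__":
    s = Signup("someone@example.com")
    print(may_send(s, "activation"))   # first activation email is allowed
    s.activation_sent = True
    print(may_send(s, "activation"))   # no repeat
    print(may_send(s, "newsletter"))   # nothing else until confirmed
```

With that gate in place, the most a bot can do with someone else's address is trigger one email, which limits its usefulness for harassment or spam.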


If recaptcha is so good at finding bots, then why does google still show ads to bots?


reCAPTCHA on a site forces me to use Google and quite a few sites now use it. There are other ways of detecting bots that aren't as intrusive nor require you to allow Google everywhere.


Ugh, can't stand reCAPTCHA. But it seems like a necessary evil.


the tone of this article is awful. it also has the history of recaptcha wrong.

I can't imagine it actually taking 30 seconds to solve a reCAPTCHA. That needs a citation.


In Firefox, log out of your Google account, block cookies and enable the anti-fingerprinting features. Or just browse with Tor Browser.

You will start getting challenged more often, you'll find you're asked to solve several multi-select challenges in a row even if you get them right, the multi-select challenges will replace tiles after you select them, and the tiles will start fading in and out very slowly.

See https://www.youtube.com/watch?v=en5KSZSpDFY for an example video.

These are (AFAIK) intentional features deployed by Google - you just don't see them if they can already track you via something like your Google account.


I have noticed recently that when using Firefox the need to repeatedly spot traffic lights or fire hydrants 5 or 6 times in a row seems to have gone away. It now seems that I only need to do it once, so perhaps someone at Google fixed the glitch?


Ok I did as you suggested and then went to https://www.google.com/recaptcha/api2/demo

Took me all of 5 seconds to solve


The most insidious thing about this two-tiered surveillance society is that its effects are invisible to people in the "good" class, and they insist on remaining ignorant of what happens if you're not in that class.

Google still has plenty of other tracking points on you. And of course reCAPTCHA looks reasonable to a prospective developer - it wouldn't be adopted otherwise!

If someone tells you it routinely takes 30 seconds to solve, you should really just accept their experience rather than discarding it with "works fine for me!". If you still need to see it with your own eyes, go set up Tor Browser and become one of the undesirables, rather than just imagining. You just might end up adopting the marginalized view.


Try using Tor and a real website that is using reCAPTCHA (I don't think they have their actual scoring system enabled on the demo). GP posted a link to a video of someone solving a single reCAPTCHA prompt for 2.5 minutes. I also regularly have to solve several CAPTCHAs in a row, often with increasing levels of image distortion.

Before being so brazenly dismissive of other people's experiences, take the few measly minutes it would take to actually try it out. Even doing a simple Google search with Tor Browser usually gives you a reCAPTCHA to solve.


> being so brazenly dismissive of other people's experiences, take the few measly minutes it would take to actually try it out

?? I did try it exactly as was requested of me. I didn't know where to find a recaptcha so I went to the demo page.

It's possible the video is the result of a bug and not normal behaviour. I wouldn't know as I don't often browse with Tor.


I've experienced that in the past, without using Tor, but using Firefox. I go to great lengths to block anything Google: Google Fonts, Google CDNs (and I now use Decentraleyes so CDNs are mostly not even reached). I have to explicitly unblock reCAPTCHA scripts for it to even work - or use a clean and disposable browser profile. I've occasionally seen the behavior in the video when I really need to access the page (though usually I just complain to the webmaster).

I once had to request a new password for my online bank account. I ended up asking the bank manager to reset my password, pretending that reCAPTCHA is preventing me from resetting the password myself. My bank is not paying me for solving captchas for Google's benefit (this is so screwed up…).

The biggest offender is CloudFlare for me.


> I didn't know where to find a recaptcha so I went to the demo page.

Well, now you do (though I was a bit of an asshat in my parent comment, sorry about that):

> Even doing a simple Google search with Tor Browser usually gives you a reCAPTCHA to solve.

You can get the Tor Browser from [1].

[1]: https://www.torproject.org/


It sounds like you intentionally are making it very hard to distinguish if you are a bot or a human?


No, I'm intending to make it hard for sketchy third-party ad companies to track me. Some ad networks can't even avoid serving malware, it's not realistic to expect them to competently safeguard my data.

It just so happens that Google's method for telling if I'm a bot or a human isn't just to show me mangled text, but also to use ad-network-style tracking.

In Google's defence this is probably an attempt to deal with captcha-solving-by-humans-as-a-service or something like that. I can't imagine Google employs many privacy enthusiasts who would spot a problem like this by dogfooding.


They have been putting some users through multiple challenges with a forced delay/throttling - so you need to actually answer 4 or 5 challenges with a minimum time per challenge of perhaps 5-8 seconds etc.

These users appear to be Firefox users who sandbox Google cookies. That said, it seems to have improved a lot recently (only one challenge)
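Putting rough numbers on that, using only the figures above (4 or 5 challenges, a floor of 5 to 8 seconds each) and ignoring the time spent actually looking at the images:

```python
def total_delay_range(challenges: tuple[int, int], seconds_each: tuple[int, int]) -> tuple[int, int]:
    """Best-case and worst-case total forced delay, in seconds."""
    low = min(challenges) * min(seconds_each)
    high = max(challenges) * max(seconds_each)
    return low, high


if __name__ == "__main__":
    print(total_delay_range((4, 5), (5, 8)))  # (20, 40)
```

So the throttling alone would account for 20 to 40 seconds, which lines up with the "30 seconds per reCAPTCHA" figure people are disputing elsewhere in the thread.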


In the past, I often spent more than 30 seconds on each reCAPTCHA. It really is a problem. That improved by allowing Google cookies, but now every webpage with reCAPTCHA I visit is being tracked by Google. :(


> I can't imagine it actually taking 30 seconds to solve a reCAPTCHA.

It happens to me quite frequently... but I block a lot of tracking. Sometimes I give up if I was browsing just for fun... but my bank sometimes uses reCAPTCHA.


Forget your password a few times on a site that uses it (LogMeIn and its products). You'll get an endless stream of challenges. I usually just give up and try again later.


I prefer the newer captchas personally.


Google sucks nowadays. So much of what it does just makes technology and the web worse, not better.


tldr: Author claims reCAPTCHAs will get incrementally harder and force users to do strange things like turn on their webcams or open other devices to confirm their identity. Eventually people will pay Google to bypass the system.

All plausible...


The current new captcha is invisible. It's no less evil since Google recommends putting it on all your pages...


Well, it's only invisible if all pages / tabs you visit share the same cookie / session store. With browsers implementing more and more privacy-focused features this becomes less and less frequent.

If I use Safari private mode (each tab has its own cookie store) or Firefox with containers and a cookie auto-delete extension, every tab I open, every page I visit gets a brand new Google session. Obviously, Google treats me like a bot.


Not true. Google has made a new captcha system that only requires you to click a checkbox.


Try using Tor with one of those captchas -- you will get a whole load of puzzles and picture matching goodness (usually having to solve multiple in a row). It can easily take me several minutes to pass a reCAPTCHA challenge when using Tor. And even then, sometimes Google will even refuse to give you a challenge at all!


…which will only work if Google can track you.


Hmm, I don't know. I've been hearing for 10 years that self-driving cars are right around the corner. And we now know we're nowhere close to it.

If this recaptcha helps google, or for that matter, any company, accelerate their self driving capability, I'd fully support it.


Billions of tax dollars and breaks are going to self-driving research, which is fucking insane. America needs to build back its public rail infrastructure. Passenger rail is a solved problem, and the US used to have more passenger rail than Europe. We have one of the best freight rail systems in the world, but we only have high speed rail in New England and Florida. That's it. Maybe California if they can get their shit together.

I wrote an article about this before, and talked about how even if you had a four-lane highway with nothing but self-driving cars, all filled and traveling at 120kph bumper to bumper, you wouldn't even come close to approaching 10% of the capacity of a single-track metro light rail line running with 2 minute headways (during rush hour headways can be less than 2min, and with automated systems like those in Singapore and the new ones coming to London's Underground--DLR is already automated--you can get headways of less than a minute or even 30 seconds).

https://penguindreams.org/blog/self-driving-cars-will-not-so...

Sure self driving cars could help in Europe with the last leg where they have real transport infrastructure, but the US is so far behind that successful self driving cars would just add to grid-lock.


This discussion has been well hashed before, but comes down to the US being very big. People are far more spread out here than anywhere in Europe or Asia. This also makes last-mile transport even more critical if you aren't close to the station. The distances also drive up the cost and waits between trains, further reducing ridership.

High-speed mid/long distance passenger rail just isn't viable given our population densities, and we already have metros/subways/light-rail in core metropolitan areas. Without major construction with hundreds of new lines and a shift in city planning, it's unlikely to ever change in the US.


That is the worst possible argument. Australia has a fraction of our population and every one of their capitals (except for Darwin and Hobart) has a very good rail system. Sure they don't have high speed, but they could easily saturate a Sydney to Melbourne high speed route (if Melbourne didn't waste a few billion on their ticketing system).

Russia has high speed rail and it's less dense than the United States. China created their high speed system in less than a decade.

This argument comes up all the time and it's so poor. The United States used to have more passenger rail than Europe does now! You build light city rail and immediately, new housing and commercial stuff pops up around it. It can potentially reduce drunk driving as well.

The US/density argument is really tired and just doesn't hold up when you really look at it.


We do have passenger rail, it's not fast but you can travel the whole country. We also do have intra-city surface and underground lines.

The issue of why we don't have more and faster rail is a multifactorial problem of city size, zoning, spread, and alternate transportation. The US only has 3 major population centers, and they're very wide. The rest of the population lives in thousands of small and midsize cities spread far apart.

This is the worst combination for expensive railways and stations, and requires solid last-mile coverage. This means cars, and if you have cars then they already provide similar speeds, better coverage, more freedom, and lower costs. The only feasible plan is high-speed long-haul that doesn't stop anywhere in the middle, but demand for that is weak because people do live in the middle, and air-travel is faster and cheaper at those distances.

Lower population with a few big cities like Australia is a much better fit. If the US just had 3-5 major cities all along a single coastline then we would also have a similar railway network.


Correction: Russia doesn't really have high-speed trains, except one tiny line which doesn't count if you consider the size of the country.


It’s quite generous to say that Australian capitals have very good rail systems. They exist, sure. Patronage (especially in Sydney) has been growing at a huge rate (partially due to the traffic nightmare, a side effect of the geography). But it’s a long way from good or optimal.

But yes, if we built high speed rail linking Melbourne - Canberra - Sydney - Brisbane we would have one hell of an east coast system.


We could stop all of the subsidies to people that live in areas unsuitable for transit -- subsidies like access to cheap (well, free) roads, cheap fuel, etc.


How does giving people free anything mean we are not subsidizing them?


My point was that we are subsidizing them by giving them free/cheap single-user transportation options. I clarified my comment.


Your facts are irreverent. High speed rail isn't useful for the NYC-LA travel. However the NYC to Atlanta, GA route is much shorter and would work well for high speed rail. That is just one of many possible routes where good rail would be good.


How can facts be irreverent?

What is special about NYC to ATL? The distance is about 900 miles via road. The land costs alone would reach into the billions, and there are lots of cities in the middle that will require stops, which severely slows down travel time. And how many people are really commuting between both endpoints? And how many would choose a train over a plane?


Those are good questions, I'll give a quick summary of the answer. Others have gone into a lot more thought.

NYC to ATL is special because the distance is short enough that people would take a high speed train over the plane - even if the train is slightly more expensive the overall travel time (including the hour advance arrival at the airport) is similar and the train gives you more space/comfort.

This is a non-stop train. Most of the cities in between would not get a stop! Not giving every tiny town along the way a stop greatly speeds things up and is critical to making it work. Other big cities along the way will get their own non-stop train. Little cities may have a low speed rail connection to the big city, but it cannot be on the same track.

This is not a commuter train. People who live in one city and work in the other should move. People who work in one city, but sometimes have face to face meetings in the other will take the train.

We already see the New York to Boston train having success using this model - the distance is smaller, but the trains are slower. We can use this as a model for how many people would take the train over the plane and why.

You are correct that it would cost billions. That is something that needs to be carefully considered. People have done the math and say it works out, but it is worth checking their assumptions. The construction costs are amortized over many years though, so the costs shouldn't be too bad.


Some thoughts: I think rail adoption becomes a lot more feasible if people can summon self driving cars on demand. The problem with rail is that it's hard for people who are currently driving to get behind because its usefulness is dependent on how built out the network is and building out rail networks is slow and expensive. And while the network has low coverage, getting to and from rail stations is a pain in the ass. But if you can just hail an Uber or Lyft, but the Uber or Lyft is really cheap because it doesn't require a human driver, then getting to and from the rail station becomes a non-issue. And I think that can turn the all or nothing paradigm of rail as it currently exists into something where incremental progress is just as useful whether you're starting from 0% existing coverage or 80% existing coverage.


> But if you can just hail an Uber or Lyft, but the Uber or Lyft is really cheap because it doesn't require a human driver, then getting to and from the rail station becomes a non-issue.

We're a long ways away from self driving cars being really cheap. That self driving Uber network would need a lot of capital investment that they initially were able to avoid because they piggybacked on drivers already having to own cars anyway.


I agree that A) the technology is not there, and B) it would require a lot of capital. I still think it's the most realistic approach to passenger rail utilization in most of the U.S. It will take decades to build out high coverage passenger rail in the U.S. That's a lot of time for technology to mature and for capital to be deployed.


I am certainly on your side when it comes to mass rail transit being a really good solution for inter-city transportation. Especially now in the age where intra-city transit is so easy with uber/lyft.

The problem as it stands is land easements. Acquiring land in the US for larger social benefit at the expense of the land owner is generally A BIG FUCKIN NO NO. This premise has essentially blocked many well planned rail projects from the get go. This just isn't a problem for the promise of self-driving car technology. Furthermore, the opportunity cost argument is really weak, à la "had we not spent so much money trying to invent self-driving cars we could have improved our rail system X times over, etc.", because we have developed so much other technology out of that quest.

I dream of a day where there is a high speed rail system that connects the entire west coast. It would be a GIGANTIC boon for trade and commerce IMO.


Self-driving cars were right around the corner in the 1980s, too. There was an AI conference shortly before the last AI Winter where this was remarked upon. I'm kicking myself for not saving the transcript; self-driving cars, worries of regular people losing their jobs, clueless reporters grossly inflating regular people's expectations, etc.




Solving self driving will go hand in hand with solving flying cars. When cars can fly self driving is trivial and lanes above can be controlled. The streets can go back to bikes and walking and for older cars.


Flying requires an order of magnitude more energy compared to driving. It won't be a feasible method of mass transportation anytime soon.


That's how long it will take for mass adoption of self driving cars.



