For the car captchas, I've found actually clicking all the boxes with part of a ...

bildung · on Jan 23, 2018

It was the same when this was about words from old books. I always had to fill in the letters the average person would have thought they were, not what they actually were (e.g. the letter "f" for what was really an "s" in gothic type).

Nowadays it's much easier: you can click anything that looks vaguely the same (e.g. boxy things for cars, ads for traffic signs, traffic signs for store fronts etc.). The fact that it's so easy to poison the training set makes me very wary about the autonomous car future...

MisterTea · on Jan 23, 2018

I actually like poisoning them. Not to be malicious but I feel manipulated into training their software for free. "Oh you wanted to sign up for that web forum? Sorry, but you have to do some free work for us first"

And if you think that it's somehow good because it's mutually beneficial to train AI to better the future of humanity, don't. That is what their marketing department wants you to think.

dazmax · on Jan 23, 2018

The value they're providing is to the forum owner, who has reduced spam-handling workload.

So the forum is providing you value, you are providing Google value, and Google is providing the forum value.

kuschku · on Jan 23, 2018

Sure, but if the public has to provide Google with free data, we should make laws requiring Google to open source the entire ReCaptcha training set.

operon · on Jan 23, 2018

kuschku · on Jan 23, 2018

Anything created with free work should be free in return. Anything created by the public should be available for the public.

sangnoir · on Jan 23, 2018

To use an in-thread example - online forums are created with free work. Should all forums be forced to make their archives available for free download as well?

kuschku · on Jan 25, 2018

For free download? No. Should I be able to scrape them? Yes.

stolsvik · on Jan 24, 2018

Trundle · on Jan 23, 2018

It's not free though. You get access to the forum.

If it was free then you wouldn't be doing them!

snupples · on Jan 23, 2018

Or it could be both what their marketing department wants you to think, and also reasonable.

wott · on Jan 23, 2018

I have more problems with bridges than with cars. The damn thing forces you to select 3 bridges, except that there are only 2... So you are forced to select what it thinks is a bridge, and confirm its erroneous bias even more.

Chaebixi · on Jan 23, 2018

Storefronts are difficult too, I don't think I'm ever good enough at those to satisfy it. The most reasonable one seems to be street signs, but I think it fails me for not flagging the unpainted back of one.

zhte415 · on Jan 24, 2018

Sounds like poor question asking. "Click any square with a bridge" would be better wording.

robryan · on Jan 23, 2018

I had to do 20 or more of these to get some post tracking data recently. Found the same thing with most objects where a very small part of an object hadn’t been classified as containing that object.

0xTJ · on Jan 23, 2018

Does it tell you that you're wrong, or simply give you another set? If it is just giving you another set, it may be that it thinks that you can provide more useful data.

csallen · on Jan 23, 2018

It just gives you another set, but it's so frustrating and annoying as a user that I can't believe they're doing this on purpose.

Chaebixi · on Jan 23, 2018

If you fail, it tells you you were wrong (some red text at the bottom) and gives you another challenge. I think it will sometimes just give you another challenge for more data, but it won't have red failure text.

Chaebixi · on Jan 23, 2018

> This creates a twisted Turing test situation where, to prove you are a human, you have to pretend to be a machine's idea of what a human is.

Exactly. I think Recaptcha was better when it was looking for consistency with other human answers. Using "AI" has the same problem you mentioned, plus it's more vulnerable because it rests on the assumption that your "AI" is unapproachably far ahead of competitors.
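
To make the "consistency with other human answers" idea concrete, here is a rough sketch. It is not how Recaptcha actually works; the function names, the majority threshold, and the allowed number of disagreements are all assumptions made up for illustration.

```python
from collections import Counter

def majority_tiles(previous_answers, min_share=0.5):
    """Return the tiles picked by more than min_share of earlier respondents.

    previous_answers is a list of sets of tile ids (hypothetical format).
    """
    if not previous_answers:
        return set()
    counts = Counter(tile for answer in previous_answers for tile in set(answer))
    total = len(previous_answers)
    return {tile for tile, n in counts.items() if n / total > min_share}

def passes_consensus(answer, previous_answers, max_disagreements=1):
    """Accept an answer if it differs from the human consensus on few tiles."""
    consensus = majority_tiles(previous_answers)
    # Symmetric difference: tiles chosen by the answer or the consensus, but not both.
    disagreements = len(set(answer) ^ consensus)
    return disagreements <= max_disagreements

previous = [{1, 2, 3}, {1, 2}, {1, 2, 4}]
print(passes_consensus({1, 2, 3}, previous))  # one extra tile is tolerated
print(passes_consensus({3, 4}, previous))     # disagrees on four tiles
```

With a scheme like this the reference answer comes from other people rather than from a classifier, so an extra sliver tile costs you one disagreement instead of an outright failure.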

nerdponx · on Jan 23, 2018

Street signs are a problem as well. Is the post part of the sign or not?

aswanson · on Jan 23, 2018

Twisted Turing test. There's a novel waiting to be written about this.

thom · on Jan 23, 2018

I went through exactly this today. Wasn't sure if it was the computer being dumb, or other people missing corners of signs or bridges etc. Still, takes me 5 goes every time.

pbhjpbhj · on Jan 23, 2018

The problem with the cars for me is their definition -- is that van classed as a car, how about the half-back, what about a 4x4, what about a 4x4 with no back windows, ...

indigochill · on Jan 23, 2018

What about a square that contains just a sliver of the car from the next box over? Does that still count? I hypothesize that my attempt to classify every pixel related to a car as a "car" may contribute to my failing of these tests.

toyg · on Jan 23, 2018

This is what kurtisc is saying. The algorithm is unable to classify those pixels, so you have to guess which bits of the car can be detected by the algorithm and then select only those. So you have machines asking humans to think like a machine in order to prove they are human.
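
A small sketch of why the sliver question matters, assuming (hypothetically) that the expected answer is derived from a detected bounding box laid over the tile grid. The grid size, tile size, and coverage threshold below are invented for illustration and are not taken from any real captcha system.

```python
def tile_coverage(box, tile):
    """Fraction of the tile's area covered by the box; both are (x0, y0, x1, y1)."""
    overlap_w = min(box[2], tile[2]) - max(box[0], tile[0])
    overlap_h = min(box[3], tile[3]) - max(box[1], tile[1])
    if overlap_w <= 0 or overlap_h <= 0:
        return 0.0
    tile_area = (tile[2] - tile[0]) * (tile[3] - tile[1])
    return (overlap_w * overlap_h) / tile_area

def selected_tiles(box, grid=3, tile_size=100, min_coverage=0.0):
    """Return (row, col) tiles whose covered fraction exceeds min_coverage."""
    chosen = set()
    for row in range(grid):
        for col in range(grid):
            tile = (col * tile_size, row * tile_size,
                    (col + 1) * tile_size, (row + 1) * tile_size)
            if tile_coverage(box, tile) > min_coverage:
                chosen.add((row, col))
    return chosen

car = (50, 120, 210, 180)  # made-up bounding box for a car
every_pixel = selected_tiles(car)                    # "any sliver counts"
thresholded = selected_tiles(car, min_coverage=0.1)  # slivers ignored
print(sorted(every_pixel - thresholded))             # tiles the two rules disagree on
```

If the checker uses the thresholded rule while you click every tile with any car pixels, the two answer sets differ by exactly the sliver tile, which would be consistent with the failures described above.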

js8 · on Jan 23, 2018

That was my experience with car and sign captchas as well.

But interestingly, it also depends on my mood: when I feel lazy, I click fewer boxes.

zer0t3ch · on Jan 23, 2018

Isn't it supposed to learn from you? I.e., giving an answer that is technically correct but that it thinks is incorrect is slightly annoying for you, but better for the system in the long run (i.e., better for everyone).

moreless · on Jan 23, 2018

Everyone? Or Google? I don't feel particularly happy about being used as a lab rat, so no, thank you - I don't care about the quality of Google's AI. If anything, I would purposefully mislead it if I knew how.

amrrs · on Jan 23, 2018

Don't you think that's how a lot of training data can be generated efficiently for future ML/AI breakthroughs?

kbart · on Jan 23, 2018

So we will get even more "targeted ads" in our faces? No, thanks. I think ML/AI has a great potential (especially in medicine), but I just don't trust ad companies to use it for any good cause.

JshWright · on Jan 23, 2018

Has Google published this training set somewhere? Until they do, you're absolutely right that this is a great way to build a training set, but I don't see how it's to anyone's benefit but Google's.

dboreham · on Jan 23, 2018

I find the same thing with the road signs: other people are giving (imho) the wrong answer by not including squares that cover a small, but non-zero, part of the sign.