Another possible way to do this is to present the Turker with two vertical pieces a...

aantix · on Nov 27, 2011

Slightly off-topic, but I am the author of Turkee, the Rails/Mechanical Turk integration gem ( http://www.github.com/aantix/turkee ).

The lower the cost of the HIT (and the easier the question) the more you want to make sure you're screening for bots.

Make sure you have a gold-standard question. For every HIT, ask two questions: the legitimate question that you need answered and, on top of that, a simple "gold-standard" question that tells you whether the answer was submitted by a bot.

For the gold standard, I like to choose two numbers, and store the sum of their values. Along with your HIT question, be sure to ask the Turker the value of their sum.

E.g., for a survey you ask the Turker "What is your favorite color?" and, along with that, you ask them "What is four plus fve?"

Be sure to misspell one value (e.g. remove a letter). If the Turker doesn't respond correctly to the "gold standard" question, programmatically reject their response.
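A minimal sketch of the gold-standard check described above, in Python rather than through Turkee. The helper names (`misspell`, `make_gold_question`, `passes_gold`) are hypothetical and not part of any Mechanical Turk library; the sketch assumes single-digit operands and that the Turker types the answer as digits.

```python
import random

NUMBER_WORDS = ["zero", "one", "two", "three", "four",
                "five", "six", "seven", "eight", "nine"]


def misspell(word, rng):
    """Drop one interior letter so the word no longer matches a dictionary entry."""
    if len(word) < 3:
        return word
    i = rng.randrange(1, len(word) - 1)  # never drop the first or last letter
    return word[:i] + word[i + 1:]


def make_gold_question(rng):
    """Return (question_text, expected_sum); store the sum alongside the HIT."""
    a = rng.randrange(1, 10)
    b = rng.randrange(1, 10)
    words = [NUMBER_WORDS[a], NUMBER_WORDS[b]]
    k = rng.randrange(2)  # misspell exactly one of the two operands
    words[k] = misspell(words[k], rng)
    return f"What is {words[0]} plus {words[1]}?", a + b


def passes_gold(answer, expected):
    """True if the submitted answer parses as the stored sum; otherwise reject."""
    try:
        return int(answer.strip()) == expected
    except ValueError:
        return False
```

A response would then be approved only when `passes_gold(submitted, stored_sum)` is true. Whether answers written as words ("nine") should also be accepted is a design choice this sketch leaves out.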

pama · on Nov 27, 2011

Not sure if this matters for a "gold standard" question, but Google answers it correctly:

http://www.google.com/search?q=What+is+four+plus+fve%3F

jethroalias97 · on Nov 27, 2011

But then I think you are giving up the main advantage of using Mechanical Turk over just roughly matching pixels. Pixel matching may well get the wrong answer where a human, with the context of being human, will know what the image is supposed to be and get the right answer, captcha style.

Example: an iron fence with a scenic view behind and where the slices are too small for someone to match only two.

jarin · on Nov 27, 2011

Speaking of CAPTCHAs, a smaller version of this would make an excellent one.

Er, well maybe not, given the ease of programmatic solutions.

thesis · on Nov 27, 2011

This would be beaten with a very high success rate if it were a captcha.

jarin · on Nov 27, 2011

What about doing the same thing, but fuzzing a few pixels from each edge to avoid pixel distance matching? You could even make the edges just plain black for a few pixels from each edge. The human eye would be able to complete the image (it would just look like it has black bars), but I'm guessing it would be enough to prevent a simple algorithmic solution.

Edit: This is essentially the same idea as the iron fence posted above.

notatoad · on Nov 27, 2011

At that point you could probably just do some simple edge similarity tests instead of sending it off to the Turks.
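For illustration, here is one rough sketch of what a "simple edge similarity test" could look like. It is an assumption about the approach, not something specified in this thread: strips are plain lists of rows of grayscale values, similarity is the sum of squared differences between touching columns, and the hypothetical `inset` parameter skips a few blacked-out or fuzzed columns at each edge.

```python
def edge_distance(left_strip, right_strip, inset=0):
    """Sum of squared differences between the right edge of left_strip
    and the left edge of right_strip.

    Each strip is a list of rows; each row is a list of grayscale ints.
    `inset` skips that many columns in from each edge before comparing.
    """
    return sum(
        (lrow[-1 - inset] - rrow[inset]) ** 2
        for lrow, rrow in zip(left_strip, right_strip)
    )


def best_right_neighbor(strip, candidates, inset=0):
    """Pick the candidate whose left edge is closest to strip's right edge."""
    return min(candidates, key=lambda c: edge_distance(strip, c, inset))
```

A lower distance means a more plausible join. On images like the iron fence example, where many edges look alike, a measure this simple can be expected to pick wrong neighbors.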

on Nov 27, 2011

[deleted]

mikeknoop · on Nov 27, 2011

You're right, you'd have to get creative with some of the sorting algorithms, but something like merge sort would be perfectly suited.

jakobe · on Nov 27, 2011

You can't use merge sort or any other standard sorting algorithm in this case. All these algorithms assume you can compare any two elements. To use a sorting algorithm, you would need to be able to say for any two images which is further left.
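One way to read this objection is that the only question that can be answered reliably is "which piece sits immediately to the right of this one?", which is an adjacency relation rather than a total order. Under that assumption, the pieces can be ordered by following neighbor links instead of sorting. The sketch below is hypothetical: `right_of` stands for answers that have already been collected (from Turkers or from an edge test), and `order_from_neighbors` is a made-up name.

```python
def order_from_neighbors(right_of):
    """Rebuild left-to-right order from adjacency answers.

    right_of maps a piece id to the id of the piece immediately to its
    right; the right-most piece has no entry.
    """
    all_ids = set(right_of) | set(right_of.values())
    # The left-most piece is the only one that is nobody's right neighbor.
    starts = all_ids - set(right_of.values())
    if len(starts) != 1:
        raise ValueError("adjacency answers do not form a single chain")
    current = starts.pop()
    order = [current]
    while current in right_of:
        current = right_of[current]
        order.append(current)
    if len(order) != len(all_ids):
        raise ValueError("some pieces are not reachable from the left-most one")
    return order
```

For example, `order_from_neighbors({"c": "a", "a": "d", "d": "b"})` walks the chain starting from `"c"`. No pairwise "further left" comparison is ever needed, only one neighbor lookup per piece.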