I recently had fun implementing [0] some of the concepts/math behind Kokichi Sugihara's ambiguous cylinders. Some of his first illusions [1] came from computer vision research, where the goal was for a computer to understand how a 2D picture might look in 3D. When he fed it some optical illusions, he realised it would be possible to build these objects in real life!
The direct links to the STL files on GitHub let you rotate them around and get some idea of how they actually work [1][2]. This is absolutely mind-blowing!
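If you're curious about the underlying math, here's a minimal sketch (my own reconstruction in Python/NumPy, not code from the repo): both viewers look at the rim from opposite sides at the same angle, so each desired silhouette gives one linear constraint on the rim's height and depth, and you can solve for both.

```python
import numpy as np

# A minimal sketch (my own reconstruction, not the repo's code) of the
# ambiguous-cylinder math: two viewers look at the rim from opposite
# sides, both at angle `theta` below the horizontal. If viewer 1 should
# see silhouette a(x) and viewer 2 should see b(x), the rim curve
# (x, y(x), z(x)) satisfying both orthographic projections is
#   z = (a + b) / (2 cos(theta)),   y = (a - b) / (2 sin(theta)).

theta = np.radians(45)          # assumed viewing angle
t = np.linspace(0, 2 * np.pi, 400, endpoint=False)
x = np.cos(t)
upper = np.sin(t) >= 0          # top/bottom half of each closed curve

# Silhouette 1: a circle of radius 1 (height as a function of x).
a = np.where(upper, np.sqrt(1 - x**2), -np.sqrt(1 - x**2))

# Silhouette 2: a superellipse ("rounded square"), |x|^6 + |y|^6 = 1.
s = (1 - np.abs(x)**6) ** (1 / 6)
b = np.where(upper, s, -s)

# Solve the two projection constraints for the 3D rim curve.
z = (a + b) / (2 * np.cos(theta))   # height of the rim
y = (a - b) / (2 * np.sin(theta))   # front/back displacement

rim = np.stack([x, y, z], axis=1)   # extrude downward to get the cylinder
```

From one side the rim's outline reads as a circle, from the other as a rounded square; the mismatch is hidden in the depth direction that neither viewer can see.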
Sugihara also just recently won the Best Illusion of the Year Contest (again). He has made lots of fun stuff over the years that's worth checking out! [2]
I was lucky to attend "FUN with Algorithms 2018" earlier this year, where he was one of the invited speakers. Seeing the illusions IRL blew my mind and I had to make them myself.
As far as I know, humans perceive the rotating snakes illusion because our eyes quickly jump around (saccades). Stare at one point and it stops rotating. I skimmed the publication, but the word saccade is not in it.
I feel that with this AI hype, we are desperate to find human-like traits in these systems.
I'm rather young (25-30), so I didn't get to know the first "AI winter", but to me there is still potential for improvement in neural networks. Both the hardware (new DSP cores) and the theory around NNs (transfer learning, layer architectures) can be improved, and there are also new business applications that haven't been developed yet because of the impact on people's jobs. We haven't reached the full potential of neural networks, IMHO. Wait until we start connecting hundreds of today's neural networks together and see what happens.
On a longer timescale, we could say the "AI winter" never actually existed, as all those decades were dedicated to developing the hardware and concepts we needed for today's algorithms. It was a long iteration of the loop 'hardware <-> algorithms <-> applications <-> new hardware needs'.
"AI", as I imagine it, is not a product someone can sell. It's not tangible, it's hardly measurable, it's not only one technology, it's more a philosophical concept too me.
Edit: For example, there is right now a link titled "Generating custom photo-realistic faces using AI" on the front page. It's misnamed, as an actual "AI" would probably not let itself be "used". The title would be better as "Generating custom photo-realistic faces using Neural Networks" (or "Deep Learning", or "Machine Learning").
>We didn't reach the full potential of neural networks
We didn't reach the full potential of random forests or rule-based systems either. That's the problem with AI "seasons" and hype: research ends up being driven by things that have little to do with its subject.
I think the reason humans see movement here is essentially CNN-like processing. We don't actually detect movement with our eyes, any more than a still camera does. We just "convolve" a few frames over time.
And if the image shifts and the 16 or so pixels under consideration "look like" something that ought to move, guess what: the movement signal goes beep.
Which is presumably why our optical sensing system goes to such great lengths to keep the image on our eyes stationary.
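To make the "convolve a few frames" idea concrete, here's a toy sketch (my own illustration, nothing to do with the paper) of a correlation-based motion detector that fires when a pattern drifts across a small patch:

```python
import numpy as np

# Toy sketch of the "convolve a few frames over time" idea: a
# Hassenstein-Reichardt-style correlator. It multiplies each pixel with
# its left-hand neighbour one frame earlier (and vice versa); a pattern
# drifting right gives a positive response, drifting left a negative one.
# This illustrates the principle only; it is not a model of the retina.

def motion_signal(frames):
    """frames: array of shape (time, height, width)."""
    prev, curr = frames[:-1], frames[1:]
    rightward = curr[:, :, 1:] * prev[:, :, :-1]   # pixel matches its left neighbour's past
    leftward = curr[:, :, :-1] * prev[:, :, 1:]    # pixel matches its right neighbour's past
    return (rightward - leftward).sum()

# A bright bar drifting to the right, one pixel per frame.
frames = np.zeros((5, 1, 16))
for t in range(5):
    frames[t, 0, 3 + t] = 1.0

print(motion_signal(frames))   # > 0: "the movement signal goes beep"
```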
The AI would have to be processing at a high enough level of abstraction to be fooled the same way. Basic image recognition is... basic.
Check out this adversarial attack on an image classifier. They turn a panda into a gibbon with a specially crafted perturbation that looks like noise at first glance, without noticeably modifying the image for humans.
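If I remember right, that's the panda/gibbon example from the fast gradient sign method (FGSM) paper. Roughly, the "noise" is the sign of the loss gradient with respect to the pixels. A hedged sketch of the idea (placeholder model and tensors, not the paper's exact setup):

```python
import torch
import torch.nn.functional as F
from torchvision import models

# A minimal sketch of the fast gradient sign method (FGSM). The "noise"
# is just the sign of the loss gradient with respect to the input pixels,
# scaled by a tiny epsilon, so the change is invisible to humans but can
# flip the prediction. `image` and `true_label` are placeholders.

model = models.resnet18(pretrained=True).eval()

def fgsm(image, true_label, epsilon=0.007):
    image = image.clone().requires_grad_(True)
    loss = F.cross_entropy(model(image), true_label)
    loss.backward()
    # Step *up* the loss gradient: small per-pixel nudges in the worst direction.
    adversarial = image + epsilon * image.grad.sign()
    return adversarial.clamp(0, 1).detach()

# Usage sketch: image is a (1, 3, 224, 224) tensor, true_label a (1,) tensor.
# adv = fgsm(image, true_label)
# print(model(adv).argmax(dim=1))   # often no longer the true class
```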
Generally people say that the vision system is a "CNN", a convolutional neural net. Now, technically speaking, a DNN is usually something like 5 CNNs in series plus another 50 or so "hidden layers" (or 500 if you're Google, or 10,000 if you're the human brain), so it's somewhat ambiguous. But that's the "usual" meaning. Deep neural nets used to be (20 years ago) anything with more than 1 hidden layer. These days, people say it's anything with 50+ hidden layers, because nobody really uses 1 hidden layer for anything but education.
These network types differ in what they compare to what. CNNs compare signals with their neighbourhood (e.g. the pixels in a 9x9 grid) and ignore the rest; DNNs compare every signal with every other signal. CNNs are much cheaper, especially now with dedicated hardware. Large DNNs cannot be accelerated much, and you cannot really build hardware to do that.
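For a sense of the cost difference, a quick sketch (arbitrary sizes, chosen just to illustrate the point) comparing the parameter counts of a convolutional layer and a fully connected layer, which is what "compare every signal with every other signal" amounts to, over the same 64x64 image:

```python
import torch.nn as nn

# Rough illustration of why convolutional layers are so much cheaper than
# fully connected ("every signal with every other signal") layers on images.
# The sizes below are arbitrary, picked only to make the comparison concrete.

conv = nn.Conv2d(in_channels=1, out_channels=16, kernel_size=9)   # a 9x9 neighbourhood
dense = nn.Linear(in_features=64 * 64, out_features=64 * 64)      # all pixels to all pixels

count = lambda m: sum(p.numel() for p in m.parameters())
print(count(conv))    # 16 * (9*9*1) + 16   =      1,312 weights
print(count(dense))   # 4096 * 4096 + 4096  = 16,781,312 weights
```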
Also, what distinguishes the optical subsystem of animal brains is that the first stages are CNNs (well, it's a biological system, so the functions are spread around: the optical sensors in the eye themselves form a CNN before even sending what they're seeing onward, then there are some 5 layers of CNNs in the retina itself before the signal even reaches the optic nerve, which is itself also a CNN and delivers its output into yet another set of CNNs in your brain).
I suspect most optical illusions are based on perceptual cues / heuristics learned by human brains in early childhood, like object size shrinking with distance, occlusion showing which object is nearer, perspective, or a vanishing point giving structure to a scene composed of straight lines. So I doubt any of these cues will be learned by a deep learning net -- because they're not essential to learning the target objective efficiently.
So no, I suspect AI is unlikely to be fooled by anything other than tricks based on the most obvious visual cues (like perceiving that two humans of greatly different size must be different distances away).
[OK, now I've read the article.]
The article doesn't say what the training objective was for the net. If it was the ability to predict the perceived direction of rotation for a propeller, then it should be trivial to train the net to predict rotation in a specific direction. (Only one of two binary outcomes is needed to declare victory.)
More specifics are needed on the training process (especially the objective(s) and control images) than the OP article provides.
[0]: https://github.com/Matsemann/impossible-objects
[1]: http://www.isc.meiji.ac.jp/~kokichis/anomalousobjects/anomal...