I recently had fun implementing [0] some of the concepts/math behind Kokichi Sugihara's ambiguous cylinders. Some of his first illusions [1] came from computer vision research, where the goal was for a computer to understand how a 2D picture might look in 3D. When he fed it some optical illusions, he realised it would be possible to build these objects in real life!
The direct links to the STL files on GitHub let you rotate them around and get some idea of how they actually work [1][2]. This is absolutely mind-blowing!
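If you're curious about the underlying math, here's a minimal sketch (my own reconstruction in Python/NumPy, not code from the repo): both viewers look at the rim from opposite sides at the same angle, so each desired silhouette gives one linear constraint on the rim's height and depth, and you can solve for both.

```python
import numpy as np

# A minimal sketch (my own reconstruction, not the repo's code) of the
# ambiguous-cylinder math: two viewers look at the rim from opposite
# sides, both at angle `theta` below the horizontal. If viewer 1 should
# see silhouette a(x) and viewer 2 should see b(x), the rim curve
# (x, y(x), z(x)) satisfying both orthographic projections is
#   z = (a + b) / (2 cos(theta)),   y = (a - b) / (2 sin(theta)).

theta = np.radians(45)          # assumed viewing angle
t = np.linspace(0, 2 * np.pi, 400, endpoint=False)
x = np.cos(t)
upper = np.sin(t) >= 0          # top/bottom half of each closed curve

# Silhouette 1: a circle of radius 1 (height as a function of x).
a = np.where(upper, np.sqrt(1 - x**2), -np.sqrt(1 - x**2))

# Silhouette 2: a superellipse ("rounded square"), |x|^6 + |y|^6 = 1.
s = (1 - np.abs(x)**6) ** (1 / 6)
b = np.where(upper, s, -s)

# Solve the two projection constraints for the 3D rim curve.
z = (a + b) / (2 * np.cos(theta))   # height of the rim
y = (a - b) / (2 * np.sin(theta))   # front/back displacement

rim = np.stack([x, y, z], axis=1)   # extrude downward to get the cylinder
```

From one side the rim's outline reads as a circle, from the other as a rounded square; the mismatch is hidden in the depth direction that neither viewer can see.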
Sugihara also just recently won the Best Illusion of the Year Contest (again). He has made lots of fun stuff over the years that's worth checking out! [2]
I was lucky to attend "FUN with Algorithms 2018" earlier this year, where he was one of the invited speakers. Seeing the illusions IRL blew my mind and I had to make them myself.
As far as I know, humans perceive the rotating snakes illusion because our eyes quickly jump around (saccades). Stare at one point and it stops rotating. I skimmed the publication, but the word saccade is not in it.
I feel that with this AI hype, we are desperate to find human-like traits in these systems.
I'm rather young (25-30), so I didn't get to know the first "AI winter", but to me there is still potential for improvement in neural networks. Both the hardware (new DSP cores) and the theory around NNs (transfer learning, layer architectures) can be improved, and there are also new business applications that haven't been developed yet because of the impact on people's jobs. We haven't reached the full potential of neural networks, IMHO. Wait until we start connecting hundreds of today's neural networks together and see what happens.
On a longer timescale, we could say the "AI winter" never actually existed, as all those decades were dedicated to developing the hardware and concepts we needed for today's algorithms. It was a long iteration of the loop 'hardware <-> algorithms <-> applications <-> new hardware needs'.
"AI", as I imagine it, is not a product someone can sell. It's not tangible, it's hardly measurable, it's not only one technology, it's more a philosophical concept too me.
Edit: For example, there is right now a link titled "Generating custom photo-realistic faces using AI" on the front page. It's misnamed, as an actual "AI" would probably not let itself be "used". The title would be better as "Generating custom photo-realistic faces using Neural Networks" (or "Deep Learning", or "Machine Learning").
>We didn't reach the full potential of neural networks
We didn't reach the full potential of random forests or rule-based systems either. That's the problem with AI "seasons" and hype: research ends up being driven by things that have little to do with its subject.
I think the reason humans see movement here is essentially CNN-like processing. We don't actually detect movement with our eyes, any more than a still camera does. We just "convolve" a few frames over time.
And if the image shifts and the 16 or so pixels under consideration "look like" something that ought to move, guess what: the movement signal goes beep.
Which is presumably why our optical sensing system goes to such great lengths to keep the image on our eyes stationary.
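To make the "convolve a few frames" idea concrete, here's a toy sketch (my own illustration, nothing to do with the paper) of a correlation-based motion detector that fires when a pattern drifts across a small patch:

```python
import numpy as np

# Toy sketch of the "convolve a few frames over time" idea: a
# Hassenstein-Reichardt-style correlator. It multiplies each pixel with
# its left-hand neighbour one frame earlier (and vice versa); a pattern
# drifting right gives a positive response, drifting left a negative one.
# This illustrates the principle only; it is not a model of the retina.

def motion_signal(frames):
    """frames: array of shape (time, height, width)."""
    prev, curr = frames[:-1], frames[1:]
    rightward = curr[:, :, 1:] * prev[:, :, :-1]   # pixel matches its left neighbour's past
    leftward = curr[:, :, :-1] * prev[:, :, 1:]    # pixel matches its right neighbour's past
    return (rightward - leftward).sum()

# A bright bar drifting to the right, one pixel per frame.
frames = np.zeros((5, 1, 16))
for t in range(5):
    frames[t, 0, 3 + t] = 1.0

print(motion_signal(frames))   # > 0: "the movement signal goes beep"
```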
The AI would have to be processing at a high enough level of abstraction to be fooled the same way. Basic image recognition is... basic.
Check out this adversarial attack on an image classifier. They turn a panda into a gibbon with a specially crafted perturbation that looks like noise at first glance, without noticeably modifying the image for humans.
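If I remember right, that's the panda/gibbon example from the fast gradient sign method (FGSM) paper. Roughly, the "noise" is the sign of the loss gradient with respect to the pixels. A hedged sketch of the idea (placeholder model and tensors, not the paper's exact setup):

```python
import torch
import torch.nn.functional as F
from torchvision import models

# A minimal sketch of the fast gradient sign method (FGSM). The "noise"
# is just the sign of the loss gradient with respect to the input pixels,
# scaled by a tiny epsilon, so the change is invisible to humans but can
# flip the prediction. `image` and `true_label` are placeholders.

model = models.resnet18(pretrained=True).eval()

def fgsm(image, true_label, epsilon=0.007):
    image = image.clone().requires_grad_(True)
    loss = F.cross_entropy(model(image), true_label)
    loss.backward()
    # Step *up* the loss gradient: small per-pixel nudges in the worst direction.
    adversarial = image + epsilon * image.grad.sign()
    return adversarial.clamp(0, 1).detach()

# Usage sketch: image is a (1, 3, 224, 224) tensor, true_label a (1,) tensor.
# adv = fgsm(image, true_label)
# print(model(adv).argmax(dim=1))   # often no longer the true class
```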
Generally people say that the vision system is a "CNN", a convolutional neural net. Now, technically speaking, a DNN is usually something like 5 CNNs in series plus another 50 or so "hidden layers" (or 500 if you're Google, or 10,000 if you're the human brain), so it's somewhat ambiguous. But that's the "usual" meaning. Deep neural nets used to be (20 years ago) anything with more than 1 hidden layer. These days, people say it's anything with 50+ hidden layers, because nobody really uses 1 hidden layer for anything but education.
These network types differ in what they compare to what. CNNs compare signals with their neighbourhood (e.g. the pixels in a 9x9 grid) and ignore the rest; DNNs compare every signal with every other signal. CNNs are much cheaper, especially now with dedicated hardware. Large DNNs cannot be accelerated much, and you cannot really build hardware to do that.
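For a sense of the cost difference, a quick sketch (arbitrary sizes, chosen just to illustrate the point) comparing the parameter counts of a convolutional layer and a fully connected layer, which is what "compare every signal with every other signal" amounts to, over the same 64x64 image:

```python
import torch.nn as nn

# Rough illustration of why convolutional layers are so much cheaper than
# fully connected ("every signal with every other signal") layers on images.
# The sizes below are arbitrary, picked only to make the comparison concrete.

conv = nn.Conv2d(in_channels=1, out_channels=16, kernel_size=9)   # a 9x9 neighbourhood
dense = nn.Linear(in_features=64 * 64, out_features=64 * 64)      # all pixels to all pixels

count = lambda m: sum(p.numel() for p in m.parameters())
print(count(conv))    # 16 * (9*9*1) + 16   =      1,312 weights
print(count(dense))   # 4096 * 4096 + 4096  = 16,781,312 weights
```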
Also, what distinguishes the optical subsystem of animal brains is that the first stages are CNNs (well, it's a biological system, so the functions are spread around: the optical sensors in the eye themselves form a CNN before even sending what they're seeing onward, then there are some 5 layers of CNNs in the retina itself before the signal even reaches the optic nerve, which is itself also a CNN and delivers its output into yet another set of CNNs in your brain).
I suspect most optical illusions are based on perceptual cues / heuristics learned by human brains in early childhood, like object size shrinking with distance, occlusion showing which object is nearer, perspective, or a vanishing point giving structure to a scene composed of straight lines. So I doubt any of these cues will be learned by a deep learning net -- because they're not essential to learning the target objective efficiently.
So no, I suspect AI is unlikely to be fooled by anything other than tricks based on the most obvious visual cues (like perceiving that two humans of greatly different size must be different distances away).
[OK, now I've read the article.]
The article doesn't say what the training objective was for the net. If it was the ability to predict the perceived direction of rotation for a propeller, then it should be trivial to train the net to predict rotation in a specific direction. (Only one of two binary outcomes is needed to declare victory.)
More specifics are needed on the training process (especially the objective(s) and control images) than the OP article provides.
[0]: https://github.com/Matsemann/impossible-objects
[1]: http://www.isc.meiji.ac.jp/~kokichis/anomalousobjects/anomal...