One problem with training a neural network end-to-end this way is that the system is susceptible to unpredictable glitches: The same principle that lets people trick a NN into [thinking a panda is a vulture](https://codewords.recurse.com/issues/five/why-do-neural-netw...) can be triggered accidentally by nothing more exotic than differing lighting/shadow conditions, sun glare, or who knows what else.
One can always train the network with more and more scenarios, but how do you know when to stop? How good is good enough in this regard?
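For anyone curious how those trick images are actually computed, here is a minimal sketch of the fast gradient sign method in PyTorch (the `model`, `epsilon`, and `fgsm_example` names are placeholders for illustration, not any particular system):

```python
import torch
import torch.nn as nn

def fgsm_example(model: nn.Module, x: torch.Tensor, label: torch.Tensor,
                 epsilon: float = 0.01) -> torch.Tensor:
    """Return x plus a small perturbation that pushes the model away from `label`."""
    x = x.clone().detach().requires_grad_(True)
    loss = nn.functional.cross_entropy(model(x), label)
    loss.backward()
    # Step in the direction that increases the loss: each pixel moves by at most
    # epsilon, but the step is exactly aligned with the model's gradient.
    x_adv = x + epsilon * x.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()
```

The perturbation is bounded per pixel, which is why it's imperceptible to us, yet it's aligned with the model's gradient, which is why it can flip the classification.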
For the "thinking a panda is a vulture" problem, don't humans fail in similar ways? The analogous examples for us are camouflage, optical illusions, logical fallacies, etc.
It doesn't really have to be perfect as long as it doesn't fail in common scenarios.
Humans don't appear to fail in the same ways - camouflage and optical illusions are very different to the specific imperceptible-to-humans changes that trick neural networks. Then again, there's no way to test the method on humans because you need to know the neural network weights and that is tricky for people!
In practice it probably doesn't matter anyway - the chance of the exact required perturbation of the input occurring at random is infinitesimal, due to the high dimensionality of the input. And even if it were a problem, there are ways around it.
> For the "thinking a panda is a vulture" problem, don't humans fail in similar ways?
This is a good question. My impression is that humans fail and artificial neural networks fail, but we don't know enough about the brain to say whether they fail in the same ways.
As another poster notes, humans accept human error more than computer error, and I think that's because humans have an internal model of what other humans will do. If I see a car weaving in its lane and going slowly, I have some ideas about what's happening. I don't think that model would extend to a situation where a neural-network-driven car was acting "wonky".
It should never fail since any failure could potentially create a fatal scenario. People usually accept fatalities because of human error but they won't accept death because of algorithmic failure.
I suspect that it won't take long for people to come to terms with it in the same way we now "accept" industrial accidents. "Accept" in this case simply means that the industry in question is allowed to continue doing business.
That's an unattainably high acceptance bar. A more reasonable one would be to have mass adoption of self-driving cars as soon as self-driving cars cause fewer accidents than human drivers.
Not every car crash ends in death. But the AI will learn a lot from each crash. I think mistakes and 'bugs' in the system will get ironed out in low-speed crashes and in high-speed crashes on test circuits...
Have you seen the AI Formula 1 called roborace? Once those cars get good enough to beat Lewis Hamilton or Seb Vettel I'll trust it with me and my family.
Do people accept death due to autopilot error in aeroplanes? It's the same thing. There have been no demands for autopilot to be removed from planes, nor mass refusal to fly. The reason is that most people can see that autopilot is an overall safety gain compared with getting a human to concentrate on the same thing for long periods of time.
> It doesn't really have to be perfect as long as it doesn't fail in common scenarios.
I agree that it doesn't have to be perfect, but the standard should be higher than "doesn't fail in common scenarios." We should also expect graceful handling of many uncommon but plausible scenarios. We expect human drivers to handle more than just common scenarios, and human drivers are pretty bad.
The adversarial examples are so weak that they disappear if you give the CNN even some attention or foveation mechanisms (that is, they work only on a single pass). How much effect are they going to have on a CNN being used at 30FPS+ to do lane following under constantly varying lighting and appearances and position? None.
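To put some rough code behind that claim (only a sketch; `classify_stream`, the jitter magnitude, and the majority vote are my own illustrative assumptions, not how any real stack works):

```python
import torch

def classify_stream(model, frames, n_classes=10):
    """Majority-vote per-frame predictions over a short window.

    `frames` is assumed to be an iterable of [1, C, H, W] image tensors.
    """
    votes = torch.zeros(n_classes, dtype=torch.long)
    for frame in frames:  # e.g. ~30 frames for one second of video
        # Stand-in for the frame-to-frame lighting/position variation described
        # above; a perturbation tuned to one exact frame rarely survives it.
        jitter = 0.02 * torch.randn_like(frame)
        pred = model((frame + jitter).clamp(0, 1)).argmax().item()
        votes[pred] += 1
    return votes.argmax().item()
```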
Are you referring to this foveation paper (http://arxiv.org/abs/1511.06292)? I'm quite skeptical of the claims in that paper; upon closer reading their experiments are problematic. Also, it appears the paper was rejected. I can elaborate if that is indeed the case.
I'm wondering whether adversarial examples can also be found for autoencoders to the same extent. It seems very intuitive that you can overstep the decision boundary that a discriminative network learns by slightly shifting the input in the direction of a different, nearby label.
Yes. And rejection means little. The point is that adversarial examples have to be fragilely constructed to fool on one single example for one forward pass. There is no evidence that any adversarial examples exist which can fool an even slightly more sophisticated CNN, fool a simple CNN over many time-steps, fool a simple CNN for enough time-steps to lead to any noticeable difference in action, fool a simple CNN for enough time-steps to lead to a noticeable difference in action which could lead to an accident, or fool a simple CNN for enough time-steps to lead to a noticeable difference in action which leads to an accident frequently enough to noticeably reduce the safety advantages.
The paper was rejected (you can read the ICLR comments) because the experiments did not really support their point. And I agree. The gist of the experiments they ran to support their thesis was to take a CNN and construct adversarial examples that successfully fooled it. They then applied foveation and showed that the CNN was no longer fooled. Which is obvious: adding preprocessing that the attacker is unaware of will naturally beat the attacker. What they didn't do is regenerate the adversarial examples assuming the attacker has knowledge that the target was using foveation.
There are no experiments that support your statements, unfortunately.
The examples where this happens have always seemed fairly weak to me. How many of the grave errors, not just where it's the wrong type of animal or container but actually thinking it's radically different, survive an application of Gaussian blur? Furthermore self-driving cars are a combination of signals; you are going to need to simultaneously fool both LIDAR and cameras.
On top of that you are going to need to fool them over multiple frames, while the sensors get a different angle on the subject as the car moves. For example, in the first Deep Q-learning paper, "Playing Atari with Deep Reinforcement Learning"[0], they use four frames in sequence. That was at the end of 2013.
I don't think anyone will be able to come up with a serious example that fools multiple sensors over multiple frames as the sensors are moving. Even if they do, inducing an unnecessary emergency-stopping situation is still not the same as getting the car to drive into a group of people. Even if fooled in some circumstances, the cars will still be safer than most human drivers and will still have a massive utilitarian moral case in relation to human deaths, on top of the economic case, to be used.
The fooling of networks is still an interesting thing, but it's been overplayed to my mind and is not particularly more interesting than someone being fooled for a split second into thinking a hat stand with a coat and hat on it is a person when they first see it out of the corner of their eye.
1. Gaussian blur is just a spatial convolution (recall from signal processing). If a network is susceptible to adversarial examples, it will still be susceptible after a Gaussian blur, assuming the adversary knows you're applying one. If the adversary doesn't know, that's just security by obscurity, and they'll find out eventually. (See the sketch after this list for what knowing about the blur buys the attacker.)
2. A sequence of frames does not solve the issue because you can have a sequence of adversarial examples (although it would make the physical process of projecting onto the camera somewhat harder, the hard part is already the original problem of projecting an image onto a camera at all).
3. Using something conventional like LIDAR as a backup is the right approach IMO, and I totally agree with you there. But Tesla and lots of other companies aren't doing that because it's too expensive.
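On point 1, here's a sketch of what "the adversary knows about the blur" means in practice; `attack_through_blur`, the kernel size, and epsilon are all illustrative assumptions. Since the blur is literally a convolution, the attacker simply composes it with the model and differentiates through the whole thing:

```python
import torch
import torch.nn.functional as F

def gaussian_kernel(size=5, sigma=1.0):
    """Build a Gaussian blur kernel; the blur really is just a spatial convolution."""
    coords = torch.arange(size, dtype=torch.float32) - (size - 1) / 2
    g = torch.exp(-coords ** 2 / (2 * sigma ** 2))
    k = torch.outer(g, g)
    k = k / k.sum()
    return k.view(1, 1, size, size).repeat(3, 1, 1, 1)  # one kernel per RGB channel

def attack_through_blur(model, x, label, epsilon=0.01):
    """FGSM-style attack that treats blur + model as one differentiable function."""
    kernel = gaussian_kernel()
    x = x.clone().detach().requires_grad_(True)
    blurred = F.conv2d(x, kernel, padding=2, groups=3)  # the "defense"
    loss = F.cross_entropy(model(blurred), label)
    loss.backward()  # the gradient flows straight through the blur
    return (x + epsilon * x.grad.sign()).clamp(0, 1).detach()
```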
1. If that's the case perhaps another kind of blurring? "Intriguing properties of neural networks" (https://arxiv.org/pdf/1312.6199.pdf page 6) has examples where you get radically different classifications that I don't think would occur naturally or survive a blur with some random element, let alone two moving cameras and a sequence of images. As the title says it's an intriguing property, not necessarily a huge problem.
2. I honestly can't think of a situation where this could occur. It's the equivalent of kids shining lasers into the eyes of airline pilots, but the kids need a PhD in deep learning and specialised equipment to be able to do it. A hacker doing some update to the software via a network sounds much more plausible than attacking the system through its vision while it's traveling.
3. This is the real point in the end I guess, this Google presentation (https://www.youtube.com/watch?v=tiwVMrTLUWg) shows that the first autonomous cars to be sold will be very sophisticated with multiple systems and a lot of traditional software engineering. Hopefully LIDAR costs will come down.
1. Those are examples for a network that does not use blurring. You have to be careful because, remember, the adversary can tailor their examples to whatever preprocessing you use. So the adversarial examples for a network with blurring would look completely different, but they would still exist. Randomness could just force the adversary to use a distribution over examples, and it could mean they are still able to fool you half the time instead of all the time (a rough sketch of this follows after point 2). However, I wouldn't trust my intuition here: that is really a question for the machine learning theory researchers (whether there is some random scheme that is provably resilient or whether they're all provably vulnerable, or proving some error bounds on resilience, etc.).
2. The problem of projecting an image onto a car's camera already implies you'd be able to do it for a few seconds.
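And to illustrate point 1's "distribution over examples" idea against a randomized defense (again just a sketch in the spirit of expectation-over-transformations; the noise model, sample count, and function name are my assumptions):

```python
import torch
import torch.nn.functional as F

def attack_random_defense(model, x, label, epsilon=0.02, samples=32):
    """Average the loss over random draws of the defense, then take one FGSM step."""
    x = x.clone().detach().requires_grad_(True)
    total_loss = 0.0
    for _ in range(samples):
        # Each draw stands in for one realization of the randomized preprocessing.
        noisy = (x + 0.05 * torch.randn_like(x)).clamp(0, 1)
        total_loss = total_loss + F.cross_entropy(model(noisy), label)
    (total_loss / samples).backward()
    # The averaged gradient yields a perturbation that works for most draws,
    # not just one; i.e. it may fool you half the time instead of all the time.
    return (x + epsilon * x.grad.sign()).clamp(0, 1).detach()
```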
"It unlikely to happen" is not a good strategy to rely on with systems operating at scale. There are about a billion cars on earth traveling trillions of miles every year, many of which will eventually be self-driving. At that scale, you don't need a malicious actor working to fool these systems, you just need to encounter the wrong environment. And even if the system is perfect on the day it's released, that doesn't mean that it will remain so indefinitely (even with proper maintenance).
Studying induced failure in neural networks may help us understand the failure modes and mechanisms of these systems.
I haven't seen a paper that shows this "tricking" can be used as a real world attack or happen randomly. Just because you can compute an input that has this unusual behavior doesn't mean there is a demonstrably nonzero probability of it happening.
Wait, that's exactly what it means. Since the networks are not "continuous", you can't reason about how the system will behave in actual real-world conditions, because random fluctuations can cause the whole thing to malfunction. I put continuous in quotes because it's not the real definition of continuous as in real analysis, but a good enough analogy: small variations in input should not lead to wildly different outputs.
This is why any model that lacks explanatory power can't be used in mission- and safety-critical systems. If it can't reason about things the way people can, then the system overall can't really be trusted. It's one thing when a translation from English to Spanish is wrong; it's another thing entirely when the control software of a self-driving car decides to accelerate instead of brake and the root cause analysis is people throwing their hands up and saying neural networks are inherently susceptible to these kinds of problems.
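To make the "continuity" point concrete, one can compare how far the output moves under a random perturbation versus a gradient-aligned one of the same size; this is only a sketch, with `model`, `epsilon`, and `sensitivity` as illustrative stand-ins:

```python
import torch
import torch.nn.functional as F

def sensitivity(model, x, label, epsilon=0.01):
    """Compare output change for a random vs. an adversarial step of equal size."""
    x = x.clone().detach().requires_grad_(True)
    base = F.softmax(model(x), dim=-1)
    F.cross_entropy(model(x), label).backward()

    random_dir = torch.sign(torch.randn_like(x))  # same max-norm, random direction
    adv_dir = x.grad.sign()                       # same max-norm, gradient-aligned

    with torch.no_grad():
        p_random = F.softmax(model((x + epsilon * random_dir).clamp(0, 1)), dim=-1)
        p_adv = F.softmax(model((x + epsilon * adv_dir).clamp(0, 1)), dim=-1)

    # The random step barely moves the output distribution; the gradient-aligned
    # step of the same size can flip the prediction outright.
    return (p_random - base).abs().max().item(), (p_adv - base).abs().max().item()
```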
To be fair, you should be more precise. The attacks are specifically calculated. The combinatorial space of possible inputs is so massive that I'm sure it is extremely unlikely for a malicious input to occur randomly.
I don't think it has to do with the combinatorics of the input space. Adversarial inputs are hard to generate only until someone figures out how to point a set of laser pointers at exactly the right spots on a truck on a highway to get it to swerve out of control.
This is not true at all. Adversarial input can indeed be probable input depending on your definitions and I haven't seen anything yet that describes the probability distributions of inputs. Everyone takes a bunch of training examples and extrapolates from there.
In theory, yes. In practice, we've built our roads, signals, and car interiors, tailored to our specific minds and concepts.
So when we see someone who has just started driving perform well under some circumstances, we can expect good performance under circumstances that the human mind judges as similar. The problem that the "fooling neural networks" experiments show is that two things that are similar for humans can be wildly different for a NN that's been trained to recognize them.
You use a test set of scenarios on which you don't train but only measure effectiveness. When accuracy on the test set exceeds your chosen threshold, that is good enough.
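In code, that acceptance test is nothing fancier than the following (the threshold value, `good_enough`, and the data loader are placeholders; choosing the threshold is of course the hard part):

```python
import torch

def good_enough(model, test_loader, threshold=0.999):
    """Accuracy on held-out scenarios, compared against a chosen acceptance threshold."""
    correct, total = 0, 0
    model.eval()
    with torch.no_grad():
        for x, y in test_loader:  # scenarios never seen during training
            correct += (model(x).argmax(dim=-1) == y).sum().item()
            total += y.numel()
    accuracy = correct / total
    return accuracy >= threshold, accuracy
```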
What is the accuracy of the human brain in recognizing traffic situations? It is probably not that hard to get a NN to do better, even if it still periodically causes an accident. This is the uncanny valley effect for self-driving cars: it's not enough to be better at driving than average humans, which I think they already are; they have to be perfect at driving for people to trust them.
The constant question for self driving cars is "How will we know when they are good enough?"
Is there any reason they couldn't just put a driving test examiner in the car and test it like you would a human? Just ask the thing to drive around town, emergency stop, park, navigate a roundabout etc.
Yeah, the driving test assumes you have human level cognitive function and can apply the demonstrated skills in a much much wider variety of situations than those that occur during the test.