We're approaching a world where AI will be more and more relied upon in dangerous situations. Imagine someone getting killed for something ridiculous like inadvertently holding an adversarial example. Public trust would have a hard time recovering.
No, ironically because classifiers are trained to be robust to noise. Adversarial examples are generated by altering the original in one specific dimension of a high-dimensional vector space. Random noise is vanishingly unlikely to undo that transformation, so the classification will stay wrong.
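Added example, not part of the comment above: a toy linear classifier showing why random noise of the same per-coordinate size fails to undo a perturbation aimed against the sign of each weight. The model, weights, input and the `run_demo` name are all constructed assumptions for illustration.

```python
import random

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def run_demo(dim=1000, trials=1000, margin=5.0, seed=0):
    rng = random.Random(seed)
    # Constructed toy model (assumption): the label is "correct" when w.x > 0.
    w = [rng.gauss(0, 1) for _ in range(dim)]
    x0 = [rng.gauss(0, 1) for _ in range(dim)]
    # Shift the constructed input along w so its clean score equals `margin`.
    k = (margin - dot(w, x0)) / dot(w, w)
    x = [xi + k * wi for xi, wi in zip(x0, w)]

    # Smallest uniform per-coordinate step that moves the score to -margin
    # when every coordinate moves against the sign of its weight.
    eps = 2 * margin / sum(abs(wi) for wi in w)
    x_adv = [xi - eps * (1 if wi >= 0 else -1) for xi, wi in zip(x, w)]

    def noisy_score(v):
        # Random +/-eps on every coordinate: same per-coordinate size, no aim.
        return dot(w, [vi + eps * rng.choice((-1, 1)) for vi in v])

    # How often does unaimed noise push the perturbed input back to "correct"?
    restored = sum(noisy_score(x_adv) > 0 for _ in range(trials))
    # How often does the same noise break the clean input?
    broken = sum(noisy_score(x) <= 0 for _ in range(trials))
    return {
        "eps": eps,
        "clean_score": dot(w, x),
        "adv_score": dot(w, x_adv),
        "noise_restored": restored,
        "noise_broke_clean": broken,
    }

if __name__ == "__main__":
    for name, value in run_demo().items():
        print(name, value)
```

The aimed step shifts the score by `eps` times the L1 norm of `w`, while random signs shift it by roughly `eps` times the L2 norm, which is far smaller in high dimensions.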
I wonder how sensitive these results are. I mean, if you run one more learning round of the network, does the rifle turn into a turtle in the eyes of the network immediately, so that you would need to generate a new turtle for every single network? Or does that turtle look like a rifle for all current neural networks? Or, most likely, somewhere in between?
Because if I understood anything correctly, the method tries to find the smallest possible changes that cause the network to make the incorrect classification. And these smallest possible changes just might be very sensitive to whatever small random ripple weight changes a network has. But I am far from a neural network expert, so I can't really answer.