I was not talking about overfitting. I've seen that paper.
The original paper asked whether images that fool DBN.a also fool DBN.b. The answer was: certainly not all of the time. But they used the exact same train set and architecture for DBN.a and DBN.b, varying only the random initial weights. That seems too weak a test compared to a voting ensemble built from nets with different architectures, train sets and tuning. Can they also find images that fool DBN.a through DBN.z at once? (Something like the sketch below.)
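A rough sketch of what I mean by the ensemble check, just to make the idea concrete. The models and their `predict` method are hypothetical placeholders (each one standing for an independently trained net returning a single label), not anything from the paper:

```python
# Hypothetical sketch: an image only "fools" the ensemble if the majority
# vote over independently trained nets is wrong, not just one net.
from collections import Counter

def ensemble_predict(models, image):
    # models: list of independently trained classifiers, each returning one label
    votes = [m.predict(image) for m in models]
    return Counter(votes).most_common(1)[0][0]

def fools_ensemble(models, image, true_label):
    # True only if the combined vote is wrong
    return ensemble_predict(models, image) != true_label
```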
Also, to test whether a net can learn to recognize these fooling images, they simply add them to the train sets. The noisy images would be far simpler to detect: they have much greater complexity than natural images. For the artsy images, a quick k-nearest-neighbors run should show that they do not look much like anything the net has seen before, which would flag them as likely adversarial (rough sketch below).
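To illustrate the k-nearest-neighbors idea, here is a minimal sketch of that kind of novelty check. The feature representation, the choice of k and the 99th-percentile cutoff are all assumptions of mine, not something either paper proposes:

```python
# Sketch: flag an input as suspicious when it sits much farther from the
# training set than training points typically sit from each other.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def fit_novelty_detector(train_features, k=5):
    nn = NearestNeighbors(n_neighbors=k).fit(train_features)
    # Distance of each training point to its k nearest neighbors (dropping the self-match)
    dists, _ = nn.kneighbors(train_features, n_neighbors=k + 1)
    threshold = np.percentile(dists[:, 1:].mean(axis=1), 99)  # assumed cutoff
    return nn, threshold

def looks_adversarial(x_features, nn, threshold):
    dists, _ = nn.kneighbors(x_features.reshape(1, -1))
    # Far from everything seen in training -> possibly a fooling image
    return dists.mean() > threshold
```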
To be clear I meant this paper (http://arxiv.org/abs/1312.6199) as the original paper for adversarial images. I think they did try transferring them between very different NNs:
>In addition, the specific nature of these perturbations is not a random artifact of learning: the same perturbation can cause a different network, that was trained on a different subset of the dataset, to misclassify the same input.
>a relatively large fraction of examples will be misclassified by networks trained from scratch with different hyper-parameters (number of layers, regularization or initial weights). The above observations suggest that adversarial examples are somewhat universal...
I believe the original paper tried ensembles and even got the images to work on different networks.