Please correct me if I'm interpreting this incorrectly. I read the paper, and it sounds like you retrained Inception's softmax layer to classify the 3D-printed turtle as a rifle. In that case, you would have overwritten Inception's original representation of what a rifle looks like. Did you test what would happen if you put a picture of an actual rifle in front of the camera? How would the network classify it now?
>given access to the classifier gradient, we can make adversarial examples
It seems like they are finding little "inflection points" in the trained network where a small, well-placed change to the input can flip the result to something else entirely. With the rise of "AI for AI", I imagine this is something that could be optimized against.
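For what it's worth, the basic recipe really is that simple: take the gradient of the classification loss with respect to the input pixels and nudge the image a small step in the direction that favors the target class. Here's a minimal sketch of the targeted version (roughly one FGSM step, not the paper's exact expectation-over-transformations method); torchvision's pretrained Inception v3, the random placeholder image, the class index, and epsilon are all my assumptions:

```python
import torch
import torch.nn.functional as F
import torchvision.models as models

# Assumptions: torchvision's pretrained Inception v3 stands in for the paper's model,
# and the input is a random placeholder rather than a real turtle photo.
model = models.inception_v3(pretrained=True).eval()
image = torch.rand(1, 3, 299, 299, requires_grad=True)
target = torch.tensor([764])  # 764 = "rifle" in the standard ImageNet label map (assumed)

# The loss is low when the model calls the image a rifle, so step the pixels downhill on it.
loss = F.cross_entropy(model(image), target)
loss.backward()

epsilon = 0.01  # perturbation budget; small enough to be hard to notice
adversarial = (image - epsilon * image.grad.sign()).clamp(0, 1).detach()
print(model(adversarial).argmax(dim=1))  # repeated steps push this toward the target class
```

A single step usually isn't enough to flip the label; iterating the same update under a small total perturbation budget is the standard trick, and the network weights are never touched.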
In the turtle example, it seems that Google's classifier has found that looking for a couple of specific features (mostly the trigger, in this case) identifies a gun better than looking at the gun as a whole. Perhaps optimizing against these inflection points would force the classifier to build a better understanding of the objects it is classifying and lead to better results in non-adversarial situations as well.
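That "optimizing against these inflection points" idea already has a name: adversarial training. You craft perturbed inputs on the fly during training and require the network to classify them correctly too. A rough sketch of one training step (simplified to a single gradient-sign perturbation, in the spirit of Goodfellow et al. / Madry et al.; `model`, `images`, `labels`, and `opt` are assumed to come from an ordinary training loop):

```python
import torch
import torch.nn.functional as F

def adversarial_training_step(model, images, labels, opt, epsilon=0.01):
    """One training step on perturbed inputs (assumed shapes: images [B,3,H,W], labels [B])."""
    # First, craft the perturbed batch using the current model's input gradient.
    images = images.clone().requires_grad_(True)
    F.cross_entropy(model(images), labels).backward()
    perturbed = (images + epsilon * images.grad.sign()).clamp(0, 1).detach()

    # Then train on it, so this particular "inflection point" stops working.
    opt.zero_grad()
    loss = F.cross_entropy(model(perturbed), labels)
    loss.backward()
    opt.step()
    return loss.item()
```

In practice this makes the model noticeably more robust to these attacks, though whether it yields better accuracy on clean, non-adversarial inputs is still an open question.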