There is a similar attack against image classifiers.
So is this just a proof of concept or can this be exploited in the wild?
Based on my understanding, you need access to the network itself (weights, biases, activation functions, architectural topology) to pull off this kind of attack. Doesn't seem like this could easily be duplicated by an outside agent.
To generate adversarial examples, you really just need the Jacobian of the network, which, without the network architecture etc., would have to be estimated via finite differences. That means running the same signal through the model many times (lots, for large signals!), with a minor perturbation to each component. I think whether or not this is feasible depends on the circumstances. Also, I could be wrong about everything because I'm not super current on this stuff.
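For concreteness, here's a minimal sketch of what that finite-difference estimation looks like in Python/numpy. The function model_loss is a made-up black-box score the attacker wants to drive down (say, the target model's loss on the transcription they're trying to force); the toy signal has 100 samples, while real audio has tens of thousands, at roughly one query per sample per estimate:

    import numpy as np

    def finite_difference_grad(model_loss, x, eps=1e-3):
        # Estimate d(loss)/d(x) by perturbing each sample of x in turn.
        # model_loss is a black box: we only see scalar outputs, never weights.
        # Cost: one extra query per component of x -- huge for real audio.
        grad = np.zeros_like(x)
        base = model_loss(x)
        for i in range(x.size):
            x_pert = x.copy()
            x_pert[i] += eps
            grad[i] = (model_loss(x_pert) - base) / eps
        return grad

    # Toy stand-in for a target model's loss (illustration only).
    rng = np.random.default_rng(0)
    w = rng.normal(size=100)
    toy_loss = lambda x: float(np.tanh(w @ x))

    x = rng.normal(size=100)           # pretend this is a short audio signal
    g = finite_difference_grad(toy_loss, x)
    x_adv = x - 0.01 * np.sign(g)      # step against the estimated gradient
    print(toy_loss(x), toy_loss(x_adv))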
"Our attack strategy consists in training a local model to substitute for the target DNN, using inputs synthetically generated by an adversary and labeled by the target DNN"
I wouldn't be surprised if these speech-to-text models are susceptible to black-box attacks in the same way that image classifiers are:
https://arxiv.org/abs/1605.07277
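Roughly, that substitute-model attack only needs the target's output labels. Here's a toy sketch under those assumptions, with a hypothetical query_target standing in for the black-box API and a small scikit-learn model as the local substitute (the paper uses a DNN plus Jacobian-based dataset augmentation, which this loop omits):

    import numpy as np
    from sklearn.neural_network import MLPClassifier

    def query_target(x):
        # Hypothetical black box: returns only a predicted label, no gradients.
        secret_w = np.array([1.5, -2.0])          # unknown to the attacker
        return int(x @ secret_w > 0)

    # 1. Synthesize inputs and have the target label them.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 2))
    y = np.array([query_target(x) for x in X])

    # 2. Train a local substitute on those (input, target-label) pairs.
    substitute = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000).fit(X, y)

    # 3. Craft adversarial examples against the substitute (full white-box
    #    access to it), then hope they transfer back to the real target.
    agreement = (substitute.predict(X) == y).mean()
    print("substitute matches the target on", round(100 * agreement, 1), "% of queries")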
That's the purpose of magic[1], isn't it? Maybe one day magic tricks will be designed by AI. It'd be funny, since that would give Arthur C. Clarke's famous quote ("any sufficiently advanced technology...") a whole new meaning.
I'm pretty sure some alarms use a strobe light at a particular frequency combined with a loud 2kHz tone to discombobulate intruders. That's more sensory overload than an adversarial signal, though.
You could argue that cartoons or line drawings are "adversarial examples" that are reliably identified by the human visual system as representing a particular thing despite being quantitatively nothing like what they're meant to represent.
Adding noise to a signal to cause it to be perceived unintuitively: it's possible that, as others have pointed out, optical illusions are an example of this. But in practice, if such a thing existed, I'm not sure we'd be able to tell it's adversarial.
It won't work on a Google phone right now. From the paper: "The audio adversarial examples we construct in this paper do not remain adversarial after being played over-the-air, and therefore present a limited real-world threat; however, just as the initial work on image-based adversarial examples did not consider the physical channel and only later was it shown to be possible, we believe further work will be able to produce audio adversarial examples that are effective over-the-air."
However, prior work [1] was shown to work on Google phones; it is just much more noticeable.
I believe this requires the attacker to have access to the ASR neural net's weights, so Mozilla's seems like the only popular framework that's vulnerable right now (not that I'm opposed to them keeping things open).
This is exactly why we need to keep ASR tech open. These kinds of issues affect DNNs in general, and other ASR engines based on them are also vulnerable. Having models and the data used to train them open is a great way to help academia make progress in this space.
Adversarial examples don’t actually have to be generated with the same neural network. If you want to attack another NN, you can make your own with similar training data and similar structure and the attacks can carry over to some extent.
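As a toy illustration of that transfer, assuming two independently trained scikit-learn models standing in for the two networks: craft an FGSM-style perturbation using only the surrogate's weights, then check whether it also moves the other model's prediction. All names here are made up for the sketch, and transfer rates between real DNNs vary a lot:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(1)

    def make_data(n):
        # Similar-but-not-identical training sets for the two models.
        X = rng.normal(size=(n, 20))
        y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)
        return X, y

    surrogate = LogisticRegression(max_iter=1000).fit(*make_data(2000))  # attacker's own model
    target = LogisticRegression(max_iter=1000).fit(*make_data(2000))     # the "victim" model

    # FGSM-style step computed from the surrogate's gradient alone
    # (for logistic regression that gradient is just its weight vector).
    x = rng.normal(size=20)
    eps = 0.5
    direction = -np.sign(surrogate.coef_[0]) * (2 * surrogate.predict([x])[0] - 1)
    x_adv = x + eps * direction

    print("surrogate:", surrogate.predict([x])[0], "->", surrogate.predict([x_adv])[0])
    print("target:   ", target.predict([x])[0], "->", target.predict([x_adv])[0])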