Almost certainly, but this is true of most low-bitrate codecs. I've got a very deep voice and it becomes largely unintelligible in marginal mobile signal conditions. If anything, this one might be more tweakable and/or personalizable than what we use today.
To some extent, surely. Their samples include some music and some audio with background noise. The music survives OK, and the clanging of the background noise is reduced to clicking, so maybe languages with click sounds do OK too.
I suppose there are a few futures:
- The paper was very innovative but nothing really happens in the ‘real world’
- Google roll it out globally in one of their products and we discover that some voices/accents/languages work poorly (or well)
- The same but with slow rollout, feedback, studies, and improvements to try to make sure everyone gets a good result.