The video is only 240p and quite shaky. Since it was published by SoundHound Inc. itself, is that a marketing technique to make it look more amateurish?
Such low latency suggests the demo was done over Wi-Fi inside the SoundHound building, especially if the speech recognition runs server-side. Or which speech recognition software does the demo app use? Client-side Nuance software? Android 5's voice recognition isn't that fast.
That's insanely fast for compound natural language queries. I'm impressed.