Is there something particular about Vietnamese that makes it easier than other languages for dictation? Like, does it lack homophones, or are letters always pronounced the same way? 100% correct sounds incredible.
English is highly non-phonemic. The only languages I would think are worse are French and Greek, with their silent letters and different accents; in Modern Greek, /i/ can be written in six different ways: ι, η, υ, ει, οι and υι. I'd guess Italian and most Eastern European languages would be the easier European languages for voice transcription.
This seriously reminds me of my days in college and grad school learning/teaching dead languages. We really don't know how the dead languages (Latin, ancient Hebrew, Classical/Koine Greek and Aramaic) were pronounced, so we just made them phonetic, which tells you that we don't speak them correctly, since no language is 100% phonetic.
Vietnamese writing is pretty much phonetic. Technically, you could learn the alphabet and read Vietnamese almost perfectly without prior knowledge of how words are pronounced (obviously there are some exceptions, but there is much, much less variation than in English). Also, different regions in Vietnam pronounce certain letters differently, but within a region the per-letter pronunciation is very consistent.
I suspect Vietnamese tones (6 tones in total) give a stronger, more consistent, "orthogonal" signal that helps speech-to-text. For example, a word spoken with an "up" tone will always be spoken with an up tone. Whereas in English, depending on the word's position in a sentence, the speaker's emotion, etc., you might have great tonal variation across words, even within the same speaker's multiple utterances of the same word.
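To make the "6 tones" point concrete, here is a rough Python sketch of how the tone shows up in the writing itself: each syllable carries at most one of five tone marks, and the sixth tone is unmarked. This only looks at the spelling, not at audio, and it says nothing about how Google's recognizer actually works. The function name `tone_of` and the `TONE_MARKS` table are my own, made up for illustration; the only library call is the standard `unicodedata.normalize`.

```python
import unicodedata

# Combining marks for the five written tones; the sixth tone (ngang) is unmarked.
TONE_MARKS = {
    "\u0300": "huyền",  # grave accent
    "\u0301": "sắc",    # acute accent
    "\u0309": "hỏi",    # hook above
    "\u0303": "ngã",    # tilde
    "\u0323": "nặng",   # dot below
}

def tone_of(syllable: str) -> str:
    """Return the tone name written on a single Vietnamese syllable."""
    # NFD splits precomposed letters into a base letter plus combining marks,
    # so the tone mark becomes its own code point. Vowel-quality marks
    # (circumflex, breve, horn) are not in TONE_MARKS and are skipped.
    for ch in unicodedata.normalize("NFD", syllable):
        if ch in TONE_MARKS:
            return TONE_MARKS[ch]
    return "ngang"  # no tone mark found

# The classic six-way contrast on the syllable "ma".
for s in ["ma", "mà", "má", "mả", "mã", "mạ"]:
    print(s, tone_of(s))
```

The six spellings of "ma" each map to a different tone, which is the kind of one-to-one relationship between sound and writing that English spelling does not have.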
When you listen to Vietnamese, it sounds very "short and choppy", with a lot of tonal variation. I suspect that the recurrent neural nets Google is training for speech recognition do better inference over these separable, tonally consistent utterances than over harder-to-separate and highly variable English speech.
My wife just spoke full-speed Vietnamese into a Google Doc, 100% correct, even the tone marks.