
Since this is explicitly targeted at "the next billion users," do we have any sense of how well-optimized this is on non-English audio corpuses? I can't imagine that a model trained primarily on English/Western phonemes would perform as well on the rest of the world.



They say they tested it on 70+ languages.


Ah you're right. I couldn't find it on the original link in this post, but the post links to https://ai.googleblog.com/2021/02/lyra-new-very-low-bitrate-..., which mentions the 70+ languages statistic under the "Fairness" section. Thanks!


Which is fewer than the number of languages spoken in India alone.


Wonder if India will ever go through a forced linguistic convergence like China did.


Unlikely, there's too much pride in each local language. It might all converge on English over a couple of generations, though more for commercial reasons.


Or even in New York City public schools.


Yeah, I wonder about those weird languages with lots of clicks... (though they are probably not part of the next billion)



