
Sphinx is pretty awful (remember the time before good speech recognition existed?). Alexa is far better.

Kaldi is much better, but very difficult to set up.

None of the open source speech recognition systems (or commercial for that matter) come close to Google.





Ah interesting link, I hadn't seen that.


> None of the open source speech recognition systems (or commercial for that matter) come close to Google.

Is that because of the data they have, or because of their superior algorithms?


It’s because they used data from the public to train their models.

If someone suddenly applied the principle that copyright bans remixes to the training of neural networks, along with the principle that licenses for such use have to be granted explicitly, Google would lose 90% of its advantage over other companies.

Personally, I’d favor requiring companies to open source their trained models if the training data included data supplied by users rather than paid employees.


I think the world needs the equivalent of OpenStreetMap but for speech data, so that the data is under a copyleft license that legally enforces reciprocation when the corpus is used or modified.


The closest is VoxForge:

http://voxforge.org/


Didn't know about VoxForge. Thanks!


I sympathize, but good luck with that :)


Both, I think, but mostly the data. Baidu's Deep Speech is meant to be very good, and its design is public (they even open sourced one component of it).



