
Kokoro is fine for TTS, but it lacks emotion. For a model of this size, though, that is kind of a given.


Ironic given the name: kokoro is Japanese for heart or sentiment.


I played with ebook generation a bunch and found that (at least for English text) around 1B parameters is needed to get something emotionally usable (Chatterbox is 0.5B, Orpheus is 3B).



