How does the "accent conversion model" work? Is it entirely embedding-based?

If so, and if you want to transfer-learn new downstream models from the embeddings, then it seems to me you're onto a very effective way of doing data augmentation. Augmenting raw waveforms is expensive because you have to rerun the STFT every time; but if you've precomputed and cached the embeddings and can augment there, it would be super fast.
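
To make the idea concrete, here is a rough sketch of what augmenting cached embeddings could look like. I don't know how the actual model works, so everything here is an assumption: the cache layout, the names `augment_embedding` and `mixup`, and the choice of Gaussian jitter and linear interpolation as the augmentations are just stand-ins for whatever would make sense in the real embedding space.

```python
import random

def augment_embedding(emb, noise_std=0.01, rng=None):
    """Return a jittered copy of one cached embedding.

    Adds independent Gaussian noise to each dimension. This is an assumed
    augmentation, not something the original model is known to use.
    """
    rng = rng or random.Random()
    return [x + rng.gauss(0.0, noise_std) for x in emb]

def mixup(emb_a, emb_b, lam):
    """Linearly interpolate two embeddings (lam=1.0 returns emb_a)."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(emb_a, emb_b)]

# Hypothetical cache: utterance id -> embedding computed once, up front.
cache = {
    "utt1": [0.1, 0.2, 0.3, 0.4],
    "utt2": [0.5, 0.1, -0.2, 0.0],
}

rng = random.Random(0)  # seeded so runs are repeatable

# Three noisy variants per cached embedding; no waveform or STFT is touched.
augmented = [
    augment_embedding(emb, noise_std=0.01, rng=rng)
    for emb in cache.values()
    for _ in range(3)
]

# One interpolated example halfway between the two utterances.
mixed = mixup(cache["utt1"], cache["utt2"], lam=0.5)
```

The point is only that each augmented example costs a few arithmetic operations per dimension, versus decoding audio and recomputing a spectrogram. Whether small perturbations in embedding space correspond to plausible inputs is something you'd have to check against the downstream task.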


