
I do not use the wav2letter@anywhere inference frontend - I trained the acoustic model using the Facebook upstream code, but the decoder is almost entirely new, and on Windows I use Pytorch for inference.
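For a rough idea of what that Pytorch inference path looks like, here's a minimal sketch: it loads a TorchScript export and does a trivial greedy CTC collapse, not the real decoder, and the model file, label set, and feature shape are all placeholders rather than the actual Talon code.

```python
# Minimal sketch of PyTorch-side inference with a greedy CTC collapse.
# "acoustic.pt", LABELS, and the feature layout are placeholders, not
# the real Talon model, alphabet, or frontend.
import torch

LABELS = ["<blank>", "a", "b", "c", " "]  # hypothetical CTC alphabet, blank at index 0

model = torch.jit.load("acoustic.pt")  # TorchScript export of the acoustic model
model.eval()

def transcribe(features: torch.Tensor) -> str:
    # features: (time, n_features) frames from whatever feature frontend is in use
    with torch.no_grad():
        log_probs = model(features.unsqueeze(0))   # assumed output shape (1, time, n_labels)
    ids = log_probs.argmax(dim=-1).squeeze(0)      # best label per frame
    # Greedy CTC collapse: drop repeated labels, then drop blanks.
    out, prev = [], None
    for i in ids.tolist():
        if i != prev and i != 0:
            out.append(LABELS[i])
        prev = i
    return "".join(out)
```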

Talon ships with a libw2l.so/dylib on Linux/Mac built from my open source repos.




Feels like wav2letter will not be actively developed anymore. Understandable, since it is hard to compete with Pytorch using a custom NN toolkit. Any plans to move to Pytorch/Tensorflow?


I don’t think that’s right at all. They moved development to the Flashlight repo. It seems very actively developed to me - last commit 3 hours ago. The wav2vec 2.0 blog post also went up last month (September), and the current state of the art, iirc, is a Google model based on Facebook’s wav2vec 2.0 work.

For my own use I’ve already built Pytorch and CoreML frontends, with a shared model format (I can convert models to/from wav2letter format and my custom format), and I can create new models in these frameworks from wav2letter architecture files.
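The wav2letter architecture files are essentially one layer per line, so the translation is a token-to-module mapping along these lines. This is only an illustrative sketch: the layer tokens and parameter order shown here are simplified placeholders, not a complete or faithful parser of the real format.

```python
# Rough sketch of turning a wav2letter-style architecture file (one layer
# per line) into a torch.nn.Sequential.  The tokens and argument order
# below are illustrative/simplified, not the actual wav2letter grammar.
import torch.nn as nn

def build_from_arch(path: str) -> nn.Sequential:
    layers = []
    with open(path) as f:
        for line in f:
            tok = line.split("#")[0].split()   # drop comments and blank lines
            if not tok:
                continue
            kind, args = tok[0], tok[1:]
            if kind == "C1":      # hypothetical 1-D conv: in, out, kernel, stride
                cin, cout, kw, stride = map(int, args[:4])
                layers.append(nn.Conv1d(cin, cout, kw, stride))
            elif kind == "GLU":   # gated linear unit along a given dim
                layers.append(nn.GLU(dim=int(args[0])))
            elif kind == "DO":    # dropout probability
                layers.append(nn.Dropout(float(args[0])))
            elif kind == "L":     # linear: in, out
                layers.append(nn.Linear(int(args[0]), int(args[1])))
            else:
                raise ValueError(f"unhandled layer token: {kind}")
    return nn.Sequential(*layers)
```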

I still run my training in the wav2letter framework, but for compatible training in Pytorch I would mostly just need criterion implementations. I assume warpCTC is fine for the CTC models. There’s also a third-party Pytorch ASG criterion package, but I haven’t tried it yet.
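For the CTC side specifically, Pytorch's built-in criterion would probably also do the job; a warpCTC binding or the third-party ASG criterion would slot into the same place. A minimal sketch with dummy tensors:

```python
# Sketch of a CTC training step using Pytorch's built-in torch.nn.CTCLoss.
# Shapes and tensors here are dummies just to show the calling convention.
import torch
import torch.nn as nn

ctc = nn.CTCLoss(blank=0, zero_infinity=True)

T, N, C = 200, 8, 30  # frames, batch, label count (index 0 reserved for blank)
log_probs = torch.randn(T, N, C, requires_grad=True).log_softmax(dim=-1)  # model output, (T, N, C)
targets = torch.randint(1, C, (N, 25), dtype=torch.long)                  # label ids, no blanks
input_lengths = torch.full((N,), T, dtype=torch.long)
target_lengths = torch.randint(10, 26, (N,), dtype=torch.long)

loss = ctc(log_probs, targets, input_lengths, target_lengths)
loss.backward()
```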


Interesting. What are you using for your acoustic model - the streaming convnet?

I didn't know there was a Pytorch implementation of the w2l architectures.



