Like others are saying, it's just much harder to use. The official tutorial even says "The intended audience for this tutorial is either speech recognition researchers, or graduates or advanced undergraduates who are studying this area anyway." in the first paragraph. It seems like Kaldi is meant for people who actually know how speech recognition works, while other tools are meant for people who just want some text from some audio without really understanding how.

For example, I've been playing with home automation and speech recognition, and I've been able to get any Sphinx-based recognizer working in a single sitting, in a few hours or less. But I have yet to get Kaldi working after several nights of effort. It seems much more powerful and, based on my reading, more accurate than Sphinx. But that doesn't do me any good if I can't get it to run, haha.