Hacker News | ggulati's comments

If you use Sphinx for speech recognition and pyttsx for text-to-speech (Windows Speech API, OSX NSSS, or ESpeak on Linux), it all works offline; see the "Jarvis's Brain" section.
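A minimal sketch of that offline setup, for illustration only. It assumes the `SpeechRecognition` package with PocketSphinx installed for recognition, plus `pyttsx` for speech output. The function names `speak` and `listen_once` are made up for this example, and the third-party imports are kept inside the functions so the file loads even when the packages are missing.

```python
def speak(text, engine=None):
    """Say `text` aloud through pyttsx (SAPI, NSSS, or ESpeak underneath).

    `engine` is a hypothetical hook: pass any object with say() and
    runAndWait() to exercise the function without audio hardware.
    """
    if engine is None:
        import pyttsx  # assumption: installed via `pip install pyttsx`
        engine = pyttsx.init()
    engine.say(text)        # queue the utterance
    engine.runAndWait()     # block until it has been spoken


def listen_once():
    """Record one phrase from the default microphone and transcribe it
    locally with Sphinx. Returns None if the audio was unintelligible."""
    # Assumption: the SpeechRecognition package and PocketSphinx are installed.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        audio = recognizer.listen(source)
    try:
        return recognizer.recognize_sphinx(audio)  # no network call
    except sr.UnknownValueError:
        return None
```

Calling `speak(listen_once() or "Sorry, I did not catch that")` would echo back whatever was heard, with no request leaving the machine.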


Most of the stuff I found was for Python 2.7! I'll edit that into the post. My focus was on finding libraries that work with new Python code, e.g. Python 3.5 code.

All of those libraries have Python 2.7 versions. In fact, for all of them you pip install the same package; for pyttsx on 2.7, run `pip install pyttsx` and ignore jpercent's update.

I'm not sure what you mean about pricing and testing for development. Are you referring to Google's services? They offer 50 reqs/day for voice recognition on a free developer API key (https://www.chromium.org/developers/how-tos/api-keys). gTTS goes through Google Translate, which will rate-limit or block you if you send too many reqs/min or reqs/day without an appropriately registered API key, but you could certainly play around with it.
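One way to stay inside a quota like that while developing is to count requests on the client side. This is only a rough sketch: `DailyQuota` is a hypothetical helper, the default of 50 comes from the figure above, and resetting at the UTC day boundary is an assumption, since the comment does not say when Google's counter resets.

```python
import time


class DailyQuota(object):
    """Client-side counter that allows at most `limit` requests per UTC day."""

    def __init__(self, limit=50, clock=time.time):
        self.limit = limit
        self.clock = clock   # injectable so the reset logic can be tested
        self.day = None      # UTC day number of the current window
        self.used = 0        # requests spent in the current window

    def try_acquire(self):
        """Return True and spend one request if budget remains, else False."""
        today = int(self.clock() // 86400)  # seconds per day
        if today != self.day:
            # New day (or first call): start a fresh window.
            self.day = today
            self.used = 0
        if self.used >= self.limit:
            return False
        self.used += 1
        return True
```

Check `quota.try_acquire()` before each online request; when it returns False, falling back to offline Sphinx recognition keeps the program usable.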

If voice recognition is important, it might be worth investigating Sphinx more and putting in the time to tweak their English language model files. Synthesis is more difficult, though I think the Windows SAPI, OSX NSSS, and ESpeak on *nix are all "good enough." There is also a range of commercial libraries.


I too thought it was Python 3 only before I read it. Maybe a better title would be "Coding Jarvis in Python in 2016", with the first paragraph explaining that this is Python 2 and 3 compatible and that your personal focus is on 3?


Thanks for the feedback; I updated the blog post.


