
My main issue with doing anything voice related was that the last time I looked into using Pocketsphinx, I needed to define terms/dictionaries to parse from.

I'd love to mix and match NLP libraries, voice synthesis, voice identification, and speech recognition to make a comfortable "User Interface" to some systems in my house.

I think it'd be a fun project, but nothing seems to be able to take arbitrary audio streams and give me both a "User identification" based on voice patterns and a transcript of arbitrary spoken text.

I know, yes, this is a VERY tall order, but it's something that should be possible. At the very least, the identification part isn't needed. It's just important that it works offline and provides a text stream.



