
Amazing. I’ll see if I can get this working on Mac too. I have so many use cases for this.

30 years of audio that needs transcribing, summaries, and worksheets made out of them.




Whisper works ... kinda. I'm hoping another set of models gets released at some point. The error rate isn't appalling to me because I'm transcribing TV shows and radio shows for personal use, so it's not mission critical.

There are a few Whisper diarization "projects", but I've never been able to get any of them to work. Whisper does have word-level timestamps, so it should be simple to "plug in" diarization.
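For anyone wondering what "plugging in" diarization on top of word-level timestamps could look like, here is a rough sketch in plain Python. It assumes you already have two lists: words with start/end times (the kind of thing Whisper's word timestamps give you) and speaker turns from some diarizer. The `Word` and `Turn` classes, the function names, and the `SPEAKER_00`-style labels are all made up for illustration; this is not how whisperX or any particular project does it. Each word simply goes to the turn it overlaps the most.

```python
from dataclasses import dataclass


@dataclass
class Word:
    # Hypothetical container for one transcribed word and its timestamps (seconds).
    text: str
    start: float
    end: float


@dataclass
class Turn:
    # Hypothetical container for one diarizer output segment.
    speaker: str
    start: float
    end: float


def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two time intervals (0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(words, turns, unknown="UNKNOWN"):
    """Label each word with the speaker whose turn overlaps it the most."""
    labeled = []
    for w in words:
        best_speaker, best_overlap = unknown, 0.0
        for t in turns:
            o = overlap(w.start, w.end, t.start, t.end)
            if o > best_overlap:
                best_speaker, best_overlap = t.speaker, o
        labeled.append((best_speaker, w.text))
    return labeled


def group_lines(labeled):
    """Merge consecutive words from the same speaker into transcript lines."""
    lines = []
    for speaker, text in labeled:
        if lines and lines[-1][0] == speaker:
            lines[-1][1].append(text)
        else:
            lines.append((speaker, [text]))
    return [f"{speaker}: {' '.join(texts)}" for speaker, texts in lines]


# Toy data, invented for the example.
words = [Word("hello", 0.0, 0.4), Word("there", 0.5, 0.9), Word("hi", 1.2, 1.5)]
turns = [Turn("SPEAKER_00", 0.0, 1.0), Turn("SPEAKER_01", 1.0, 2.0)]
transcript = group_lines(assign_speakers(words, turns))
```

The hard part in practice is probably not this merge step but getting good speaker turns in the first place, plus handling overlapping speech and words that straddle a turn boundary, which this sketch resolves naively by picking the larger overlap.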

I don't need an LLM or whatever this project has, but I will see if it's runnable and if it's any better than what a couple of podcasts I listen to use.

edit: I see some people mentioning whisperX, which is one of those things that was cool until moving fast broke things:

>As of Oct 11, 2023, there is a known issue regarding slow performance with pyannote/Speaker-Diarization-3.0 in whisperX. It is due to dependency conflicts between faster-whisper and pyannote-audio 3.0.0. Please see this issue for more details and potential workarounds.

which means that what I gain is a ~3x increase in large-v2 speed, but I instantly lose those gains with diarization unless I track down 8-month-old bug workarounds.

I'll stick with the py venv whisper install I've been using for the last 16 months, tyvm


Re: diarization, I had decent results testing this on Colab a while ago:

https://github.com/MahmoudAshraf97/whisper-diarization

I remember hitting the usual Python package hell when NeMo was updated at some point, but it seems to be decently well maintained, so give it a go.

*Edit: I remember reading somewhere that pyannote was a weak link in other repos, which might be why your other tests were not great.


I would love to hear more about your use case!



