Tell HN: OpenAI Whisper Is Overfitted
16 points by lostmsu on Dec 8, 2022 | 6 comments
I wired it up to transcribe song lyrics from my local recordings. For Rammstein's Deutschland it produced an English translation (the song is sung in German)!

Along the same lines, it sometimes appends a line to the end of the transcribed text that reads "lyrics was contributed by XXX", which is obviously not being sung. It hallucinates that line from dirty training data.



I noticed this too. I tried to transcribe part of a movie, and in the intro section it spat out "The translations have been done by www.??.com". They clearly trained these on public SRTs.
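
A rough workaround for the credit lines described above is to filter them out after transcription. This is only a sketch: the two patterns are assumptions built from the examples quoted in this thread, and `strip_credit_lines` is a hypothetical helper, not part of Whisper.

```python
import re

# Hypothetical patterns, based only on the two examples quoted in this thread.
CREDIT_PATTERNS = [
    re.compile(r"\blyrics\s+(was|were)\s+contributed\s+by\b", re.IGNORECASE),
    re.compile(r"\btranslations?\s+(have|has)\s+been\s+done\s+by\b", re.IGNORECASE),
]

def strip_credit_lines(transcript: str) -> str:
    """Drop any line that matches one of the known credit patterns."""
    kept = [
        line
        for line in transcript.splitlines()
        if not any(pattern.search(line) for pattern in CREDIT_PATTERNS)
    ]
    return "\n".join(kept)
```

It only catches phrasings you already know about, so it treats the symptom rather than the training-data problem.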


Does Whisper ever respond with "I don't understand" or does it always try to amble on and give an answer, no matter the confidence of the prediction?
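
One way to approximate an "I don't understand" is to drop low-confidence segments yourself. The sketch below assumes segment dicts shaped like the ones the open-source `whisper` Python package returns from `transcribe()`, with `avg_logprob`, `no_speech_prob` and `compression_ratio` keys. The threshold values are assumptions to tune, and `low_confidence` and `confident_text` are hypothetical helpers.

```python
def low_confidence(segment, min_avg_logprob=-1.0,
                   max_no_speech_prob=0.6, max_compression_ratio=2.4):
    """Return True if a segment looks unreliable under the assumed thresholds."""
    return (
        segment["avg_logprob"] < min_avg_logprob            # model was unsure of its tokens
        or segment["no_speech_prob"] > max_no_speech_prob   # probably no speech at all
        or segment["compression_ratio"] > max_compression_ratio  # highly repetitive text
    )

def confident_text(segments):
    """Join the text of the segments that pass the confidence checks."""
    return " ".join(
        seg["text"].strip() for seg in segments if not low_confidence(seg)
    )
```

If `confident_text` comes back empty, that is about as close to "I don't understand" as this approach gets.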


Isn't transcribing into English one of Whisper's key features?


Yes, but it did not hear German and then translate it into English. It behaved as if it "heard" an English translation.


I assume you set the language flag correctly?


It has a mode where it detects the language.
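
For reference, here is a minimal sketch of pinning the language instead of relying on detection, assuming the open-source `whisper` Python package (`pip install openai-whisper`). `transcribe_options` and `transcribe_file` are hypothetical helper names, and the `"base"` model size is an arbitrary choice.

```python
def transcribe_options(language=None):
    """Build keyword arguments for model.transcribe().

    task="transcribe" asks for text in the spoken language, while
    task="translate" asks for English output. language=None leaves
    language detection to the model.
    """
    return {"task": "transcribe", "language": language}

def transcribe_file(path, language=None, model_name="base"):
    """Transcribe one audio file and return (language, text)."""
    import whisper  # imported lazily; requires the openai-whisper package

    model = whisper.load_model(model_name)
    result = model.transcribe(path, **transcribe_options(language))
    return result["language"], result["text"]
```

For the German song in the original post that would be `transcribe_file("deutschland.mp3", language="de")`, where the file name is made up. Whether pinning the language stops the translated output is something to test, not a given.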



