Ok, cool! I was actually one of the people on the hyprnote HN thread asking for a headless mode!
I was actually integrating some whisper tools yesterday. I was wondering if there was a way to get a streaming response, and was thinking it'd be nice if you can.
I'm on linux, so don't think I can test out owhisper right now, but is that a thing that's possible?
Also, it looks like the `owhisper run` command gives it's output as a tui. Is there an option for a plain text response so that we can just pipe it to other programs? (maybe just `kill`/`CTRL+C` to stop the recording and finalize the words).
Same question for streaming, is there a way to get a streaming text output from owhisper? (it looks like you said you create a deepgram compatible api, I had a quick look at the api docs, but I don't know how easy it is to hook into it and get some nice streaming text while speaking).
Oh yeah, and diarisation (available with a flag?) would be awesome, one of the things that's missing from most of the easiest to run things I can find.
Nice stuff, had a quick test on linux and it works (built directly, I didn't check out the brew). I ran into a small issue with moonshine and opened an issue on github.
Great work on this! excited to keep an eye on things.
Also had a quick play too. The TUI is garbled thanks to some stderr messages which can just be dev/null'd. I don't seem to be able to interact with the transcripts with the arrow or jk keys.
Overall though, it's fast and really impressive. Can't wait for it to progress.
Can you help me out to find where the code you've built is? I can see the folder in github[0], but I can't see the code for the cli for instance? unless I'm blind.
I was actually integrating some whisper tools yesterday. I was wondering if there was a way to get a streaming response, and was thinking it'd be nice if you can.
I'm on linux, so don't think I can test out owhisper right now, but is that a thing that's possible?
Also, it looks like the `owhisper run` command gives it's output as a tui. Is there an option for a plain text response so that we can just pipe it to other programs? (maybe just `kill`/`CTRL+C` to stop the recording and finalize the words).
Same question for streaming, is there a way to get a streaming text output from owhisper? (it looks like you said you create a deepgram compatible api, I had a quick look at the api docs, but I don't know how easy it is to hook into it and get some nice streaming text while speaking).
Oh yeah, and diarisation (available with a flag?) would be awesome, one of the things that's missing from most of the easiest to run things I can find.