
I would be greatly interested in knowing how you set all that up if you felt like sharing the specifics.



My hope is to make this easy with a GH repo or at least detailed instructions.

I'm on a Mac, and I found the easiest way to run and use local models is Ollama, since it has a REST interface: https://github.com/ollama/ollama/blob/main/docs/api.md

I just have a local script that pulls the audio file from Voice Memos (after it syncs from my iPhone), runs it through OpenAI's Whisper (really the best at speech to text; excellent results), and then makes sense of it all with a prompt that asks for organized summary notes and todos in GitHub-flavored Markdown. That final output goes into my Obsidian vault. The model I use is llama3.1, but I haven't spent much time testing others. I find you don't really need the largest models, since the task is to organize text rather than augment it with a lot of external knowledge.
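For anyone who wants a starting point, here is a minimal sketch of a pipeline like this. It is not the actual script described above. It assumes the `openai-whisper` package (which needs ffmpeg), Ollama running on its default local port, and a prompt wording that is purely hypothetical, since the real prompt was not shared.

```python
import json
import urllib.request

# Ollama's default local endpoint for one-shot text generation.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "llama3.1"

# Hypothetical prompt wording (an assumption, not the original prompt).
PROMPT_TEMPLATE = (
    "Below is a raw transcript of a voice memo recorded on a walk.\n"
    "Write organized summary notes followed by a todo list, "
    "in GitHub-flavored Markdown. Ignore side conversations and dead air.\n\n"
    "Transcript:\n{transcript}"
)


def transcribe(audio_path: str, model_size: str = "base") -> str:
    """Transcribe an audio file locally with the openai-whisper package."""
    # Imported lazily so the rest of the module works without it installed.
    import whisper  # pip install openai-whisper

    model = whisper.load_model(model_size)
    return model.transcribe(audio_path)["text"].strip()


def build_payload(transcript: str, model: str = MODEL) -> dict:
    """Build the JSON body for Ollama's /api/generate endpoint."""
    return {
        "model": model,
        "prompt": PROMPT_TEMPLATE.format(transcript=transcript),
        "stream": False,  # ask for one JSON object instead of a stream
    }


def summarize(transcript: str) -> str:
    """Send the transcript to the local Ollama server and return its reply."""
    data = json.dumps(build_payload(transcript)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Usage would be something like `notes = summarize(transcribe("memo.m4a"))`, then writing `notes` to a `.md` file inside the Obsidian vault folder. The file name and vault path are up to you.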

Humorously, the hardest part of the process was finding where the hell Voice Memos actually stores these audio files. I wish you could set the location yourself! They live deep inside ~/Library/Containers. Voice Memos has no export feature, but I found you can drag any audio recording out of the left sidebar to the desktop or a folder. So I just drag the voice memo into a folder my script watches, and then it runs the automation.
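The "folder my script watches" part can be sketched with plain polling from the standard library. This is one possible approach under assumptions, not the setup described above: the suffix list, the polling interval, and the `handle` callback are all hypothetical.

```python
import time
from pathlib import Path

# Assumed set of audio extensions; Voice Memos drags out .m4a files.
AUDIO_SUFFIXES = {".m4a", ".mp3", ".wav"}


def new_recordings(folder: Path, seen: set) -> list:
    """Return audio files in `folder` not yet in `seen`, and mark them seen."""
    found = sorted(
        p
        for p in folder.iterdir()
        if p.is_file()
        and p.suffix.lower() in AUDIO_SUFFIXES
        and p.name not in seen
    )
    seen.update(p.name for p in found)
    return found


def watch(folder: Path, handle, interval: float = 5.0) -> None:
    """Poll `folder` forever, calling `handle(path)` once per new recording."""
    seen = set()
    while True:
        for path in new_recordings(folder, seen):
            handle(path)
        time.sleep(interval)
```

One caveat: a large file may still be mid-copy when it is first seen, so a sturdier version would wait until the file size stops changing before handing it off.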

If anyone has another, better option for recording your voice on an iPhone, let me know! The nice thing about all this is you don't even have to start / stop the recording ever on your walk... just leave it going. Dead space and side conversations and commands to your dog are all well handled and never seem to pollute my notes.


Have you tried the Shortcuts app? It's on both iPhone and Mac. You should be able to make a shortcut that finds and moves a voice memo when run. You can run shortcuts on a button press or via automation.

Also, what kind of local machine do you need? I have an iMac Pro and I'm wondering whether it will run the models or whether I ought to be on an Apple silicon machine. I have an M1 MacBook Air as well.


You could also use the "share" menu and AirDrop the audio from your iPhone to your Mac. Files end up in Downloads by default.


Amazing, thank you for this!


You can record your voice messages and send them to yourself in Telegram. They're saved on-device. You can then create a bot to process them as they come in, like "transcribe new ogg files and write back the text as a message after the voice memo".


Thanks for this!



