Just wrapped up a live translation feature that watches an HLS live stream, live...

mjfisher · on July 24, 2023

That's really cool. Managing ports is something I haven't done in Elixir yet, but managing FFmpeg through Elixir could be very useful for a few things in my current area of focus.

Can you share any details of how that works? E.g. do you operate on a single HLS chunk at a time, or can you get ffmpeg to separate a continuous audio stream, etc?

ricketycricket · on July 24, 2023

davidw is right that ports are somewhat limited, but I haven't had much trouble doing what I need with FFmpeg in particular. I used the bash wrapper from the docs for Port [^1] and it has worked well.

When a stream starts, I start a supervisor that then starts a GenServer to manage the port. On init, a port is started for FFmpeg (using the above bash wrapper) with args that send 16-bit PCM audio back through the port, where it arrives in the GenServer's `handle_info/2` callback.
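Roughly, the port-owning GenServer could look like the sketch below. This is not my production code: the module name, the `:cmd`, `:args`, and `:sink` options, and the `{:audio, pcm}` message shape are all made up for illustration. In the setup described above, `:cmd` would be the path to the bash wrapper and `:args` would carry the FFmpeg binary and its arguments.

```elixir
defmodule FFmpegPort do
  use GenServer

  # Hypothetical options: :cmd (full path to an executable), :args, :sink (pid).
  def start_link(opts), do: GenServer.start_link(__MODULE__, opts)

  @impl true
  def init(opts) do
    cmd = Keyword.fetch!(opts, :cmd)
    args = Keyword.get(opts, :args, [])
    sink = Keyword.fetch!(opts, :sink)

    # :binary delivers output as binaries instead of charlists;
    # :exit_status delivers a message when the external program exits.
    port = Port.open({:spawn_executable, cmd}, [:binary, :exit_status, args: args])

    {:ok, %{port: port, sink: sink}}
  end

  @impl true
  # Output from the external program arrives as {port, {:data, binary}}.
  def handle_info({port, {:data, pcm}}, %{port: port} = state) do
    send(state.sink, {:audio, pcm})
    {:noreply, state}
  end

  # Stop with a non-normal reason so the supervisor restarts this process.
  def handle_info({port, {:exit_status, status}}, %{port: port} = state) do
    {:stop, {:port_exited, status}, state}
  end
end
```

Note that `:spawn_executable` does not search `PATH`, so the command has to be a full path (`System.find_executable/1` can resolve one).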

When a new live HLS segment is downloaded by FFmpeg, the entire segment's audio is sent to the GenServer all at once (it could be a few `handle_info/2` calls, but it happens quickly). Since I want to work in small fixed chunks, I send the segment's audio to an AudioBuffer GenServer (started as a sibling under the same supervisor). This buffer uses binary pattern matching to split the audio into chunks exactly 2 seconds long while keeping any remainder in the GenServer's state for the next buffer event. I then send the chunks to another GenServer, ChunkBuffer, that pops chunks at 2-second intervals for processing.
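The binary pattern matching part is small enough to sketch. The sample rate and channel count below are assumptions for the example (16 kHz mono), not something stated above; only the 16-bit sample size and the 2-second chunk length come from the description. The module and function names are hypothetical too.

```elixir
defmodule AudioChunker do
  # Assumed audio format: 16 kHz, mono, 16-bit (2-byte) PCM samples.
  @sample_rate 16_000
  @bytes_per_sample 2
  @chunk_seconds 2
  @chunk_bytes @sample_rate * @bytes_per_sample * @chunk_seconds

  def chunk_bytes, do: @chunk_bytes

  # Splits `buffer` into as many full chunks as fit and returns
  # {chunks, remainder}; the remainder is always shorter than one chunk.
  def split(buffer, chunk_bytes \\ @chunk_bytes) when is_binary(buffer) do
    do_split(buffer, chunk_bytes, [])
  end

  defp do_split(buffer, size, acc) do
    case buffer do
      <<chunk::binary-size(size), rest::binary>> ->
        do_split(rest, size, [chunk | acc])

      remainder ->
        {Enum.reverse(acc), remainder}
    end
  end
end
```

Inside the buffer GenServer, each incoming piece of audio would be appended to the stored remainder before splitting, e.g. `AudioChunker.split(state.remainder <> new_audio)`, and the new remainder goes back into the state.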

Since everything is supervised, if (when...) FFmpeg crashes the supervisor just restarts it. Meanwhile, the audio in the buffer is still processing and nothing goes down. There might be a duplicate word or two in the transcription if the restarted port processes a segment again, but everything keeps running smoothly.
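As a minimal sketch of that supervision layout: the child ids are hypothetical, and plain Agents stand in for the real port and buffer GenServers so the example is self-contained.

```elixir
defmodule StreamSupervisor do
  use Supervisor

  def start_link(opts), do: Supervisor.start_link(__MODULE__, opts)

  @impl true
  def init(_opts) do
    children = [
      # Stand-ins for the real workers; each Agent just holds a value.
      %{id: :audio_buffer, start: {Agent, :start_link, [fn -> <<>> end]}},
      %{id: :ffmpeg_port, start: {Agent, :start_link, [fn -> :port end]}}
    ]

    # :one_for_one restarts only the child that crashed, so a crash of
    # the port process leaves the buffer (and its state) untouched.
    Supervisor.init(children, strategy: :one_for_one)
  end
end
```

With `:one_for_one`, killing the `:ffmpeg_port` child brings up a new process under the same id while `:audio_buffer` keeps its pid.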

For even more reliability, I have the application running clustered across four locations in the US, EMEA, and APAC using libcluster[^2]. The stream supervisor is started under a Horde.DynamicSupervisor[^3] with a custom distribution strategy. The strategy prefers the region closest to the company HQ, but if it goes down, the processes will be restarted in another region.
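The selection logic behind that kind of strategy boils down to "first preferred region that still has a live node". The sketch below shows only that idea in plain Elixir; it is not Horde's API, and the module name, region atoms, and node names are hypothetical. A real strategy would wrap this kind of logic in a module implementing Horde's distribution strategy behaviour.

```elixir
defmodule RegionPreference do
  # `nodes_by_region` maps a region to its currently available nodes.
  # `preference` lists regions from most to least preferred.
  def choose(nodes_by_region, preference) do
    Enum.find_value(preference, {:error, :no_nodes_available}, fn region ->
      case Map.get(nodes_by_region, region, []) do
        [node | _] -> {:ok, node}
        # Returning nil makes find_value move on to the next region.
        [] -> nil
      end
    end)
  end
end
```

If every node in the preferred region is gone, the next region in the list is picked, which is the failover behaviour described above.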

[^1]: https://hexdocs.pm/elixir/1.13.4/Port.html#module-zombie-ope...

[^2]: https://github.com/bitwalker/libcluster

[^3]: https://github.com/derekkraan/horde

mjfisher · on July 27, 2023

Absolutely fantastic write-up - thank you so much. I will go away and do some further reading!

davidw · on July 24, 2023

https://github.com/saleyn/erlexec is pretty good for handling external processes. The builtins aren't quite there if you have more complex use cases.