It's in fact using Silero via RealtimeSTT. RealtimeSTT signals when silence starts. Then a binary sentence classification model runs on the realtime transcription text; it infers blazingly fast (~10 ms) and returns a probability between 0 and 1 indicating whether the current spoken sentence is "complete". The turn detection component uses that probability to calculate how long to wait in silence before the turn is considered over.
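Roughly, the logic looks something like the sketch below. This is not the actual implementation, just the idea: `sentence_complete_prob` stands in for the fast binary classifier, and `on_silence_started` / `is_still_silent` are hypothetical hooks standing in for whatever the VAD/RealtimeSTT side gives you.

```python
import time

MIN_WAIT = 0.3   # seconds of silence required when the sentence sounds complete
MAX_WAIT = 1.8   # seconds of silence required when the sentence sounds unfinished


def sentence_complete_prob(text: str) -> float:
    """Placeholder for the ~10 ms binary classifier: probability in [0, 1]
    that `text` is a finished sentence. Real version would be a small model."""
    return 1.0 if text.rstrip().endswith((".", "!", "?")) else 0.2


def silence_wait_time(transcript: str) -> float:
    """Map completeness probability to a silence timeout: the more complete
    the sentence looks, the less silence we demand before ending the turn."""
    p = sentence_complete_prob(transcript)
    return MAX_WAIT - p * (MAX_WAIT - MIN_WAIT)


def on_silence_started(transcript: str, is_still_silent) -> bool:
    """Called when the VAD reports silence. Returns True if the turn is over,
    False if the user resumed speaking before the timeout elapsed."""
    deadline = time.monotonic() + silence_wait_time(transcript)
    while time.monotonic() < deadline:
        if not is_still_silent():   # user started talking again, keep the turn open
            return False
        time.sleep(0.02)
    return True                     # silence outlasted the timeout -> turn is over
```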
This is the exact strategy I'm using for the real-time voice agent I'm building. LiveKit also published a custom turn detection model that, based on the video they released, works really well, which was cool to see.