WebRTC for the Curious (2020) (webrtcforthecurious.com)
205 points by thunderbong on Jan 5, 2024 | 53 comments


This is a really nice overview of the nuances of WebRTC.

If you want to try this tech at a higher level, I maintain an experimental hobby project that attempts to abstract all the complicated parts of WebRTC into a simple API. Most importantly, it obviates the need to run your own signaling server by piggybacking on various public protocols like torrent trackers or IPFS to match peers. It's well-suited for quick projects/prototypes.

https://github.com/dmotz/trystero/


Trystero is magic. I am baffled that it doesn’t have thousands of stars on GitHub. I’ve used it in several projects and it Just Works. Thank you for the incredible library dmotz!


Woah! This is very cool. I don't think I have seen it before.

This will be really useful to establish a connection in adverse situations. Try all the different ways until you can get through :)


I haven't found a library that makes me feel comfortable with using WebRTC for random little projects. Setting up TURN/STUN is annoying, and I've actually never successfully set up a server/client model (it seems like it'd be useful to have the fast UDP-like stuff, despite WebRTC mostly being about client/client communication).


> despite WebRTC mostly being about client/client communication

This is actually kind of a misconception, though it’s an understandable one given that WebRTC is almost always pitched as a peer-to-peer protocol.

In practice, most people using WebRTC for video are sending their video to a server, not directly to another client. It’s pretty safe to assume that most people who use your app are going to need TURN, and at that point, you’re not really doing peer-to-peer at all, so you might as well just have your browser-based app talk to a server that’s pretending to be another browser.

These servers (called Selective Forwarding Units or SFUs) can operate like a TURN server in the case of a one-on-one call, but they can also multiplex everyone’s feeds in the case of a larger conference (peer-to-peer 5 person calls would require each participant to send 4 copies of their video) and often have extra features like the ability to record calls, transcode streams or convert to other protocols.
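
To make the stream-count arithmetic above concrete, here is a small sketch. The function names are made up for illustration, and it ignores things like simulcast, where a client may upload several qualities of the same video to the SFU.

```python
def mesh_uploads_per_participant(participants: int) -> int:
    """Full mesh: each participant sends one copy to every other participant."""
    return max(participants - 1, 0)


def sfu_uploads_per_participant(participants: int) -> int:
    """SFU: each participant sends one copy to the server, which fans it out."""
    return 1 if participants > 0 else 0
```

For a 5 person call, `mesh_uploads_per_participant(5)` gives 4 uploads per participant, while `sfu_uploads_per_participant(5)` stays at 1 no matter how large the call gets.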

The one I’ve used a lot is called Janus[0], it’s open source and has good docs, I recommend people check it out if they’re interested in getting deeper into WebRTC or other video streaming tech.

[0] https://janus.conf.meetecho.com


Another open-source SFU I've had great experience with is Livekit[0]. Great docs, modern, easy to deploy (it's a golang binary), and supports a number of egress options too if you want to record the media during a stream to an external system. With their cloud product they've also built a really cool 'mesh-based' SFU-CDN that allows peers to connect to an SFU closest to them and have their media broadcast to other SFUs internal to their network, which allows for easy scaling during broadcast-style use-cases.

[0] https://livekit.io/


I set up a simple website to stream two webcams in my chicken coop using Janus / WebRTC: https://github.com/dbrgn/chicken-coop/tree/main/rpi-image See README for a quick overview.

Another interesting SFU library is MediaSoup: https://mediasoup.org/


Janus really is one of those projects you can't believe is OSS. Great work.


I don’t remember where I read this (think it was some published paper) but I was building some audio streaming thing on top of WebRTC and there was an estimate that 60% of people would be able to do p2p.


I can definitely second Janus. At a previous job, we used it for video calls/streaming, mixed with FFmpeg for some transformations along the way. Really reliable stuff.


What libraries have you used? There are a lot these days, maybe you haven't tried one yet that matches the way you think/code :)

* https://github.com/aiortc/aiortc (Python)

* GStreamer’s webrtcbin (C)

* https://github.com/shinyoshiaki/werift-webrtc (Typescript)

* https://github.com/pion/webrtc (Golang)

* https://github.com/webrtc-rs/webrtc (Rust)

* https://github.com/algesten/str0m (Rust)

* https://github.com/awslabs/amazon-kinesis-video-streams-webr... (C/Embedded)

* https://github.com/sepfy/libpeer (C/Embedded)

* https://webrtc.googlesource.com/src/ (C++)

* https://github.com/sipsorcery-org/sipsorcery (C#)

* https://github.com/paullouisageneau/libdatachannel (C++)

* https://github.com/elixir-webrtc (Elixir)

See https://github.com/sipsorcery/webrtc-echoes for examples of some running against each other.


Whoa thanks!


Haven't used it yet, but if this library built on Cloudflare Workers can do for webrtc what PartyKit.io does for websockets (again via Cloudflare Workers), then I expect it'll be a pleasure to use :)

https://github.com/gfodor/p2pcf


Pion (https://github.com/pion/webrtc) works well and offers a good set of features.


I think cloudflare has TURN servers now? Might make it easier.


They made an announcement 2-3 years ago, but I've never seen anything released by them on that front.


it exists & it’s very good.

send a note to renan@.


I can recommend mediasoup: https://mediasoup.org/


I use PeerJS for a p2p browser based app that lets friends share a live AI session together.

HaveWords.ai


Related:

Show HN: Learn how WebRTC actually works. A book on the protocols, not just APIs - https://news.ycombinator.com/item?id=24323589 - Aug 2020 (72 comments)

Show HN: WebRTC for the Curious – Go Beyond the APIs - https://news.ycombinator.com/item?id=24283943 - Aug 2020 (1 comment)


> and will not be talking about any software in particular

If you want to do something with WebRTC, definitely use libwebrtc, and definitely get acquainted with the field trials, because that's how Google sneaks in all their best functionality.


If you want to do servers or embedded you shouldn’t. It makes sense to use in the places it was designed for though! If you are shipping a client and have needs that match its creators it’s the best choice.

What features of libwebrtc in particular would you like to see in other implementations? I am very excited to see a future with [0] on the client side.

[0] https://github.com/algesten/str0m


I tried reading that a few times but don't really understand it. Can you put it in terms of what I can't do without it, or can do with it? (I do have some long-reaching WebRTC background, but it's pretty surface level and increasingly outdated.)


I work on str0m along with my colleagues at Lookback. We use it to build something akin to an SFU (Selective Forwarding Unit), i.e. a service that forwards media between many clients in a conference setting. There are other people using it for the same purpose. I think there are some folks trying it out for client-to-client use cases; the difference there is that more is required of the ICE agent to find a network path between the clients.

str0m doesn't deal with encoding or decoding media at all, but if you provide those capabilities yourself it should be possible to use it in e.g. a mobile app setting as part of a client implementation. This use case most likely has a few rough edges at the moment though.

If you, or anyone else, has further questions don't hesitate to join the Zulip chatroom and ask away.


The problem with libwebrtc is that it is really built to decode video frames being received from the network. Also the APIs are intended to accept full video frames to be encoded before being sent to the network.

This is great for a browser!

Not so much for an SFU or application that needs to work at the elementary stream (not uncompressed frame) level.


An interesting take for sure. Having worked with libwebrtc on multiple occasions I've found it pretty hard to separate from the Chrome build toolchain. Especially if you're doing server work there are pure Go and pure Rust implementations that work great.


How hard is it to implement a WebRTC peer from scratch (from socket calls)? If I don't want to support all profiles, let's say just unrelayed unencrypted data streams?

Is it "weekend project" or is it "reading the RFCs alone will take weeks -project"?


I think you can do it in a weekend, maybe two! You are going to implement the simplest parts. This is what it would look like. It also depends on what libraries you have available.

# Accepting an Offer

Parse the Session Description (remote's Offer). Grab the relevant values (ice username+password and certificate fingerprint)
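
A minimal sketch of this step, assuming you only care about the three attributes named above. The function name `parse_offer` is made up for illustration, and it takes the first occurrence of each attribute rather than tracking which media section it belongs to.

```python
def parse_offer(sdp: str) -> dict[str, str]:
    """Extract the ICE credentials and DTLS fingerprint from an SDP blob."""
    wanted = ("ice-ufrag", "ice-pwd", "fingerprint")
    found: dict[str, str] = {}
    for line in sdp.splitlines():
        line = line.strip()
        if not line.startswith("a="):
            continue  # only attribute lines are interesting here
        key, sep, value = line[2:].partition(":")
        if sep and key in wanted and key not in found:
            found[key] = value
    missing = [k for k in wanted if k not in found]
    if missing:
        raise ValueError(f"offer is missing attributes: {missing}")
    return found
```

The fingerprint value comes back as the hash name followed by the colon-separated hex digest, e.g. `sha-256 AA:BB:...`, exactly as it appears in the Offer.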

# Generate your local state

Listen on a random UDP port for ICE. Generate a local ICE Username+Password. Generate a certificate for DTLS.
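
Roughly, using only the Python standard library. The helper names are made up for illustration, and certificate generation is left out because it needs a third-party crypto library.

```python
import secrets
import socket
import string

# ICE allows letters, digits, "+" and "/"; this sketch sticks to letters and digits.
ICE_CHARS = string.ascii_letters + string.digits


def random_ice_string(length: int) -> str:
    """Return a random string suitable for an ICE username fragment or password."""
    return "".join(secrets.choice(ICE_CHARS) for _ in range(length))


def create_local_state() -> tuple[socket.socket, str, str]:
    """Bind a UDP socket on an OS-chosen port and generate ICE credentials."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("0.0.0.0", 0))  # port 0 asks the OS for any free UDP port
    ufrag = random_ice_string(4)  # ICE requires at least 4 characters
    pwd = random_ice_string(22)  # ICE requires at least 22 characters
    return sock, ufrag, pwd
```

`sock.getsockname()[1]` gives you the port the OS picked, which is the one you advertise in the Answer.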

# Generate your Answer

Create an Answer that includes your IP, Port, ICE Username, ICE Password and Certificate Fingerprint.
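
A sketch of just the attribute lines this step mentions, not a complete SDP Answer (a real one also needs the session and media lines). The function name and the foundation value `1` are placeholders for illustration.

```python
def answer_attributes(ip: str, port: int, ufrag: str, pwd: str, fingerprint: str) -> list[str]:
    """Build the ICE/DTLS attribute lines for an Answer with one host candidate."""
    # Host candidate priority from the ICE formula: type preference 126,
    # local preference 65535, component ID 1.
    priority = (126 << 24) + (65535 << 8) + (256 - 1)
    return [
        f"a=ice-ufrag:{ufrag}",
        f"a=ice-pwd:{pwd}",
        f"a=fingerprint:sha-256 {fingerprint}",
        f"a=candidate:1 1 udp {priority} {ip} {port} typ host",
    ]
```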

# Connecting over ICE

Process the inbound STUN packets. Authenticate them with the remote ICE Username+Password. Respond with your local Username and Password.
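
As a starting point, this sketch only recognises the fixed 20-byte STUN header. It does not check the USERNAME or MESSAGE-INTEGRITY attributes, which is where the authentication described above actually happens. The function name is made up for illustration.

```python
import struct

STUN_MAGIC_COOKIE = 0x2112A442
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101


def parse_stun_header(packet: bytes) -> tuple[int, int, bytes]:
    """Return (message type, attribute length, transaction ID) or raise ValueError."""
    if len(packet) < 20:
        raise ValueError("a STUN header is 20 bytes")
    msg_type, length, cookie = struct.unpack("!HHI", packet[:8])
    if msg_type & 0xC000:
        raise ValueError("the top two bits of a STUN message must be zero")
    if cookie != STUN_MAGIC_COOKIE:
        raise ValueError("bad magic cookie")
    if length % 4 or len(packet) != 20 + length:
        raise ValueError("bad message length")
    return msg_type, length, packet[8:20]
```

The transaction ID returned here has to be echoed back in the Binding Success response so the remote side can match it to its request.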

# DTLS Handshake

Perform a DTLS handshake over the connection established via ICE. Make sure the remote certificate matches the one in the Offer.
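
The fingerprint check can be sketched like this, assuming your DTLS library hands you the peer certificate as DER bytes and the Offer used `sha-256`. The function name is made up for illustration.

```python
import hashlib
import hmac


def fingerprint_matches(cert_der: bytes, sdp_fingerprint: str) -> bool:
    """Compare a DER certificate against an SDP value like 'sha-256 AB:CD:...'."""
    algorithm, _, expected = sdp_fingerprint.strip().partition(" ")
    if algorithm.lower() != "sha-256":
        raise ValueError(f"unsupported hash in this sketch: {algorithm}")
    digest = hashlib.sha256(cert_der).hexdigest().upper()
    actual = ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))
    return hmac.compare_digest(actual, expected.strip().upper())
```

If this returns False you should tear the connection down, since the certificate is the only thing tying the DTLS session to the Offer you received over signaling.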

# SCTP Association

Over DTLS, create an SCTP Association. You can now exchange data streams!

-----

If you have an SDP, ICE, DTLS and SCTP library this should be ~250 lines of code. If you are missing any of those you will have to do a little more work. SDP and ICE could be done in a weekend. SCTP could be done in two weekends. DTLS I found a lot harder!

All of these estimates are for the simplest implementation. You could have years of work ahead of you depending on how far you want to take it :)


Yeah, that makes me want to stay away from this. Even for you, who obviously knows more about ICE and SCTP than I do, this is a lot of components.


You could totally do it, but if it doesn’t sound interesting I get it :)

WebRTC is made up of composable parts which I really appreciate. Makes it seem more complicated though.


Given how few implementations there are, I would wager that it is the latter rather than the former.


What language/ecosystem doesn't have an implementation where you want one?


I still have PTSD from Chrome blocking sleep on my Mac. Checking the power management status it was always "WebRTC has active peer connections". The only solution was to turn it off completely in Chrome's settings.

I don't know if it was a brain dead decision at Google or in the design of this protocol, but I'm associating it with lack of respect for the user.


Very nice, I've been waiting the longest time for something like this. I feel that WebRTC needs books like this, it's such an underrated technology.


I wish we replaced webrtc with something more modern and easy to use.


What about WebRTC do you find hard? I can connect two browsers and start a video call in ~50 lines of JS.

As you need more features/knobs it does get harder. I think that is just a reflection of how difficult the problem is of real-time communication though.


Video conferencing seems to be the most wanted app, but setting up an SFU involves too many parts, it's hard to make it work with all the most popular devices, and I'm not sure you can control the exact video resolution/bitrate. Mediasoup seems to have the most complete server app, but making the whole thing work right is still a lot of work. It seems like video meetings should be easy by now; it was easier when Flash was around.


> I'm not sure you can control the exact video resolution/bitrate

You are right, WebRTC doesn't give you control of this. It is on purpose though. If a webpage could push more bitrate than the link supported, it would cause congestion collapse or be used for DoS attacks. You can request a resolution/bitrate, but it will be best effort.

> it was easier when Flash was around

I didn't really use Flash (I was on Linux), so I can't speak to that. It was always buggy/hard to set up. As a Linux user, though, I am super grateful for WebRTC!


The only thing I know about WebRTC is that I need to "Prevent WebRTC from leaking local IP addresses" in uBlock. This extremely limited experience has made WebRTC in my mind into a privacy risk. So thanks for this.


This has been addressed with mDNS host candidates in lieu of local IP host candidates.
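
To see this for yourself, you can look at the address field of the host candidates your browser produces. A sketch, with made-up function names, that checks whether a candidate line carries a `.local` mDNS name instead of an IP address:

```python
def candidate_address(candidate_line: str) -> str:
    """Return the address field of an ICE candidate line."""
    line = candidate_line.strip()
    if line.startswith("a="):
        line = line[2:]
    # candidate:<foundation> <component> <transport> <priority> <address> <port> typ <type>
    fields = line.split()
    if len(fields) < 8 or not fields[0].startswith("candidate:") or fields[6] != "typ":
        raise ValueError("not an ICE candidate line")
    return fields[4]


def is_mdns_candidate(candidate_line: str) -> bool:
    """True when the candidate hides the local IP behind an mDNS hostname."""
    return candidate_address(candidate_line).lower().endswith(".local")
```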


It’s unfortunate that for browser based multiplayer gaming it’s not feasible to use something like WebRTC and we must stick to TCP based connections.


I am pretty darn sure it is feasible:

https://github.com/nathhB/nbnet

In fact, I am on the hunt for a minimal C library to do exactly this for a hobby project and nbnet is the best candidate I have found so far. The only drawback so far is that I would love to have voice support and nbnet (rightfully so to keep it simple) does not provide it and I would have to roll it on my own.


Really? WebRTC DataChannels seem like they could fulfill your needs. What is it that prevents using something like WebRTC Data Channels and forces you to use TCP based connections?


Agreed. I was using WebRTC data channels for a messaging app and it's great. The data sent could simply be multiplayer coordinates, actions, etc. instead.


Oh, you’re right!


So when I looked into it last month, it's true that DataChannels would be the answer. But I couldn't find any "nice" abstraction for it in the language I'm using. I just can't get a standalone WebRTC DataChannels abstraction: many WebRTC libraries don't implement it, or they are very specialized for audio transfer. There is the route of using a standalone WebRTC gateway service that your application can send messages to (I think this is the purpose of Janus?). Point is: it's not easily done. Not in Elixir so far, anyway.

And the second part of it is: I'm sure it's easily done, but at this point there is some incentive to keep whatever implementation you have to yourself.

Or maybe I missed something and I should try looking into it again.


There is an Elixir version of WebRTC available. It is still early days, but your feedback/asks would be monumental in making it better!

https://github.com/elixir-webrtc


It is a minefield you’re right. I just implemented my own version of it with trial and error.


Agreed. I really looked into whether WebRTC would be possible for the game I'm trying to make, but currently it's not as easy as people make it out to be. Fortunately, the communication channel is easily abstracted, so I wait for the day I finally get the abstraction I want.


There's also WebTransport


The artifacts from UDP packet loss are sooo annoying with WebRTC.


Sounds like either NACKs are not getting through to indicate a need to retransmit, or you should use a forward error correction (FEC) extension.
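
For context on the NACK side: the receiver notices a gap in the 16-bit RTP sequence numbers and asks for the missing packets again. A toy sketch of that gap detection (the function name is made up, and a real receiver also handles reordering and jitter buffers):

```python
def missing_sequence_numbers(last_seen: int, new_seq: int) -> list[int]:
    """Return the 16-bit RTP sequence numbers skipped between two packets."""
    gap = (new_seq - last_seen) & 0xFFFF
    if gap == 0 or gap >= 0x8000:
        return []  # duplicate, or an old/out-of-order packet
    return [(last_seen + i) & 0xFFFF for i in range(1, gap)]
```

If those requests or the retransmissions get lost too, the decoder has to carry on without the data, which is where the visible artifacts come from.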


I did not know that! Thank you so much!



