Mediasoup – WebRTC video conferencing (github.com/versatica)
125 points by simonpure on May 18, 2020 | 48 comments



How does mediasoup compare to the Pion project in terms of advantages, performance, and effort to build a simple conference app? Why would I choose one over the other?


I've looked at WebRTC a few times, and every time I've been overwhelmed by the complexity of the protocol. I understand video streaming is hampered by codec patents, etc., but in 2020 the situation seems to be getting better with open source codecs unencumbered by patents (VP8, VP9, AV1, etc.). Why is the best we have still WebRTC? Seems it could be simplified as a protocol. Is this inherently hard or just a result of being designed by a committee?


It's way too complicated. I suspect it's some design by committee with large backer interests interfering.

WebRTC is a collection of underlying protocols: SDP, STUN, DTLS, RTP, SCTP. At a superficial glance it seems to make sense: each of these RFCs provides some aspect needed in WebRTC.

However, these standards are from a naive, happy time when the internet was open and routable, which means only some subset of said standards is needed. The main WebRTC RFC fails to pin down which subset, so it's down to the implementations to find some happy subset that works.

Trying to implement it is so frustrating. At every corner you follow links to some underlying RFC, start reading and coding only to realize "is this even used?!"

SDP is maybe the single worst thing in this mess. It's a terrible flat-file description of structured data, organized differently depending on "plan-b" or "unified". It would be super easy to convey what SDP tries to convey in any purpose-built format.
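To make the "flat file describing structured data" point concrete, here is a minimal sketch in TypeScript that groups SDP lines into a session part and per-media sections. The names `parseSdp`, `ParsedSdp` and `MediaSection` are made up for this illustration, the sample SDP is a hand-written toy, and the parser only understands `m=` and `a=` lines; it is nowhere near a complete SDP implementation.

```typescript
// Hypothetical types for this sketch; not taken from any WebRTC library.
interface SdpLine { type: string; value: string; }
interface MediaSection {
  kind: string;        // "audio", "video", ...
  port: number;
  protocol: string;
  formats: string[];   // payload types listed on the m= line
  attributes: string[]; // raw a= values belonging to this section
}
interface ParsedSdp { session: SdpLine[]; media: MediaSection[]; }

function parseSdp(sdp: string): ParsedSdp {
  const parsed: ParsedSdp = { session: [], media: [] };
  let current: MediaSection | undefined;
  for (const raw of sdp.split(/\r?\n/)) {
    const line = raw.trim();
    // Every SDP line has the shape "<single letter>=<value>".
    if (line.length < 2 || line.charAt(1) !== "=") continue;
    const type = line.charAt(0);
    const value = line.slice(2);
    if (type === "m") {
      // An m= line opens a new media section; everything after it belongs to it.
      const parts = value.split(" ");
      current = {
        kind: parts[0] ?? "",
        port: Number(parts[1] ?? "0"),
        protocol: parts[2] ?? "",
        formats: parts.slice(3),
        attributes: [],
      };
      parsed.media.push(current);
    } else if (current === undefined) {
      // Lines before the first m= line are session-level.
      parsed.session.push({ type, value });
    } else if (type === "a") {
      current.attributes.push(value);
    }
    // Other media-level lines (c=, b=, ...) are ignored in this sketch.
  }
  return parsed;
}

// Toy input, written by hand for the example.
const sampleSdp = [
  "v=0",
  "o=- 0 0 IN IP4 127.0.0.1",
  "s=-",
  "t=0 0",
  "m=audio 9 UDP/TLS/RTP/SAVPF 111",
  "a=mid:0",
  "a=rtpmap:111 opus/48000/2",
  "m=video 9 UDP/TLS/RTP/SAVPF 96",
  "a=mid:1",
  "a=rtpmap:96 VP8/90000",
].join("\r\n");

const parsedSample = parseSdp(sampleSdp);
```

The structure (which attributes belong to which media section) is only implied by line order in the flat text, which is the part that a purpose-built nested format would state explicitly.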

On a conceptual level there are too many abstractions in the API. MediaStream and RTPTransceiver are my two pet peeves. MediaStream is maybe nice in client code to group some tracks together into a player, but the abstraction should stay on that level. RTPTransceiver is just beyond me. Why do I want it? How does this help?


It looks like RIPT[0] is people's answer to WebRTC's complexity.

I personally like WebRTC. Maybe just Stockholm syndrome though :) I see everyone saying QUIC is the answer, but all the complexity scares me. I imagine in 5 years everyone will miss how WebRTC is built with small building blocks.

WebRTC also bridges with a lot of existing telephony stuff, which is nice! Since it talks RTP/SRTP, I see a lot of people wiring it up to their older systems, which is kind of cool!

[0] https://tools.ietf.org/html/draft-rosenbergjennings-dispatch...


You complain about SDP and about RTPTransceiver. But RTPTransceiver comes from ORTC, which was introduced exactly to avoid messing with SDP all the time. You need it to specify simulcast layers; otherwise you don't need to use it.


I think you're misunderstanding what WebRTC is. It's only loosely related to specific video codecs. It can carry H.264, VP8, and VP9.


I think the main problem is that WebRTC started from VoIP standards. That's why we're stuck with SIP and thousands of RFCs that seem very daunting for a newcomer to WebRTC. I guess that's also why so many WebRTC developers seem to come from Spain (they were always big in the VoIP world with companies like Telefonica).

It's getting slowly better since the standard introduced concepts from ORTC.


I thought that this news was about mediasoup, not about generic WebRTC questions.


Can someone explain how mediasoup differs from the RTCPeerConnection object (and related events) discussed in this Mozilla WebRTC tutorial [0]? Given that it uses C++, I figure that mediasoup is more than just a wrapper around this? Any usability or reliability benefits of either?

[0] https://developer.mozilla.org/en-US/docs/Web/API/WebRTC_API/...


mediasoup (server side, so the Node + C++ component you mean) does not implement "RTCPeerConnection". That's just needed for browsers. In mediasoup we don't use SDP but specific RTP parameters as defined here:

https://mediasoup.org/documentation/v3/mediasoup/rtp-paramet...

If you want to know why we don't use SDP (as a means of communication between client and server), here is a good read: https://webrtchacks.com/webrtc-sdp-inaki-baz-castillo/


Ah, okay, whenever I think WebRTC I assume p2p with no server, but I am now actually reading into what an SFU is, etc. Makes sense. Thanks for the pointers to these resources.


What kind of hardware would I need to set this up to run a private video chat server for, say, 10 users?


10 users? The cheapest one.


What advantages would this offer over Jitsi Meet?


Jitsi Meet is a conferencing app. You likely mean Jitsi Videobridge. That's the SFU part and is comparable to mediasoup.

Mediasoup has a somewhat more modern codebase and offers a rather low-level framework to build your own SFU, whereas Jitsi Videobridge is more of a ready-to-go SFU, but less flexible.

Mediasoup has very good Node bindings, which may or may not be an advantage to you.

They offer similar (good) performance, although Mediasoup has a slight edge here. They're both very actively being kept up to date with the latest standards (in contrast with Kurento, which is now as good as dead after Twilio bought the team). This is very important since both the spec and the browser implementations are fast-moving targets.

The disadvantage of mediasoup is that it is mainly maintained by just one or two people and not yet used as much as Jitsi, so it's a bit of a gamble to start building your product on top of it.


Yep, two active developers, but since it is just a set of libraries that's good enough. We also get nice contributions (C++ fixes and optimizations) via PRs on GitHub. And we use mediasoup in different commercial products.


Kurento is not dead. The main contributions now go to the OpenVidu project, which is in a sense part of Kurento (under another name).


Jitsi is a full application (web app, backend servers) with a specific use case: meetings (similar to Zoom or Google Meet).

mediasoup is not an application but a set of low-level server and client libraries for building any kind of audio/video application (not just meetings). You don't "install mediasoup and configure it". You create your Node app and integrate mediasoup as you do with any other NPM dependency. Same on the client side. More here:

https://mediasoup.org/documentation/overview/

Of course, this means that you must build your application, including UI, client-server signaling, etc etc.


> ...build whichever kind of audio/video applications...

Is live streaming a use case that's in scope? If so, can clients connect P2P, or is a relay server required for all traffic?


Well, it's WebRTC, so the main use case is live video streaming. But one would need to define 'live'. WebRTC is really made for sub-second latency, which you need for conversations. If you don't require this, you're better off using HLS streaming, because achieving ultra-low latency does come with tradeoffs in complexity and quality.

WebRTC is peer-to-peer, but that doesn't work if you have a lot of peers. That is where an SFU like mediasoup comes into the picture. It's a kind of relay server, so you can send to many peers while still using WebRTC (thus with ultra-low latency). Also, if the peers are behind restrictive firewalls, direct peer-to-peer may not work either, and you need a TURN server to relay the video.
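A rough back-of-the-envelope sketch of why a full peer-to-peer mesh stops scaling and an SFU helps. It assumes every peer publishes exactly one stream (no simulcast) and every peer receives every other peer's stream; the function names `meshCounts` and `sfuCounts` are made up for this example.

```typescript
interface StreamCounts {
  uploadsPerPeer: number;   // streams each peer has to encode and send
  downloadsPerPeer: number; // streams each peer receives
  serverOutbound: number;   // streams the relay server forwards in total
}

// Full mesh: each peer sends its stream separately to every other peer.
function meshCounts(peers: number): StreamCounts {
  const others = Math.max(peers - 1, 0);
  return { uploadsPerPeer: others, downloadsPerPeer: others, serverOutbound: 0 };
}

// SFU: each peer uploads once; the server fans the stream out to the others.
function sfuCounts(peers: number): StreamCounts {
  const others = Math.max(peers - 1, 0);
  return {
    uploadsPerPeer: peers > 0 ? 1 : 0,
    downloadsPerPeer: others,
    serverOutbound: peers * others,
  };
}

const meshFor10 = meshCounts(10); // 9 uploads and 9 downloads per peer
const sfuFor10 = sfuCounts(10);   // 1 upload and 9 downloads per peer, 90 forwarded by the server
```

Under these assumptions the download side is the same in both setups; what the SFU removes is the per-peer upload cost, which is usually the scarcer resource on home connections, at the price of the server carrying the fan-out.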


PornHub uses mediasoup for live cams.


That’s quite the endorsement.


exactly what I was thinking :)


I think there's nothing stopping you from attempting that, but you would need some pretty complex client software to get a good experience with P2P live streaming.


You can use WebRTC for emitting raw data p2p, not just audio/video streams, right? Or would websockets be preferred if you wanted to, say, broadcast JSON objects to whoever was listening?


Yes, you can use WebRTC DataChannels for sending custom text/binary data on top of an ICE+DTLS connection. BTW mediasoup supports DataChannels.

Any question or comment about mediasoup?


What advantages does sending data over WebRTC have over sending data over websockets?


DataChannels are transmitted over the same UDP/ICE "connection" that is used to transmit audio and video packets. So if you plan to send real-time data (for example: real-time subtitles, metadata related to the current video position, etc.), sending that data over a DataChannel means it reaches the remote side without delay relative to the audio/video. If you use WebSocket to transmit the data, there may be desynchronization between audio/video and data because they use different network paths.
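As an illustration of the subtitle case, here is a small sketch of a JSON message format for cues tagged with a media timestamp. The `SubtitleCue` shape and the `sendCue` / `parseCue` helpers are invented for this example and are not part of mediasoup or the WebRTC API. `TextChannel` is just the one method the sketch needs; in a browser, an `RTCDataChannel` obtained from `RTCPeerConnection.createDataChannel()` would fit it.

```typescript
// Minimal structural subset of a data channel: anything that can send a string.
interface TextChannel { send(data: string): void; }

// Hypothetical message format for this sketch.
interface SubtitleCue { kind: "subtitle"; mediaTimeMs: number; text: string; }

// Serialize a cue and push it over the channel.
function sendCue(channel: TextChannel, mediaTimeMs: number, text: string): void {
  const cue: SubtitleCue = { kind: "subtitle", mediaTimeMs, text };
  channel.send(JSON.stringify(cue));
}

// Validate an incoming message; returns undefined for anything that is not a cue.
function parseCue(data: string): SubtitleCue | undefined {
  let value: unknown;
  try {
    value = JSON.parse(data);
  } catch {
    return undefined;
  }
  if (typeof value !== "object" || value === null) return undefined;
  const record = value as Record<string, unknown>;
  const kind = record["kind"];
  const mediaTimeMs = record["mediaTimeMs"];
  const text = record["text"];
  if (kind !== "subtitle" || typeof mediaTimeMs !== "number" || typeof text !== "string") {
    return undefined;
  }
  return { kind: "subtitle", mediaTimeMs, text };
}

// Stand-in channel that records what was sent, so the sketch runs without a browser.
const sentMessages: string[] = [];
const fakeChannel: TextChannel = { send: (data) => { sentMessages.push(data); } };
sendCue(fakeChannel, 1500, "Hello");
const receivedCue = parseCue(sentMessages[0] ?? "");
```

The receiving side would compare `mediaTimeMs` against the current playback position of the video to decide when to show the text.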


You said it yourself: it can be done peer 2 peer. A websocket will have a server in between.


The demo is nice and clean. How does this compare to Kurento, or Janus?

Edit: I see Kurento is now assumed dead thanks to Twilio buying them, and I understand Janus doesn't provide any client libraries.


Kurento is not dead. New releases are published from time to time. And its companion project OpenVidu provides a lot of features. Mediasoup and Janus are also really good projects.

Disclaimer: I'm OpenVidu project lead.


Is there a specific reason hindering you from publishing a Debian package, or becoming/appointing a Debian packager, so that your product is available to anyone restricted to official Debian package sources?


Anyone can be a Debian packager and publish their package in their own PPA repository. mediasoup is a Node.js library; think of it as another NPM dependency for your project.


Actually Janus does provide a client library. Yeah, Kurento has kind of ceased to evolve, but Janus keeps up with the latest edition of the WebRTC standard, which continues to evolve and improve.


Janus works pretty well. The endpoints are well documented and allowed me to write a small web app in Python for connecting peers through Janus. It's not too complicated.


mediasoup (on the server side) is a Node.js library, i.e. an NPM dependency that you integrate into your Node.js app. Of course it comes with tons of C++ lines but, from the point of view of the user, it's just yet another NPM dependency in your Node.js project.

mediasoup overview here: https://mediasoup.org/documentation/overview/


So there's no documented C++ library to bind other languages through? The server side needs to be Node.js?


If you want that, probably Janus is something worth looking at.


It's a Node library, yes.


It is more like a building block than a one-off video meetings solution.

Kurento is still the easiest to deploy, I think. It's used in OpenVidu.


Has anybody compared MediaSoup with Kurento? https://www.kurento.org/


The team behind Kurento got hired by Twilio some years ago. Now it's basically dead, with only some very minimal maintenance being done.


I was asking about differences in the features they support, and about results from real-life experience.

Also, I don't agree with Kurento being dead (off topic, but hey). It is still being maintained and works perfectly well for modern applications.


Neither is available in Debian repos.


mediasoup is not an application; it is a Node.js library. It's yet another NPM dependency in your Node.js application. No reason to have a DEB package.


Fair point, but there are standard ways to package npm deps in Debian: https://packages.debian.org/buster-backports/npm2deb

For restricted environments limited to what you can get from standard repos, this is the only way to get NPM deps.


Yep, I know. However the effort required to make and maintain DEB/RPM packages for different architectures (mediasoup has a C++ component) is huge.


It's the best WebRTC SFU we have today. Thank you, guys.



