I've looked at WebRTC a few times, and every time I've been overwhelmed by the complexity of the protocol. I understand video streaming is hampered by codec patents, etc., but in 2020 the situation seems to be getting better, with open source codecs unencumbered by patents (VP8, VP9, AV1, etc.). Why is the best we have still WebRTC? It seems it could be simplified as a protocol... is this inherently hard, or just a result of being designed by committee?
It's way too complicated. I suspect it's some design by committee with large backer interests interfering.
WebRTC is a collection of underlying protocols: SDP, STUN, DTLS, RTP, SCTP. At a superficial glance it seems to make sense; each of these RFCs provides some aspect needed in WebRTC.
However, these standards are from a naive, happy time when the internet was open and routable, which means only some subset of those standards is actually needed. The main WebRTC RFC fails at pinning down which, so it's down to the implementations to find some happy subset that works.
Trying to implement it is so frustrating. At every corner you follow links to some underlying RFC, start reading and coding only to realize "is this even used?!"
SDP is maybe the single worst thing in this mess. It's a terrible flat-file description of structured data, organized differently depending on whether "plan-b" or "unified-plan" is in use. It would be super easy to convey what SDP tries to convey in any purpose-built format.
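To make the "flat file description of structured data" complaint concrete: every SDP line is a single-character type, an equals sign, and an unstructured value, and the nesting (session vs. media section) is implied purely by line order. A minimal sketch of parsing that into an actual structure (the SDP snippet is trimmed and hypothetical, not a full offer):

```javascript
// Sketch: parse SDP's flat "x=value" lines into a structured object.
// Media sections ("m=" lines) implicitly open a new scope; everything
// after an "m=" line belongs to that media section by position alone.
function parseSdp(sdp) {
  const session = { media: [] };
  let current = session;
  for (const line of sdp.split(/\r?\n/)) {
    if (!line) continue;
    const type = line[0];        // single-char field type: "v", "o", "m", "a", ...
    const value = line.slice(2); // everything after "x="
    if (type === 'm') {
      current = { description: value, attributes: [] };
      session.media.push(current);
    } else if (type === 'a') {
      (current.attributes ??= []).push(value);
    } else {
      (current[type] ??= []).push(value);
    }
  }
  return session;
}

const offer = [
  'v=0',
  'o=- 4611731400430051336 2 IN IP4 127.0.0.1',
  'm=audio 9 UDP/TLS/RTP/SAVPF 111',
  'a=rtpmap:111 opus/48000/2',
  'm=video 9 UDP/TLS/RTP/SAVPF 96',
  'a=rtpmap:96 VP8/90000',
].join('\n');

console.log(parseSdp(offer).media.length); // 2
```

The parsed object is essentially what a purpose-built JSON format could have carried directly, which is the point being made above.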
On a conceptual level there are too many abstractions in the API. MediaStream and RTPTransceiver are my two pet peeves. MediaStream is maybe nice in client code to group some tracks together into a player, but the abstraction should stay on that level. RTPTransceiver is just beyond me. Why do I want it? How does this help?
It looks like RIPT[0] is people's answer to WebRTC's complexity.
I personally like WebRTC. Maybe just Stockholm syndrome though :) I see everyone saying QUIC is the answer, but all the complexity scares me. I imagine in 5 years everyone will miss how WebRTC is built with small building blocks.
WebRTC also bridges with a lot of existing telephony stuff, which is nice! Since it talks RTP/SRTP, I see a lot of people wiring it up to their older systems, which is kind of cool!
You complain about SDP and about RTPTransceiver. But RTPTransceiver comes from ORTC, which was introduced exactly to avoid messing with SDP all the time. And you need it to specify simulcast layers; if you don't use simulcast, you don't have to touch it.
I think the main problem is that webRTC started from VoIP standards. That's why we're stuck with SIP and thousands of RFCs that seem very daunting to a newcomer to webRTC. I guess that's also why so many webRTC developers seem to come from Spain (they were always big in the VoIP world, with companies like Telefónica).
It's slowly getting better since the standard introduced concepts from ORTC.
Can someone explain how mediasoup differs from the RTCPeerConnection object (and related events) discussed in this Mozilla WebRTC tutorial [0]? Given that it uses C++, I figure that mediasoup is more than just a wrapper around this? Any usability or reliability benefits of either?
mediasoup (server side, i.e. the Node + C++ component you mean) does not implement RTCPeerConnection; that's only needed in browsers. In mediasoup we don't use SDP but specific RTP parameters, as defined here:
Ah, okay, whenever I think webrtc I assume p2p with no server but I am now actually reading into what SFU is, etc. Makes sense. Thanks for pointer to these resources.
Jitsi meet is a conferencing app. You likely mean jitsi videobridge. That's the SFU part and comparable to mediasoup.
Mediasoup has a bit more modern codebase and offers a rather low-level framework to build your own SFU. Whereas Jitsi videobridge is more of a ready-to-go SFU, but less flexible.
Mediasoup has very good node bindings, which may or may not be an advantage to you.
They offer similar (good) performance, although mediasoup has a slight edge here. Both are very actively kept up to date with the latest standards (in contrast with Kurento, which is now as good as dead after Twilio bought the team). This is very important, since both the spec and the browser implementations are a fast-moving target.
A disadvantage of mediasoup is that it is mainly maintained by just one or two people and is not yet used as much as Jitsi, so it's a bit of a gamble to start building your product on top of it.
Yep, two active developers but being just a set of libraries it's good enough. We also get nice contributions (C++ fixes and optimizations) via PR in GitHub. And we use mediasoup in different commercial products.
Jitsi is a full application (web app, backend servers) with a specific use case: meetings (similar to Zoom or Google Meet).
mediasoup is not an application but a set of low-level server and client libraries for building whatever kind of audio/video application you need (not just meetings). You don't "install mediasoup and configure it". You create your Node app and integrate mediasoup as you do with any other NPM dependency. Same on the client side. More here:
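To give a flavor of the "RTP parameters instead of SDP" approach mentioned above: when you create a mediasoup router you hand it structured codec descriptions rather than SDP text. The object below mirrors the shape of mediasoup's documented `RouterOptions.mediaCodecs`; treat it as an illustrative fragment, not a complete setup (in a real app you'd pass it to something like `worker.createRouter({ mediaCodecs })`).

```javascript
// Sketch: mediasoup-style structured codec configuration.
// Compare this with stuffing the same information into SDP
// "m=" and "a=rtpmap" lines: here it's just plain data.
const mediaCodecs = [
  {
    kind: 'audio',
    mimeType: 'audio/opus',
    clockRate: 48000,
    channels: 2,
  },
  {
    kind: 'video',
    mimeType: 'video/VP8',
    clockRate: 90000,
    parameters: { 'x-google-start-bitrate': 1000 },
  },
];

console.log(mediaCodecs.map((c) => c.mimeType)); // → ['audio/opus', 'video/VP8']
```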
Well, it's webRTC, so the main use case is live video streaming. But one would need to define 'live'. webRTC is really made for sub-second latency, which you need for conversations. If you don't require this, you're better off using HLS streaming, because achieving ultra-low latency does come with tradeoffs in complexity and quality.
webRTC is peer-to-peer, but that doesn't work if you have a lot of peers. That is where an SFU like mediasoup comes into the picture. It's a kind of relay server, so you can send to many peers while still using webRTC (thus with ultra-low latency). Also, if the peers are behind firewalls, peer-to-peer doesn't work either, and you need a TURN server to relay the video.
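Stripped of all the networking, the SFU idea is just selective fan-out: each publisher uploads its stream once, and the server forwards the packets to every subscriber. A toy in-memory sketch of that shape (hypothetical names; a real SFU like mediasoup or Janus does this with RTP packets over UDP, plus per-subscriber bandwidth adaptation):

```javascript
// Toy SFU sketch: the publisher sends each packet once; the server
// fans it out to all current subscribers. This is why N-way calls
// scale: each client uploads 1 stream instead of N-1 streams.
class ToySfu {
  constructor() {
    this.subscribers = new Map(); // id -> packet callback
  }
  subscribe(id, onPacket) {
    this.subscribers.set(id, onPacket);
  }
  unsubscribe(id) {
    this.subscribers.delete(id);
  }
  publish(packet) {
    for (const onPacket of this.subscribers.values()) onPacket(packet);
  }
}

const sfu = new ToySfu();
const received = [];
sfu.subscribe('alice', (p) => received.push(['alice', p]));
sfu.subscribe('bob', (p) => received.push(['bob', p]));
sfu.publish('frame-1'); // uploaded once, delivered to both peers
console.log(received.length); // 2
```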
I think there's nothing stopping you from attempting that, but you would need some pretty complex client software to get a good experience with P2P live streaming.
You can use WebRTC for emitting raw data p2p, not just audio/video streams, right? Or would websockets be preferred if you wanted to, say, broadcast JSON objects to whoever was listening?
DataChannels are transmitted over the same UDP/ICE "connection" that is used to transmit the audio and video packets. So if you plan to send real-time data (for example, live subtitles, or metadata related to the current video position), sending it over a DataChannel means it reaches the remote side without drifting relative to the audio/video. If you use a WebSocket to transmit the data, there may be desynchronization between the audio/video and the data, because they travel over different network paths.
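To illustrate the desynchronization point: if the metadata travels over a separate path such as a WebSocket, the sender has to stamp each message with the media position it belongs to, and the receiver has to buffer messages until playback catches up. A sketch of that receiver-side bookkeeping (all names here are hypothetical, not a real API), which is largely unnecessary when the data shares the media's ICE connection:

```javascript
// Sketch: buffering side-channel subtitles until the player reaches
// the media timestamp they were stamped with. This compensates for
// the separate network path a WebSocket would take.
function makeSubtitleBuffer() {
  const pending = []; // messages of shape { ptsMs, text }
  return {
    onMessage(msg) {
      pending.push(msg);
    },
    // Return the text of every message whose timestamp has been reached.
    dueAt(playbackMs) {
      return pending.filter((m) => m.ptsMs <= playbackMs).map((m) => m.text);
    },
  };
}

const buf = makeSubtitleBuffer();
buf.onMessage({ ptsMs: 1000, text: 'hello' });
buf.onMessage({ ptsMs: 5000, text: 'world' });
console.log(buf.dueAt(2000)); // → ['hello']
```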
Kurento is not dead. New releases are published from time to time. And its companion project OpenVidu provides a lot of features. Mediasoup and Janus are also really good projects.
Is there a specific reason hindering you from publishing a Debian package, or becoming/appointing a Debian packager, so that your product is available to anyone restricted to official Debian package sources?
Anyone can become a Debian package maintainer and publish the package in their own repository. mediasoup is a Node.js library; think of it as just another NPM dependency for your project.
Actually, Janus does provide a client library. Yeah, Kurento has kind of ceased to evolve, but Janus keeps up with the latest edition of the WebRTC standard, which continues to evolve and improve.
Janus works pretty well; the endpoints are well documented and allowed me to write a small web app in Python for connecting peers through Janus. It's not too complicated.
mediasoup (in server side) is a Node.js library or a NPM dependency that you integrate into your Node.js app. Of course it comes with tons of C++ lines but, from the point of view of the user, it's just yet another NPM dependency into your Node.js project.