I am a VoIP developer and my recommendation for WebRTC is to just use any legacy softswitch or IP-PBX. These are more mature software with tons of features, and all of them also support WebRTC.
Most of them have much better performance than Node.js implementations (you can handle more simultaneous clients). Both open-source and commercial software are available with good support:
Asterisk has a pretty colorful security record, and the others from the same era (early-2000s VoIP C/C++ code) are probably in the same league, but have either escaped attention or, on the commercial side, just hush up vulnerabilities.
Author here: The first priority of the project is to offer a solid generic audio-only calling experience to our users. This includes all functionality that people would expect from a softphone. At the same time, we want to build something that is generic and useful to any SIP-over-wss provider in the short term. The build mechanism and module system should accommodate this. It is still under construction (https://github.com/vialer/vialer-js/tree/feature/restructure...), but most of the hardcoded dependencies are already dealt with.
Besides the current SIP Call implementation, we are preparing the codebase for additional signalling implementations as well. The centralized SIP Call implementation is great to use with all the functionalities a PBX offers (queues, on-hold, transfers, being available to any device), but browser-to-browser p2p also has its strong points. Privacy-aware chat, video and file transfer seem easier to implement and maintain with nothing more than JavaScript and another browser as a backend. The signalling part will be a bit of a design challenge. Ideally, this would be p2p and encrypted too, so people don't need to rely on one server/provider to find their contacts and set up connections.
> The core of the software is designed around the notion of a generic 'Call' abstraction, that allows it to be flexible about using different technology stacks
This quote is from the website linked from the GitHub repository. I skimmed the code to try to find the abstractions to switch out the technology stack, but could not find anything. Could you provide some pointers?
I need to change that description, because it is premature at the moment. The main Calls handling code is in https://github.com/vialer/vialer-js/blob/develop/src/js/bg/m... and still relies on SIP. It is a small step to add the required abstractions for alternative signalling mechanisms though, because most of the SIP-related logic is centralized already.
It is a proprietary API-based click-to-dial calling method (which still needs to be moved to its own repo) that doesn't use SIP or WebRTC, but uses the same call flow nevertheless.
At the moment, most of the focus is on stabilizing the SIP-based softphone. We're about to complete a first working open-source build without proprietary modules this week.
The Calls-related code will be dealt with when there is an alternative signalling mechanism available. It would be interesting to investigate existing networking solutions like Matrix, but also to experiment a bit with p2p overlay networks ourselves. I did some prior R&D on the subject (https://wearespindle.com/articles/end-to-end-encryption-betw...) to tackle issues like ECDH encryption with WebCrypto, but there are still a lot of uncertainties and unhandled topics that need to be addressed. Work in progress :-)
At first blush, Vialer seems to be softphone-centric: audio only, mostly peer to peer, softphone features.
These others target one to many video streaming.
The biggest difference with the alternatives you mention is that Vialer seems to be an actual application, while the others are lower-level infrastructure / frameworks to build such an application.
I did a comparison of these some time ago and here's what I remember:
There are 3 ways of doing WebRTC:
- Mesh -> everybody sends to and receives from every other participant. -> Needs almost no infrastructure, but obviously doesn't scale well.
- SFU -> acts as a relay, so people only need 1 upload, but still download everybody's stream -> Needs some infrastructure, scales better than mesh but is still limited by the number of downloads per participant.
- MCU -> makes compositions in the cloud so people only have 1 upload and 1 download -> Very expensive in terms of infrastructure, and that infrastructure is hard to scale, but it means the least traffic and processing for participants.
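To make the scaling difference concrete, here is a small sketch that counts the media streams each participant sends and receives under the three topologies. It assumes every participant publishes exactly one stream (no simulcast, no screen sharing), and the function name is made up for illustration.

```python
from typing import Tuple


def streams_per_participant(topology: str, n: int) -> Tuple[int, int]:
    """Return (uploads, downloads) for one participant in an n-person call.

    Assumption: each participant publishes exactly one media stream.
    """
    if n < 2:
        return (0, 0)  # nobody to talk to
    if topology == "mesh":
        # Send a copy to, and receive from, every other participant.
        return (n - 1, n - 1)
    if topology == "sfu":
        # One upload to the relay, one download per other participant.
        return (1, n - 1)
    if topology == "mcu":
        # One upload, one composed download.
        return (1, 1)
    raise ValueError(f"unknown topology: {topology}")


for topology in ("mesh", "sfu", "mcu"):
    up, down = streams_per_participant(topology, 6)
    print(f"{topology}: {up} up, {down} down")
```

With 6 participants this prints 5 up / 5 down for mesh, 1 up / 5 down for SFU, and 1 up / 1 down for MCU.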
These seem to be the most well-known solutions:
- Kurento -> was very complete, but is more or less dead since Twilio bought the team (they had some recent activity, but are really lagging behind now).
- Janus -> a set of building blocks which can be used to build an SFU or MCU.
- mediasoup -> SFU with Node bindings. Was very new when I did my comparison, no idea how mature it is now.
- Jitsi -> Very nice SFU, bought by Atlassian, but they still put a lot of work in the open-source version (contrary to what Twilio did with Kurento). Highly recommended if you want to deploy your own SFU.
- Intel WebRTC -> both SFU and MCU, but documentation is limited and it specifically targets Intel platforms (originally based on 'licode', which is yet another alternative).
Next to that you'll also need TURN and STUN servers if you want to deal with any business networks (coturn seems to be the go-to if you need a TURN server).
CPaaS:
Despite the 'open' in the name, OpenTok is a closed platform from the leading WebRTC provider (TokBox). They provide the server infrastructure (usually SFU and TURN servers).
Some alternatives are Twilio Video, Vidyo, Temasys.
We eventually went for a CPaaS (TokBox), happy with them so far. We also use Janus for some customer-specific integrations.
Note that this summary is very, very limited. You may also care about a lot of other stuff, such as: SIP integrations, broadcasting to HLS or RTSP, recording/archiving, security practices, GDPR compliance, licenses, browser support, SLAs, ...
Does signalling phone home? Would be neat to do serverless signalling. It might be doable by putting the offer in the URL (and letting people text it to each other), or maybe packing the offer in a QR code, or maybe using a DHT inside of JS. I have been toying w/ the IPFS JS impl, and have successfully connected to the DHT and done lookups over WebSockets, but I would assume that the WebRTC transport would allow you to participate in the DHT.
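A rough sketch of the "offer in the URL" idea: compress the offer, base64url-encode it, and stick it in the URL fragment so it never reaches a server. The offer here is a made-up dict in the `{type, sdp}` shape that browsers use for session descriptions, and `https://example.org/call` and both function names are placeholders, not anything from Vialer.

```python
import base64
import json
import zlib


def offer_to_url(offer: dict, base_url: str) -> str:
    """Pack a session description into the URL fragment."""
    raw = json.dumps(offer, separators=(",", ":")).encode("utf-8")
    token = base64.urlsafe_b64encode(zlib.compress(raw, 9)).decode("ascii")
    # Strip '=' padding to keep the link short; it is restored when decoding.
    return f"{base_url}#{token.rstrip('=')}"


def offer_from_url(url: str) -> dict:
    """Recover the session description from a link made by offer_to_url."""
    token = url.split("#", 1)[1]
    padded = token + "=" * (-len(token) % 4)
    raw = zlib.decompress(base64.urlsafe_b64decode(padded))
    return json.loads(raw.decode("utf-8"))


# Hypothetical, heavily shortened offer for illustration only.
offer = {"type": "offer", "sdp": "v=0\r\no=- 1 2 IN IP4 127.0.0.1\r\ns=-\r\nt=0 0\r\n"}
link = offer_to_url(offer, "https://example.org/call")
print(link)
print(offer_from_url(link) == offer)
```

The answer would have to travel back the same way, and all ICE candidates would need to be gathered before encoding, since there is no channel for sending them afterwards.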
While I'm not aware of any serverless signalling solution that includes support for stuff like Trickle ICE, at least there are trustless E2E-encrypted solutions like https://saltyrtc.org/ (Disclaimer: I'm involved, but it's an open source project.)
If any senior VoIP/SIP/WebRTC dev is interested, my telemedicine startup just won the tender for building the WebRTC platform used by all hospitals in the Paris region. Includes some cool medical devices data streaming too. We're desperately understaffed so we're offering very competitive compensation, PM me. Stack is React+Node.js+Kurento but could change based on your input.
WebRTC is just a consensus between multiple parties about how to coordinate and send pure data, or just audio and video, from one peer to another.
That sentence then unfolds into multiple layers of complexity. This could be the first one:
- The "coordination" part is done with SDP, a plain-text message format that both parties use to reach an agreement about which video/audio codecs they are capable of understanding, and other similar details.
- The "send audio and video" part is done by using ICE to find suitable ports by punching holes in router NATs, which is a common problem in current networks. Once ports have been agreed upon, the audio/video is sent using good old RTP in its encrypted form (SRTP).
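As a toy illustration of the "coordination" part, the sketch below pulls the codec names out of two SDP blobs and keeps the ones both sides list. Both SDP snippets are invented and heavily trimmed, the function names are made up, and real offer/answer negotiation involves much more than a codec intersection.

```python
OFFER_SDP = """v=0
o=- 46117317 2 IN IP4 127.0.0.1
s=-
t=0 0
m=audio 9 UDP/TLS/RTP/SAVPF 111 0 8
a=rtpmap:111 opus/48000/2
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
m=video 9 UDP/TLS/RTP/SAVPF 96
a=rtpmap:96 VP8/90000
"""

OTHER_SDP = """v=0
o=- 99 2 IN IP4 127.0.0.1
s=-
t=0 0
m=audio 9 UDP/TLS/RTP/SAVPF 0 111
a=rtpmap:0 PCMU/8000
a=rtpmap:111 opus/48000/2
m=video 9 UDP/TLS/RTP/SAVPF 102
a=rtpmap:102 H264/90000
"""


def codecs_by_media(sdp: str) -> dict:
    """Map each media kind ('audio', 'video') to its listed codec names."""
    result = {}
    current = None
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("m="):
            # m=<media> <port> <proto> <payload types...>
            current = line[2:].split()[0]
            result.setdefault(current, [])
        elif line.startswith("a=rtpmap:") and current is not None:
            # a=rtpmap:<payload type> <encoding name>/<clock rate>[/<channels>]
            _, encoding = line[len("a=rtpmap:"):].split(" ", 1)
            result[current].append(encoding.split("/")[0])
    return result


def common_codecs(offer_sdp: str, other_sdp: str) -> dict:
    """Codecs both sides list, per media kind, in the offerer's order."""
    mine = codecs_by_media(offer_sdp)
    theirs = codecs_by_media(other_sdp)
    return {kind: [c for c in names if c in theirs.get(kind, [])]
            for kind, names in mine.items()}


print(common_codecs(OFFER_SDP, OTHER_SDP))
```

Here the two sides share opus and PCMU for audio but have no video codec in common, so the video list comes back empty.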
Anyone could implement this consensus in their applications, and so be able to send audio/video to other apps that implement the same details, but the most relevant implementations of this standard are those integrated into web browsers, because they expose all this functionality to their JavaScript engines, so all of it can be controlled from a web page.
It's a standard way for browsers to engage in real-time communications, most often used for video calls, but not necessarily limited to them. WebRTC is implemented in all the major browsers, and offers push-button peer-to-peer video calls. Google Hangouts and Jitsi both use WebRTC. Slack and HipChat use it. Skype is gradually moving over (from the MS proprietary ORTC), so even Microsoft Teams will use it soon.
-Asterisk: https://wiki.asterisk.org/wiki/display/AST/Asterisk+WebRTC+S...
-FreeSwitch: https://freeswitch.org/confluence/display/FREESWITCH/WebRTC
-Mizutech: https://www.mizu-voip.com/Software/VoIPServer.aspx
If you choose a softswitch which doesn’t have built-in WebRTC support, you can easily add it using a WebRTC-SIP gateway module:
-Doubango: https://github.com/DoubangoTelecom/webrtc2sip
-MRTC: https://www.mizu-voip.com/Software/WebRTCtoSIP.aspx
-Janus: https://janus.conf.meetecho.com/
On the client side you can use any RFC 7118 compliant WebRTC client:
-SIPML5: https://www.doubango.org/sipml5/
-WebPhone: https://www.mizu-voip.com/Software/WebPhone.aspx
-SIP.js: https://sipjs.com/