The Need for Web UDP (github.com/maksims)
100 points by mrmoka on July 26, 2017 | 127 comments



> Requirements

> 1. Security - it has to benefit from SSL.

There is a lot more to security than transport-layer encryption and authentication.

> Connection based

UDP is too hard, so re-inventing TCP?

> p2p is not reliable ... monetization

Oh, you want to require the SaaSS[1] model.

> Simple to use

The already-stated "requirements" are asking for something more complex than WebRTC.

> Minimum header overhead

Wait, are you thinking about using UDP to transport HTTP?! Do you even know what your MTU is?

> WebRTC suffers from complexity

That complexity exists for a reason. Nowhere in this document is there a discussion of the potential problems of using UDP or the ways the new service might be exploited by malicious actors.

[1] Service as a Software Substitute


> There is a lot more to security than transport-layer encryption and authentication.

You are welcome to PR.

> Connection based

This could be explored: starting from simple handshake, all the way to fully connection based protocol. Open for discussion based on developer needs.

> The already-stated "requirements" are asking for something more complex than WebRTC.

You are welcome to highlight the specifics that make you think that way.

> Wait, are you thinking about using UDP to transport HTTP?! Do you even know what your MTU is?

UDP is not a streamed but a message-based protocol. As WebSockets implement their transport layer over pure TCP, WebUDP could implement its own layer over pure UDP for various reasons.

> > WebRTC suffers from complexity

> That complexity exists for reason.

For P2P type communications, this complexity is perhaps reasonable.

For Server-Client type communications not at all.

> Nowhere in this document is there a discussion of the potential problems of using UDP or the ways the new service might be exploited by malicious actors.

This document is an initial effort to bring about a public discussion to form a reasonable shape of what WebUDP could look like. You are welcome to participate.


> UDP is not a streamed but a message-based protocol.

I'm very[1] familiar with the IP family of protocols.

> Open for discussion

If you don't know what your requirements are, you shouldn't be choosing a transport technology. It sounds like you want a library that wraps WebSockets or WebRTC and handles most of the complexity.

> WebUDP could implement its own [transport] layer over pure UDP

Then you want TCP. The only reason to use UDP is to avoid the complexities of a transport layer. Transport reliability is very hard; this isn't something that is easy to re-implement by yourself in UDP.

More importantly, I take it you don't know what your MTU is? The Maximum Transmission Unit[2] is the maximum packet size. On ethernet-based networks, it's probably ~0-100 octets less than ethernet's 1500-octet MTU. You need to keep UDP packets under this limit, or they will fragment. Fragmented IP packets may not arrive at all, and even when they do, the OS will wait until all fragments arrive before passing the data up to the app. If you're insane and send HTTP headers in each packet, you've wasted most of your data space. Each packet? Shouldn't we send headers in the first packet only? Except that every packet IS the "first packet" in a stateless protocol like UDP. It's the transport features of TCP that create ordered-data semantics.

[1] I used to write firmware for embedded systems. That included writing - from scratch, in Z80 asm - the entire network layer: ethernet (RealTek), ARP, IP, UDP, TCP, SNMP, HTTP, etc.

[2] https://en.wikipedia.org/wiki/Maximum_transmission_unit
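To make that arithmetic concrete, here's a small Python sketch of a fragmentation-safe UDP payload budget. The 1500-octet Ethernet MTU and the IPv4/UDP header sizes are typical values, not guarantees; real paths (tunnels, PPPoE) can be smaller, which is why some slack is prudent:

```python
import socket

# Typical values, not guarantees: real paths may have smaller MTUs.
ETHERNET_MTU = 1500
IPV4_HEADER = 20   # IPv4 header without options
UDP_HEADER = 8
SAFE_PAYLOAD = ETHERNET_MTU - IPV4_HEADER - UDP_HEADER  # 1472 octets

def send_unfragmented(sock, data, addr):
    """Refuse to send a datagram that would fragment on a 1500-MTU link."""
    if len(data) > SAFE_PAYLOAD:
        raise ValueError(f"{len(data)} octets exceeds safe budget of {SAFE_PAYLOAD}")
    sock.sendto(data, addr)

# Loopback demo: one datagram at exactly the budget arrives whole.
rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
rx.bind(("127.0.0.1", 0))
tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
send_unfragmented(tx, b"x" * SAFE_PAYLOAD, rx.getsockname())
payload, _ = rx.recvfrom(2048)
```

(The loopback interface has a much larger MTU, so this demo never actually fragments; the budget matters on real Ethernet paths.)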


The tone of both of your posts is very demeaning. You completely tear down the proposal, making lots of (possibly correct) assumptions, without showing the slightest doubt about them or any willingness to engage in a constructive debate.

And while you seemingly have a lot of experience with network programming, and I have hardly any, I think I can easily correct you on the argument of a small MTU.

Many websites that have more real-time networking needs currently use WebSockets over a TCP channel separate from the HTTP stream. Over this channel many small messages are sent, for example updates of a player's coordinates in a game. It is tolerable that some of these messages may be delivered out of order or may be lost.

Wouldn't that be a good fit for UDP?

Aside from that, you say that "the only reason to use UDP is to avoid the complexities of a transport layer", which I think is exactly right. You then say "transport reliability is very hard", which I can imagine it must be.

But then transport reliability is only one of the possible features of a transport layer. The whole point is that when you have access to UDP as a transport primitive, you can pick and choose which transport layer features you do and don't require on top of that.
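As a sketch of that pick-and-choose idea (the 4-byte header and all names here are my own invention, not anything from the proposal), here is the single transport feature a position-update stream actually needs: a sequence number so the receiver can discard late or duplicate datagrams, with no retransmission and no reordering buffer:

```python
import struct

SEQ_HEADER = struct.Struct("!I")  # 4-byte big-endian sequence number

def encode(seq, payload):
    """Prepend the only header this toy protocol needs."""
    return SEQ_HEADER.pack(seq) + payload

class LatestOnlyReceiver:
    """Keeps only the newest message; stale or duplicate datagrams are dropped."""
    def __init__(self):
        self.last_seq = -1
        self.latest = None

    def feed(self, datagram):
        seq = SEQ_HEADER.unpack_from(datagram)[0]
        if seq <= self.last_seq:   # late or duplicated packet: skip it
            return False
        self.last_seq = seq
        self.latest = datagram[SEQ_HEADER.size:]
        return True

rx = LatestOnlyReceiver()
rx.feed(encode(1, b"x=10,y=20"))
rx.feed(encode(3, b"x=12,y=21"))  # packet 2 was lost -- that's fine
rx.feed(encode(2, b"x=11,y=20"))  # arrives late, discarded
```

No congestion control, no reliability, no ordering: just enough state to know which update is freshest.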


There are truths about networking, whether people like them or not. They are not debatable, they are facts, so there's no need to engage. If someone doesn't want to learn, there's not much to do.


This person seems to be trying to turn UDP into TCP. This seems to be because they don't understand TCP. Sometimes showing people reality isn't about being 'demeaning'.


You probably have not read the document. It clearly states that there is no need for reliability or ordered delivery of data. The only requirement from a security point of view is congestion control. Please do read the topic content before making assumptions.


>> Minimum header overhead

> Wait, are you thinking about using UDP to transport HTTP?! Do you even know what your MTU is?

> If you're insane and send HTTP headers in each packet, you've wasted most of your data space.

I think "header" in the requirement just means a protocol/PDU header, which is required to deploy any kind of higher-level protocol on top of UDP, just like the UDP and IP headers at the start of the data received in ethernet frames. These headers can be anything depending on the use case, even a single byte or zero bytes. The document does not mention sending HTTP headers on top of UDP anywhere (which obviously could cause MTU problems, though this kind of mechanism was standardized with CoAP).


> I'm very[1] familiar with the IP family of protocols.

Great, you have expertise to contribute. And with this discussion, you already are contributing.

> If you don't know what your requirements are, you shouldn't be choosing a transport technology. It sounds like you want a library that wraps WebSockets or WebRTC and handles most of the complexity.

There are "libraries" that wrap over some protocols. They don't make adoption of the underlying technology (WebRTC, in this case) any easier.

This effort is to establish clear requirements, not yet to propose exact solution. So this discussion is just a process of establishing those requirements.

High-level requirements are clearly stated in the document, and an important one is simplicity. Currently there is no simple solution for low-latency server-client communication.

> > WebUDP could implement its own [transport] layer over pure UDP

> Then you want TCP. The only reason to use UDP is to avoid the complexities of a transport layer. Transport reliability is very hard; this isn't something that is easy to re-implement by yourself in UDP.

A layer on top doesn't have to involve full-on replication of TCP techniques or alternatives. WebSockets, likewise, adds very little to pure TCP: just data framing (a header) for messages.
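For comparison, WebSocket-style framing really is that thin. A hypothetical sketch of a minimal length-prefixed frame (a 2-byte header of my own invention, not the actual WebSocket wire format) that restores message boundaries over a reliable byte stream:

```python
import struct

LEN = struct.Struct("!H")  # 2-byte big-endian length prefix

def frame(message: bytes) -> bytes:
    """Wrap one message with a minimal length header."""
    return LEN.pack(len(message)) + message

def deframe(stream: bytes):
    """Split a concatenated byte stream back into the original messages."""
    messages, offset = [], 0
    while offset < len(stream):
        (length,) = LEN.unpack_from(stream, offset)
        offset += LEN.size
        messages.append(stream[offset:offset + length])
        offset += length
    return messages

# Two messages survive concatenation into one TCP-like stream:
stream = frame(b"hello") + frame(b"world!")
```

Over UDP, the datagram itself already delimits the message, so even this little header may be unnecessary.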

> More importantly, I take it you don't know what your MTU is? The Maximum Transmission Unit[2] is the maximum packet size. On ethernet-based networks, it's probably ~0-100 octets less than ethernet's 1500-octet MTU. You need to keep UDP packets under this limit, or they will fragment. Fragmented IP packets may not arrive at all, and even when they do, the OS will wait until all fragments arrive before passing the data up to the app. If you're insane and send HTTP headers in each packet, you've wasted most of your data space. Each packet? Shouldn't we send headers in the first packet only? Except that every packet IS the "first packet" in a stateless protocol like UDP. It's the transport features of TCP that create ordered-data semantics.

This is really valuable input, and is exactly what is needed to fuel further requirements for an implementation.

You know, you can provide valuable input without trying to "argue" ;)


All this discussion about UDP and MTU reminded me of an old StackOverflow answer of mine (https://stackoverflow.com/a/276096):

"Alternative answer: be careful to not reinvent the wheel.

TCP is the product of decades of networking experience. There is a reason for every or almost every thing it does. It has several algorithms most people do not think about often (congestion control, retransmission, buffer management, dealing with reordered packets, and so on).

If you start reimplementing all the TCP algorithms, you risk ending up with a (paraphrasing Greenspun's Tenth Rule) "ad hoc, informally-specified, bug-ridden, slow implementation of TCP".

If you have not done so yet, it could be a good idea to look at some recent alternatives to TCP/UDP, like SCTP or DCCP. They were designed for niches where neither TCP nor UDP was a good match, precisely to allow people to use an already "debugged" protocol instead of reinventing the wheel for every new application."


You are absolutely right. But we need to remember: our motivation is to get a solution without most of TCP's mechanisms, to suit cases where they are not required. SCTP is more of an alternative to TCP and implements mechanisms that our motivation does not require. DCCP is closer to what we want, as it addresses the main security concern: congestion collapse.

And you are right, we do not want to reinvent the wheel.


You might want to look at how many of these problems QUIC would solve for you.


> UDP is not a streamed but a message-based protocol. As WebSockets implement their transport layer over pure TCP, WebUDP could implement its own layer over pure UDP for various reasons.

As soon as your message exceeds the MTU, things get complicated. Sure you can layer something to re-assemble, but if packets are dropping, this is going to start getting problematic really fast. And if packets are not dropping, then TCP shouldn't overly increase latency anyway.


Packets are always dropping over wireless. In games and VOIP you skip and wait for the next one.


This is something that is worth exploring further, to identify whether it is a requirement for a WebUDP implementation.


This seems to come up every now and then, and I see the same arguments, and they're still not compelling.

WebRTC data channels in unreliable mode will work just fine. Is it as easy as opening up a WebSocket connection? No, it's not. Is it as easy on the server side as accepting a WebSocket connection? Also no.

But it really isn't that hard[0], and people have built libraries to help you out. So just use one, and move on with your life.

And you also benefit from a standard that has been fleshed out over multiple years by some very smart (if imperfect) people.

On the browser side, it's already supported by all major browsers, with the notable exception of iOS Safari (which should change this fall with the release of iOS 11)[1]. Even though it's not ideal, you can fall back to WebSocket for the few holdouts.

[0] Source: I've done it before, building it from scratch.

[1] https://caniuse.com/#feat=rtcpeerconnection


Posted this recently in another thread, but maybe someone will find https://github.com/seemk/WebUdp useful as there have been many WebRTC threads lately. It's a WebRTC DataChannel server implementation for out of order, unreliable UDP traffic for browsers that has built-in SDP signaling and STUN.


Just looking at the complexity of a "minimal" implementation only highlights the need for a different solution.


Anyone interested in this might find some enlightenment in Glenn Fiedler's whitepaper. [1] It talks about the issues with TCP-based connections, WebSockets, QUIC, and WebRTC, and provides a solution (with code) for doing UDP in the browser.

[1] https://gafferongames.com/post/why_cant_i_send_udp_packets_f...


Update: there is now a full browser-based implementation of netcode.io too! https://github.com/RedpointGames/netcode.io-browser


I think you're going to need a stronger argument for "Why not WebRTC", or at least concrete proposals about what parts of the API you would change. For example, "SDP parsing is too complex, have a JS API to set up a client-server connection without a SDP".

IMO the most complex parts of WebRTC are the SRTP-DTLS encryption (except you also specified TLS as a requirement for Web UDP), and STUN/TURN (which are optional and not required for client-server).


This is valuable input, you are welcome to contribute in form of PR!


Well, the problem is that I currently think WebRTC 1.0 is sufficient. For the example I gave, I think a simple JS library to write out the SDP is a perfectly fine solution. So I'm more asking, what did you find wrong with WebRTC?


Simply look at the present state of WebRTC adoption by back-end services for server-client cases. First of all, it is not designed for server-client cases. Second, after so many years, it is still not well adopted for server-client cases due to its complexity on both sides: back-end implementation and front-end usage.

Compare that to the speed of adoption of WebSockets, the variety of WebSockets implementations, and how much they are used commercially.

There are many conversations by developers who have likewise attempted to use WebRTC for server-client communication, and the trend is apparent.

One of the developers on the WebRTC team at Google, who worked on DataChannel and network code, admits that their team is aware of the complexity and difficulties of WebRTC in server-client cases: https://news.ycombinator.com/item?id=13266874


> First of all, it is not designed for server-client cases.

WebRTC is specifically designed for server-client cases. In the case of video and audio, the server is called a "mixer" or MCU. This is extremely important for large multi-user conferences - if it weren't for a mixer, you'd have to upload your video once for every user, burning all of your upload bandwidth. Examples of services using this are Appear.in Premium and Cisco Spark.

Note that games are the #1 use case mentioned in the Datachannel draft [1]. It's perfectly fine to argue that the standard is bad, but it was designed for these use cases.

The Google developer you linked seems to be arguing for improving the WebRTC C++ library, not changing the protocol.

[1] https://tools.ietf.org/html/draft-ietf-rtcweb-data-channel-1...


> You can't compare WebSockets with WebRTC because the first one has a much larger implementation base across browsers than the latter.

This is exactly what I pointed out: WebSockets have a much larger implementation base across browsers and back-ends due to their simplicity.

> There is also a list of various transparent fallbacks that can be implemented like long polling which WebRTC doesn't have.

Those fallbacks were temporary solutions during the adoption of WebSockets; today they are obsolete, as WebSockets are well adopted and a pure WebSockets implementation is fairly simple.

This is what we want for WebUDP in server-client cases: we can fall back to WebSockets if WebUDP is not supported.


You can't compare WebSockets with WebRTC because the first one has a much larger implementation base across browsers than the latter.

There is also a list of various transparent fallbacks that can be implemented like long polling which WebRTC doesn't have.


Firefox OS had TCP[0] and UDP[1] sockets as Web APIs. They were quite pleasant to use as a developer and the only way to create a real mail client (needs TCP to implement POP, IMAP, SMTP in JS).

I wish more Firefox OS APIs had become web standards. They would allow for some very powerful PWAs.

[0]: https://developer.mozilla.org/en-US/docs/Archive/B2G_OS/API/... [1]: https://developer.mozilla.org/en-US/docs/Archive/B2G_OS/API/...


> Firefox OS had TCP[0] and UDP[1] sockets as Web APIs. They were quite pleasant to use as a developer and the only way to create a real mail client (needs TCP to implement POP, IMAP, SMTP in JS).

The biggest problem with Firefox OS's TCP standard is that it used an event-driven model, which is somewhat at odds with more current promise-based thinking. The more natural version is to use something based on WHATWG Streams, but that spec has been stuck in vaporware-land for years.

> I wish more Firefox OS APIs had become web standards. They would allow for some very powerful PWAs.

The TCP specification actually was undergoing standardization (reformatted to use the streams API): https://www.w3.org/TR/tcp-udp-sockets/ . The problem was the working group ended up closing down, and since the specification wasn't suitable for use on webpages, it ended up with nobody to host it.


Those implementations were created to be utilised within Firefox OS, where applications are granted permissions by the user when installing them, and access to several APIs is strictly regulated by the OS itself.

In a Web context this approach wouldn't work and would lead to security issues. Just as there was a need for WebSockets (TCP), there is a need for a similar API for UDP, but it cannot be raw access for creating UDP connections, as that leads to many security concerns.


I am a fan of the "just stick a security popup and let people choose" option but I understand that is not the same as having proper security.


Chrome OS supports the same thing: https://developer.chrome.com/apps/app_network

The SSH and mosh apps use a binding between the POSIX socket interface on the Native Client side and these APIs on the JS side.


For this to be taken seriously, I think you need to demonstrate that you're capable of using WebRTC to do what you need. Without a decent nod in that direction, people are going to think that you're just not aware of the potential complexities.


Worth mentioning again: this effort is to explore low-latency server-client communication, not peer-to-peer scenarios, which WebRTC solves well.

And this is a collaborative effort, not a personal one. So all input is welcome.

I've used WebRTC for p2p and server-client cases, and it is a nightmare for the latter. And many other developers have expressed very similar experiences when it comes to server-client cases.

What's more, after many years we see very little adoption of WebRTC for server-client cases due to its complexity. WebSockets, on the other hand, took very little time to get adopted by many back-end platforms as well as browser vendors. I wrote my own WebSockets solution a long time ago on .NET, before .NET 4.5 (which includes its own WebSockets implementation) was released.


Using UDP with the web on a wide scale will either replicate everything in TCP, inside UDP wrapping, or else cause big problems all over the Internet.

It will be fast only in the beginning when a few clients are participating, but then screw over the infrastructure with degenerative congestive behaviors when "everyone" is on it. And by then, it will be a standard everyone is stuck with, with the only way out being to complicate it with a tirade of hacky refinements based on guesswork combined with crossed fingers.

That's not even considering malicious interference: what sorts of attacks will be discovered on this new UDP based shit, and what sorts of hacks will be required to mitigate them.


I know that was the conventional wisdom, from back when backbone links were the bottleneck. But in the modern internet, almost all the congestion is at the edges.

Since most traffic for games is server->client, most of the congestion will happen when several users are competing for the same customer link (DSL or cable modem). This already happens with streaming services, and people just yell at each other to stop downloading updates while I'm watching Netflix.


That could change one fine day when those edges get their stuff together.

Indeed, the subscriber lines and surrounding edge hardware have not kept up with the times. Depending on where you are and who your provider is, chances are you're getting the same shitty line rates you had ten years ago (or more), though you have more memory, a bigger hard drive and a faster CPU, and the backbone is faster.


Anywhere with a lot of Gigabit Fiber installed, the congestion is _not_ at the edges. It's further in. If all of those installed Gigabit edges simultaneously used 1 Gbps download, it wouldn't happen. They'd get much less.


Too late... Chrome already uses HTTP over UDP. https://en.wikipedia.org/wiki/QUIC


And choose the first option: replicating all the important parts of TCP inside UDP. Which is fine, but a lot of effort.


But it doesn't open a ridiculous number of tabs at the same time quickly and with low resource usage, so it's supposedly dead in the water now. :) :) :)


SCTP would be the logical choice for supporting both streams and messages.

No one really wants to support a network with the evil of arbitrary UDP from the browser. In SCTP, the handshake combined with crypto tricks can allow a server to make sure the initiator stores a larger cookie than the server needs to hold for verification, throttling the DDoS riffraff.
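A rough sketch of that cookie trick, modeled loosely on SCTP's INIT/INIT-ACK exchange (the layout and sizes here are illustrative, not the real wire format): the server keeps no per-initiator state; it just verifies an HMAC that the initiator must echo back, proving it can receive at its claimed address:

```python
import hashlib
import hmac
import os
import time

SERVER_SECRET = os.urandom(32)  # known only to the server

def make_cookie(client_addr: str) -> bytes:
    """The server stores nothing; the cookie carries everything it
    needs to verify the echo later (timestamp + keyed MAC)."""
    stamp = int(time.time()).to_bytes(8, "big")
    mac = hmac.new(SERVER_SECRET, stamp + client_addr.encode(), hashlib.sha256).digest()
    return stamp + mac

def verify_cookie(cookie: bytes, client_addr: str, max_age: int = 60) -> bool:
    """Accept only a fresh cookie echoed from the same source address."""
    stamp, mac = cookie[:8], cookie[8:]
    expected = hmac.new(SERVER_SECRET, stamp + client_addr.encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(mac, expected):
        return False
    return time.time() - int.from_bytes(stamp, "big") <= max_age

c = make_cookie("198.51.100.4:5000")
```

Until the cookie comes back, the server has allocated nothing, which is what blunts spoofed-source flooding.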



The reasoning there is kind of lame: every filter knows how to pass SCTP or UDP, and doesn't pass SCTP because it is unpopular, and doesn't pass UDP because stopping it is most of why you are filtering in the first place.

Making an SCTP web standard would improve end-node support (and actual app use), which is beginning to wane and is SCTP's real adoption problem.


There is already an extension to SCTP to encapsulate it in UDP; it isn't in Linux, but it is in the FreeBSD reference implementation.

The userspace SCTP stack that gets built into Firefox for use by WebRTC will do it too.


The point of providing SCTP instead is to avoid opening up UDP and all of the baggage of existing services that will answer expecting you to be you, and not a relay for a malicious ad in your browser.

Similarly, once you have a userspace SCTP stack you have allowed the garbage to reach userspace resources.

Everyone knows how to not be a jerk while using UDP or SCTP, but people who have the goal of being a jerk are more manageable if you only give them remote SCTP access.


Your security scenario only really works if you are happy with a symmetric firewall, that is one with the same filters in both directions.

My home firewall is set up to allow anything originating here to pass but block most things from outside. For this to work, the firewall needs to be able to track the state of the protocol exchange, which is different for each protocol. Few firewalls can do this for SCTP or DCCP yet; I'm in the process of adding SCTP support to the one that I use.


In the case of DCCP, it can be implemented over UDP, utilising existing firewall support for UDP traffic.


The grandparent poster stated that they don't want to encapsulate stuff in UDP.


Generally, yes. I suppose if the client were unable to override the default recipient port for DCCP-UDP encapsulation (6511), and the DCCP implementation were enforcing rules such as back-off on the client, then it is still a better option to give a client a DCCP socket that can enable UDP encapsulation than to give indiscriminate access to UDP.

The point that seems to be getting lost is not that I want SCTP or DCCP support. It's that I don't think anyone should accept anything that could become an arbitrary-UDP-access loophole. The point of the current path is to replace the problem techs and add use cases safely as we go, gradually paying for a better network by making standards that aren't just the easiest thing for web devs.

Every time someone tries to walk too close to the edge in a way that can open security problems for people who aren't running a server fully opted-in to web 2.0/etc, they risk a security backlash that could ban browser updates and effectively delay/kill unrelated innocent features and fixes with ripples for ~5-10 years.


It is indeed a bit of a chicken-and-egg situation here.

But SCTP implements reliability and ordered delivery, making it more of an alternative to TCP than a solution for low-latency communication in cases where reliability and/or ordered delivery is not required.



I'm not familiar with DCCP, but as long as browsers don't get access to DCCP-UDP encapsulation without some work by the user, it seems like another way to use web security considerations as leverage to wean a critical mass off of UDP.


rfc3257 page 3:

> SCTP supports the transportation of user messages that have no application-specified order, yet need guaranteed reliable delivery.

But most people don't really care (that insisting on re-delivering out-of-date packets will use some bandwidth); they just have an existing UDP app they want to port to the web, and converting it to SCTP is easy enough (I think I've even used an LD_PRELOAD to convert one for that? But I may be confusing it with a different ULP substitution.)

Making them do that and set everything up is actually a good mechanism for protecting the rest of the internet and allowing a web SCTP that browsers could enable even by default, with less risk of in page ad hijackings, etc. Allowing UDP is something that I hope every browser leaves as impossible to allow without going into configuration or being in an entirely different use context of web APIs than a browser.


WebRTC uses (UDP encapsulated) SCTP.


So many of these comments are like "UDP is not TCP!" or "Muh TCP guarantees!", then they go on to mention WebRTC?!!!

I mean, if you're streaming audio and video Real Time, is there really any point to TCP? If a few frames get dropped, then bursted back once the connection stabilizes, does that improve the user experience in any way?

WebRTC seems like a perfect candidate for UDP communications for the actual media streams.


WebRTC is the best option for media streams today in peer-to-peer cases.

The goal of the topic is to explore a simple option for low-latency server-client communication, without reliability and without ordered delivery.

WebRTC can be used for such a case, although it is not designed for it. Because of that, the implementation is very complex and not much adopted. This is something we are trying to explore: either a new API or simplifications to WebRTC, to make it a simple choice for UDP in server-client scenarios.


Given the kind of feedback happening (and my initial reaction admittedly was similar), maybe a different name would help with perception (I'm already not a big fan of "WebSockets", but "WebTCP" would have been worse), promoting the goals over the implementation detail of UDP?


If there is an alternative transport protocol that enables low-latency delivery as efficiently as UDP, then it is worth exploring.

UDP has been used by many industries for a long time and is a well-known protocol to build a foundation on.


It just seems like most real-life usages of WebRTC are either media-centric where TCP does more harm than good, or multiplayer scenarios where it's a wash since TCP's ordering is a pro while TCP's overhead is a con.


TCP congestion control can be very beneficial for streaming applications (when combined with adaptive rate codecs).

With UDP you have to create your own feedback mechanism to find the optimal bitrate to stream to the far side at.
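A toy version of such a feedback loop (the thresholds and step sizes are made up for illustration; real congestion controllers such as TFRC are far more careful): back off multiplicatively when the receiver reports loss, probe upward additively when the path is clean:

```python
def adjust_bitrate(current_kbps: int, loss_fraction: float,
                   floor: int = 100, ceiling: int = 8000) -> int:
    """Rough AIMD-style rate feedback for an adaptive codec."""
    if loss_fraction > 0.02:              # >2% loss reported: halve the rate
        return max(floor, current_kbps // 2)
    return min(ceiling, current_kbps + 50)  # clean interval: probe upward

rate = adjust_bitrate(4000, 0.10)  # heavy loss: back off to 2000 kbps
rate = adjust_bitrate(rate, 0.0)   # clean report: creep up to 2050 kbps
```

The receiver-to-sender loss reports that drive `loss_fraction` are exactly the feedback mechanism you have to build yourself once TCP isn't doing it for you.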


This is where DCCP or alternative protocols come into play.


Wouldn't this make it pretty easy to have browser clients unknowingly participate in DoS attacks?


This is nothing new. It's trivial to make website visitors DoS a target IP address using XHR requests, WebRTC, <img> tags, etc.

WebUDP wouldn't necessarily make the situation any worse.


Like the attack on GitHub a few years ago: https://www.eff.org/deeplinks/2015/04/china-uses-unencrypted...


Seems to me that one key difference would be that giving javascript access to UDP sockets easily enables a single browser to send huge amounts of traffic since there's no ACK to wait for. With anything based on TCP, the attacker at least has to put some effort in to achieve that.


You could easily design it so that you can only send UDP packets to hosts you already have a TCP connection with.
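A sketch of how that policy might look from the client's side (a purely hypothetical API; a browser would enforce this internally rather than expose such a class): UDP is only permitted toward hosts with which an established TCP connection already exists:

```python
import socket

class GatedUdpSender:
    """Hypothetical policy: UDP may only be sent to hosts the client
    already holds an established TCP connection with."""
    def __init__(self):
        self.allowed_hosts = set()
        self.udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    def register_tcp(self, tcp_sock):
        # A completed TCP handshake proves the peer is reachable and
        # willing to talk to us; remember its address.
        self.allowed_hosts.add(tcp_sock.getpeername()[0])

    def sendto(self, data, addr):
        if addr[0] not in self.allowed_hosts:
            raise PermissionError(f"no TCP connection to {addr[0]}")
        self.udp.sendto(data, addr)

# Demo over loopback: TCP first, then UDP to the same host is allowed.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
tcp = socket.create_connection(listener.getsockname())
gate = GatedUdpSender()
gate.register_tcp(tcp)
gate.sendto(b"ping", ("127.0.0.1", 9))      # allowed: TCP peer
try:
    gate.sendto(b"ping", ("203.0.113.1", 9))  # no TCP connection: refused
    refused = False
except PermissionError:
    refused = True
```

Because the gate refuses before any packet leaves, a page can never spray UDP at a host that hasn't completed a handshake with it.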


It's definitely a concern! WebRTC solves this in two different ways:

1. The receiving end must consent to receiving packets by completing the DTLS handshake required by WebRTC.

2. The browser enforces SCTP congestion control to avoid spamming large numbers of UDP packets. Basically, it'll start throttling you when it detects packet loss.


This sounds like a bad option for games played over wireless connections.


Yes.

The only way this could be done safely is if you are required to establish a TCP connection, and then "upgrade" it to UDP. There are too many pointy sticks that developers could impale themselves on, with the side effect of creating DDoS vectors.


Can you do it safely if you're required to make an HTTPS request to the same hostname with some standardized request (a la CORS preflight) and get an answer back saying "Yes, this origin may connect to me on these UDP ports", and then hang onto that permission indefinitely? The nice thing about having a persistent permission instead of an upgrade on each request is that a service worker that's woken up by a push notification or similar can immediately send UDP traffic instead of doing a TCP (+ HTTPS?) handshake.

Since it's restricted by origin (probably using literally the same mechanism CORS uses) and over HTTPS, a malicious actor can't DDoS anyone but themselves.

If you're really worried, maybe extend it to allow the server to limit the duration of the permission and the maximum bandwidth, but I think you don't need that.

EDIT: Oh, oops, binding to the hostname doesn't help because you can just repoint your hostname in DNS at someone else. You'd ideally need to bind it to the IP, in which case you definitely want it to be a time-limited permission. (But I think that attack is also feasible, though a bit harder, with the handshake-before-each-UDP-connection model. Just have the attacker set up a custom DNS server for their domain, which for any hostname sends you a low-TTL response pointing to the attacker's HTTPS server, and then a higher-TTL response to the victim. Each user gets pointed at a random hostname within that domain.)


That model tends to be hell on firewall appliances and NAT/load balancing devices.


I'm not sure I follow - how is this different from, say, a native app (desktop or mobile) that's just using UDP like normal? There's a single preflight request per client device, which can usually be satisfied when you load the website itself.


It's different because a load balancer will send your initial HTTPS request to one application server, but when you switch protocols (to UDP) you'll then want that traffic directed to the same place, or need some way to share session data across multiple app servers.

You could handle it with some kind of cookie/token that enables the LB to route to the right place, but that opens a whole bunch of other complicated logic too.

Whereas if the UDP application/server were able to handle it independently of HTTP(S), you wouldn't need any of that.


Either I'm not following you or I'm being unclear - it doesn't matter what HTTPS server the load balancer sends it to, all that you need is a reply saying "Yes, in general, sending UDP to my IP on this port is fine with me." You don't need to send it any other application-level data, and frankly the load balancer itself could send that static reply. So you don't care what application server it's sent to.

And once you have that permission, you never need to make an HTTPS connection again.

What you do care about is that your actual application traffic, such as your login session, and your UDP traffic have some way of being associated with each other, but you have that problem regardless of how the client (whether it's a browser client or a normal desktop/mobile app) gets permission to send UDP.

As an example, the user could visit example.com, load some HTML and JS, send a request to login.example.com, get a session key, send a single HTTPS request to data.example.com exactly once, and then send UDP to data.example.com protected by that session key. You never send HTTPS to data.example.com again; from this point onwards you only send UDP. Coordination between your HTTPS login server and your UDP data server is no different in this model from the native app model.


I think we're on the same page. My point was that I think it's too complicated to do all of those steps (and likely more) just to "switch" to UDP. You could instead make all of that part of the Web UDP application, which is no less complicated, but then you are removing one step of switching applications/protocols midway through.

I guess what I'm trying to say is UDP could work, but I think trying to bootstrap the initial connection/session info with HTTP is probably going to be FAR more trouble than it's worth.


UDP over both IPv4 and IPv6 carries a source IP and source port in its headers. These can be used for sticky routing through load balancers.
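As a sketch of that idea (the backend pool and hash choice here are made up), a load balancer can derive a stable backend purely from the datagram's source address, with no shared session store:

```python
import hashlib

BACKENDS = ["app1:7777", "app2:7777", "app3:7777"]  # hypothetical server pool

def pick_backend(src_ip: str, src_port: int) -> str:
    """Map a UDP source (IP, port) to a backend deterministically,
    so every packet of a flow lands on the same application server."""
    digest = hashlib.sha256(f"{src_ip}:{src_port}".encode()).digest()
    return BACKENDS[int.from_bytes(digest[:8], "big") % len(BACKENDS)]
```

The caveat the replies raise still applies: if a NAT rebinds the client's source port mid-session, the flow hashes to a different backend, which is why a cookie or session id in the payload is more robust.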


Sure, but in this context, unless you are able to use the same source port for TCP and UDP (and likely from the load balancer to the application server too) you'll still need another way to identify a client/session when switching from TCP to UDP.

Doing that with NAT is even trickier. Take a look at the way some firewalls need to configure a DMZ for gaming, or SIP for some examples.


I'd like to point out that getting SIP through firewalls is a massive headache even today. If you're embedding connection information into a control channel, then the firewall needs to do DPI to figure out what is a valid communication stream vs. some attacker. And then people start encrypting the control channel and it's game over unless you can hook your firewall up to whatever the controlling software is or your firewall is super fancy and can MITM the control channel traffic because you've installed the cert chain on the firewall.


You could also handle the "handshake" elsewhere, like the application layer. This type of identification is already done in other UDP-based protocols, like RADIUS.

Handled that way, UDP is arguably faster in some respects, provided the application load is managed well.
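A minimal sketch of such application-layer identification, loosely in the spirit of RADIUS identifiers (the framing and field sizes are invented for illustration): each datagram carries a session id the server handed out earlier, so any receiving process can associate it with a user.

```python
import os
import struct

SESSIONS = {}  # session id -> user; hypothetical in-memory store

def open_session(user: str) -> bytes:
    """Hand the client an opaque 8-byte id during an earlier handshake."""
    sid = os.urandom(8)
    SESSIONS[sid] = user
    return sid

def frame(sid: bytes, payload: bytes) -> bytes:
    """8-byte session id, 2-byte big-endian length, then the payload."""
    return sid + struct.pack("!H", len(payload)) + payload

def parse(datagram: bytes):
    """Return (user, payload), or (None, b'') for unknown sessions."""
    sid, rest = datagram[:8], datagram[10:]
    (length,) = struct.unpack("!H", datagram[8:10])
    user = SESSIONS.get(sid)
    return (user, rest[:length]) if user else (None, b"")
```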


Putting UDP on top of TCP eliminates any possible benefit of using UDP.


If it is implemented with the same handshake mechanics as WebSockets, then it would be no more harmful than WebSockets.


This is demonstrably not true.


So if the main justification of this proposal is "WebRTC is too complicated", wouldn't that more speak for a WebRTC library and/or server geared towards games?


WebRTC doesn't work for browser control of a lot of IoT type stuff. The really high volume cheapo devices speak things like CoAP or DTLS, and they don't have the horsepower to run something like WebRTC. You'd need a level of control similar to the Berkeley socket API to get the browser to speak those protocols.

At the moment, I see a lot of ridiculous stuff like phone apps talking to some cloud instance which tries to jam the packets back through your firewall into your Internet light bulbs. Congratulations, you literally just used thousands of kilometers of fiber and billions of dollars of routing infrastructure to make the world's most expensive how many... light bulb joke.


It sounds like you're describing HumbleNet: https://github.com/HumbleNet/HumbleNet


And with WebUDP life for those guys would be much easier.


Last time I checked, WebRTC tutorials and examples were either too old and deprecated, or required the latest Chrome with some flags enabled. The best we have right now for real-time streaming is HLS, with 20 seconds or more of delay. We had real-time video chat 20 years ago, why can't we have it in the browser today!?


I think you checked too long ago. Appear.in, Jitsi, Discord, and Cisco Spark are all working with WebRTC now.


Yes, it seems like a server side complexity issue should be a fixable problem without updating the standards.

I suppose you'll still need to deploy a stun/turn server to deal with the NAT issues unless you're happy with IPv6 only, but that's not really something the standard can fix.


As mentioned in the doc, one of the options is to simplify WebRTC by making some components optional, to enable better adoption and make it easier to implement on the back end as well as on the front end.


This seems to propose that you can only connect to a certain server, similar to the same-origin-policy. What I would really love instead is the possibility to connect to arbitrary IPs to be able to implement real P2P. The linked posts dismiss this early because of the possibility to cause DDOS, but really, you can already do that from a hacked desktop "Quake", so there is no harm in being able to do it from a browser-based "Quake". What you'd have to prevent is drive-by use of UDP, not use of UDP period.

I would propose having two HTML profiles in future, HTML document, and HTML application (and maybe HTML legacy). HTML document would be restricted in what you can do, and would be primarily for reading Hypertext. For HTML application you would have to go through a few clicks to install or activate it - now you are going to say that people will just click it away, but that is already the case with current desktop app installers, so it is not more insecure! An application profile page will be able to access the net just like any other native application. Most importantly, it will be able to bypass same-origin policy and send UDP and TCP anywhere - but not with credentials of course.

You'd still have the problem of being able to probe internal networks, and being able to manipulate UPnP routers. For the first, the network admin could have a group profile setting or similar to disable this kind of access. For the second, browsers could selectively block this on a case-by-case basis if needed.

For the problem of DDOS, I think we should not let that restrict us from implementing useful technologies. Rather we should fix it at the source. For example, maybe one could lock down certain routes if an attack is detected. All traffic along these routes is throttled, unless you send along a proof-of-work token. I'm just making this up, but my point is that I think we haven't exhausted all options here.


For p2p, you already have the webrtc data channel which handles all the nasty NAT traversal and DDoS issues. Its browser API should be less volatile now than it has been in the past.

Implementing a WebRTC data channel endpoint in a server is not for the faint of heart, though. You would have to implement a lot of complex RFCs.


Yeah, but you need a central server, right? You cannot make a DHT for example, like Kademlia or trackerless Bittorrent.


You need a way to exchange the SDP message contents between your browser and the remote browser. How you do this is up to you. A central server is just the simplest way to do it.


"The linked posts dismiss this early because of the possibility to cause DDOS, but really, you can already do that from a hacked desktop "Quake", so there is no harm in being able to do it from a browser-based "Quake"."

No same-origin-policy would be lovely combined with XSS vulnerabilities.

Suddenly all the visitors of that website would be doing DDOS on a random host.


Well, I would not make that feature available from an included script from a third party domain, so XSS would not be an issue. I would make something like privileged application bundles that you had to authorize first - just like you have to install native apps. I know some people just click OK on everything, but there is no remedy to that other than giving up and not allowing general-purpose networking at all.

Also, people already exploit XSS for DDOS-ing, although not via UDP, but TCP/HTTP. Granted, you can possibly make a worse attack if you have UDP.


I've written a WebRTC "server" that can establish such a connection (and also acts as its own STUN/TURN server) and hand off sockets to a local process.

WebRTC isn't very complicated.

The hardest part is probably ICE, which basically involves each endpoint telling the other what it sees, and potentially consulting a third party (STUN/TURN). I'd love to see more magic there, but once that's in place, I don't see what's so hard about just using DataChannels.

One idea might be to put signalling into HTTP headers, e.g. have the client and server introduce something like:

    ICE: sdp-desc...
and if so, allow WebRTC to skip the ICE negotiation step if speaking to the server.


I'd also like to draw some attention to this proposal: https://github.com/Maksims/web-udp-public/issues/6


This proposal lacks challenge/response and makes game servers vulnerable to being used in DDoS amplification attacks if they use a request/response pattern. Also, a proposal without packet encryption in 2017, seriously?!
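For reference, the usual fix is a stateless cookie exchange, as in DTLS's HelloVerifyRequest or QUIC's Retry. A rough sketch, with invented message formats and size limits:

```python
import hashlib
import hmac
import os

COOKIE_SECRET = os.urandom(16)  # rotated periodically in a real server

def make_cookie(addr) -> bytes:
    """Derive a cookie from the claimed source address; no per-client state."""
    return hmac.new(COOKIE_SECRET, f"{addr[0]}:{addr[1]}".encode(),
                    hashlib.sha256).digest()[:16]

def handle_datagram(data: bytes, addr) -> bytes:
    """Never reply with more bytes than received until the client proves
    it owns `addr` by echoing the cookie back."""
    if len(data) < 64:
        return b""  # drop tiny packets so replies can't amplify
    cookie = make_cookie(addr)
    if data[:16] != cookie:
        return b"RETRY" + cookie  # 21 bytes, well under the 64-byte minimum
    return b"OK" + data[16:].upper()  # stand-in for real application logic
```

Because a spoofed source address never sees the RETRY reply, it can never produce a valid cookie, so the server does no real work on its behalf.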


Just use QUIC and advocate for its adoption.


That doesn't necessarily solve the problem which is in focus here. The QUIC version which is available in browsers (ok, Chrome only) is not only the QUIC stream layer, but also the QUIC HTTP adaptation layer. From the JavaScript side you just interact with HTTP and use QUIC under the hood. However with HTTP you also get all HTTP semantics (headers, reliability, ordering inside of request/response streams, etc.). What is requested is an additional protocol and API which avoids all the overhead and just allows to send and receive messages with best effort - because games and other realtime applications may prefer to build their own reliability mechanism on top.

The only way to use HTTP/QUIC for packet-like communication might be to send each packet inside a separate HTTP request. But I guess that will have a super high overhead (the lifecycle of a whole stream must be managed for a packet which actually has no lifecycle) and will most likely also not deliver the expected results (afaik HTTP on top of QUIC still has head-of-line blocking for request/stream establishment. Request headers must be received in order).

New JavaScript APIs which are utilizing QUIC could work. However one would need to explore if QUIC is actually helpful for target applications, since it provides a stream semantic, whereas UDP is purely packet based. QUIC might also introduce similar issues like WebRTC to the server side: It's a complex protocol spanning multiple layers (everything from flow-control to encryption). Therefore it will be hard to integrate into environments where no QUIC library is available. But that's only a feeling, since I haven't yet reviewed the QUIC specification in detail.


One thing that I do not see mentioned here is multicast. Are there any advantages? Watching a live sports game, for instance. Since multicast is connection-less and sent only over UDP, the more distant discussion about introducing multicast into browsers never takes place. Having used a multicast video stream for many years in an enterprise setting I can unequivocally state that this would decrease network utilization. Especially in the years to come as the interwebs get clogged up with broadcast-type data.
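For comparison with what browsers can't do today, joining a multicast group natively takes only a few lines; the group address and port below are arbitrary examples:

```python
import socket
import struct

def open_multicast_receiver(group: str, port: int) -> socket.socket:
    """Subscribe to a multicast group; every receiver shares one upstream
    stream instead of each pulling its own unicast copy."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # Join the group on the default interface (INADDR_ANY).
    mreq = struct.pack("4s4s", socket.inet_aton(group),
                       socket.inet_aton("0.0.0.0"))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    return sock

# e.g. sock = open_multicast_receiver("239.1.2.3", 5004)
```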


The main goal of this effort is to explore server-client communication scenarios.


ISPs are not supporting multicast over the public internet.


They will if it becomes a standard way of broadcasting. They will save quite a bit of network load.


The tech has been standardized as IPTV, and it sees a fair bit of use. But not over the public internet (for one, because ISPs don't support it...)


Most home users probably allow UDP out of their network. How many businesses by default allow UDP outbound on any port?

Will the documentation/RFC's encourage folks to fail gracefully if UDP is not supported in their network?

Could this spec include support for SRV records? They aren't allowed in HTTP/1.1.


Can't we just rewrite the Linux kernel in JavaScript and boot it off a browser

... /s (hopefully)


Why, yes. Yes you can.

http://jslinux.org/



netbsd beat you to this by 5 years or so: https://blog.netbsd.org/tnf/entry/kernel_drivers_compiled_to...


Are people aware of https://www.w3.org/TR/tcp-udp-sockets/? I didn't see it mentioned so far.


This spec has been around for a very long time, and was adopted by Firefox OS (a deprecated platform). It exposes low-level access for establishing raw TCP and UDP connections, with a permissions flow handled by the environment.

It exposes many security concerns; that's why WebSockets were favoured over TCPSocket. We want something similar for UDP.


Not having raw TCP means you can't speak to any existing TCP protocol through a browser. It might not be a useful feature but I think that not taking it into account is an issue because most use cases can't be implemented with WebSockets.


The ability to connect to raw TCP or UDP from a browser is a major security hole.

Due to that, WebSockets were created, where the handshake is handled by the browser transparently from the developer, ensuring port scanning is not possible, and preserving the origin-based security model of HTTP. We want the same for WebUDP.
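Concretely, the port-scanning protection works because the browser demands a protocol-specific answer before exposing the connection. The WebSocket version of that check, from RFC 6455, computes Sec-WebSocket-Accept like this; a WebUDP handshake would presumably need an analogous proof from the server:

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"

def ws_accept(sec_websocket_key: str) -> str:
    """Compute the Sec-WebSocket-Accept header a server must return.
    If the value is wrong, the browser aborts before any application
    data flows, so arbitrary TCP services can't be probed via the API."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode()).digest()
    return base64.b64encode(digest).decode()
```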


I read 5G is doing something similar, or rather something new to solve this problem. They are completely remaking the TCP/IP stack.

Anyone knows anything about that?


The ETSI NGP ISG[1] is looking at next generation protocols, also in the context of 5G.

"This ISG is seen as a transitional group i.e. a vehicle for the 5G community (and others of interest) to first gather their thoughts and prepare the case for the Internet community’s engagement in a complementary and synchronised modernisation effort."

The effort seems to be at quite an early stage for now (architecture, models, requirements, etc.).

I personally don't see TCP/IP going anywhere with 5G, but we may see more parallel deployments of protocols within isolated 5G network slices.

[1] http://www.etsi.org/technologies-clusters/technologies/next-...


Just tweeted this and plan to publicize. If we had web UDP we could port ZeroTier pretty easily to run in the browser, allowing web apps to coexist with machines on true virtual networks.

It'd probably be lighter weight than WebRTC, which is IMHO an over-engineered nightmare. I'd like to see just the A/V encode/decode parts of WebRTC live on and the rest of it get deprecated in favor of web UDP and open-ended browser based P2P implementations. That's what should have happened, not a monolithic katamari ball of a standard.


I think there is some value in having a (relatively) small number of transport-level protocols with known semantics.

If every web page would effectively reinvent their own transport-level protocol* including subtly different connection handling, congestion behavior and drop/reorder tolerance, that sounds like it would make life a nightmare for all network intermediaries.

(* I know the transport-level protocol is technically UDP but that doesn't count, because all the practically relevant aspects are defined on top of it)

Also, ICE etc solves a real problem - that users of Web UDP would have to deal with as well. Why demand that everyone reinvents the solution for themselves if we can include a standard solution in the browsers for everyone?


I don't consider WebRTC over-engineered at all. It's not a trivial protocol, sure. But doing p2p sessions over today's Internet isn't trivial.

Yes, if you just consider a client-server model where the server has a public, routable, non-firewalled IP, then you can do away with ICE/STUN/TURN, and it gets simpler. But that's not what WebRTC was designed for.

The SDP stuff might seem a bit arcane, and I would agree with you, but it's a widely-used telephony industry standard and there are libraries that will generate/parse them for you without you needing to do much.

Bona fides: I've implemented server-side WebRTC in C++ and Java. Was it trivial? No. Was it ridiculously difficult? No.


It doesn't have to be as complex as the ICE/STUN/TURN stack but that's not really my point.

Building blocks and programmability are better for a long lived platform like the web than giant inflexible monoliths. UDP and web assembly lets you implement the stack you describe and anything else that comes down the pike.


The telephony argument for SDP was brought up a lot in the IETF. However, few SIP devices -- if any -- understand all the SDP extensions WebRTC has introduced. With RTP/RTCP the situation is even worse. You always need a gateway.


Any insight as to why this comment has been so heavily downvoted? I don't see anything that would justify it.


I don't know about other people, but it smells a bit to me like "this standard is too complicated, how hard can it be?", and then people implement their own thing and end up reinventing pretty much the entire standard, only in a nonstandard way.


Maybe you'd have more interest if you wrote up a proper RFC.


This is a collaborative effort. I act within my capabilities, but people with the relevant skillset are welcome to contribute a proper RFC.



