It made so much sense back then, when mobile apps were not that robust to network changes. I assumed it was going to get adopted in no time, given how much of a UX improvement it would have been back in the day.
It's incredibly depressing that this gained barely any traction in the last 10 years, and that kernel options are only appearing now, after everyone has wrapped their HTTP calls in multiple retry handlers, and mobile operating systems have abstracted network connectivity to the point where it feels more like you are using zeromq rather than tcp.
Multipath: There are several areas where TCP still has an advantage over QUIC. One of those is multipath support. Multipath TCP connections can send data on different network paths simultaneously — for example, sending via both WiFi and cellular data — to provide better throughput than either path permits individually.
Server connection migration is explicitly forbidden by QUIC:
It's on the standards track, rather than experimental, so likely to be supported once finished. There seem to be some implementations, including Apple:
I wanted to like it, and Apple included it in iOS, but supporting it on real servers was going to be too hard...
When I was deployed on FreeBSD with no load balancers, there weren't recent patches. And even if there were, I'd need to do some serious work to avoid advertising the private network ips as alternates...
When I was on Linux behind a load balancer, it was too complex to get the streams to the right place. And the load balancer didn't want to do it anyway.
Processing two streams together involves a lot of complexity in a high throughput code path. It's a lot of risk, and you've got to reboot for changes.
And then you do all that work and it only benefits iOS users, who tend to be on better networks anyway.
> A U.S. analysis of Wi-Fi and mobile Internet usage across unique smartphones on the iOS and Android platforms reveals that 71 percent of all unique iPhones used both mobile and Wi-Fi networks to connect to the Internet, while only 32 percent of unique Android mobile phones used both types of connections. A further analysis of this pattern of behavior in the U.K. shows consistent results, as 87 percent of unique iPhones used both mobile and Wi-Fi networks for web access compared to a lower 57 percent of Android phones.
Lol, have you ever been to Europe? iPhones are definitely considered premium and there definitely are networks that are more expensive but offer better reception. In Germany, that would be Telekom, in Switzerland, it's Swisscom.
Yes used to live in Germany. I was talking mainly about the US though.
iPhone isn't always 'premium', since they have their version of cheap phones as well. Point is cell network service quality is independent from phone quality.
It sounds like this would have taken off if it were added to various managed cloud load balancers based on what you're saying.
The only question I have is if it opens up a different can of worms even if you've got a magic box terminating layer 7 for you or not. Never dug deep enough into mptcp myself to know.
I think it's a no brainer if it's no effort or small effort (set a socket option on the client, somehow)... but it's a big effort to support it in a large load balancing situation.
If you balance your load balancers with ECMP, I don't know if you can get two client streams to the same mptcp terminating place.
If you've optimized the heck out of your tcp flows, this throws a wrench in there, because the second stream is likely to get hashed into a different nic queue, and then you have communication between cpus to move forward on the logical stream.
It would have been really handy though, and would have solved real issues for real users.
Edit to add: it could also solve some issues I saw on private / interserver networking, although the contention would be a much bigger problem on higher-bandwidth streams. On networks with link aggregation there are many paths from one host to another, but path selection is usually done by hashing the connection 5-tuple {src ip, dst ip, protocol, src port, dst port}, so a long-running TCP connection stays on the same path for its whole duration. If a path segment has high loss/corruption or is congested, MPTCP could help if you had an extra connection that hit a different path. Otherwise, you need to find the segment and get network operations to fix it. That's not easy to figure out (I had to write a tool to sample and find port combinations with trouble, and then a patch for mtr to run a trace with fixed ports), and you still need to reconnect your affected TCP sockets unless you can get a quick response from net ops. (Sometimes they can check error stats once the right devices are pointed out to them; replacing a cable/fiber often helps, and disconnecting it during the investigation lets the traffic flow across the redundant links.)
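To make the 5-tuple hashing concrete, here is a rough Python sketch of how a fixed tuple pins a flow to one path, and how sampling source ports shows which ports land on which path. The hash (SHA-256 modulo the path count), the function names, and the port range are all made up for illustration; real switches use their own hash functions and may include other fields.

```python
import hashlib


def ecmp_path(src_ip, dst_ip, proto, src_port, dst_port, n_paths):
    """Pick one of n_paths by hashing the 5-tuple (illustrative hash, not any vendor's)."""
    key = f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % n_paths


def ports_by_path(src_ip, dst_ip, dst_port, n_paths, port_range=range(40000, 40064)):
    """Group candidate source ports by the path they would hash onto."""
    groups = {}
    for port in port_range:
        path = ecmp_path(src_ip, dst_ip, 6, port, dst_port, n_paths)  # 6 = TCP
        groups.setdefault(path, []).append(port)
    return groups


if __name__ == "__main__":
    # Same tuple, same path, every time: a long-lived flow never moves.
    print(ecmp_path("10.0.0.1", "10.0.1.1", 6, 40000, 443, 4))
    # Different source ports spread over the paths, so a second connection
    # (or subflow) has a chance of avoiding a bad segment.
    for path, ports in sorted(ports_by_path("10.0.0.1", "10.0.1.1", 443, 4).items()):
        print(path, ports[:5])
```

The second function is roughly the shape of the "sample port combinations" approach: if connections from one group of ports see loss and the others don't, that points at one path.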
> If you balance your load balancers with ECMP, I don't know if you can get two client streams to the same mptcp terminating place.
At Google, we do something similar with QUIC and connection migration. Our mechanism for ensuring these hit the same backend is Maglev [0], where we use the QUIC connection ID for hashing purposes in software. (Our routers still mostly use ECMP based on the 5-tuple, so being able to consistently hash to the same backend across multiple LB instances is crucial.)
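A rough sketch of the idea, not the production implementation: the Python below builds a Maglev-style lookup table (offset/skip permutations over a prime-sized table, as described in the published Maglev paper) and then picks a backend from a connection ID instead of the 5-tuple. The hash function, the seeds, the table size, and the names `build_table` and `pick_backend` are assumptions for illustration.

```python
import hashlib


def _h(data: bytes, seed: bytes) -> int:
    """Seeded hash helper (SHA-256 is an arbitrary choice here)."""
    return int.from_bytes(hashlib.sha256(seed + data).digest()[:8], "big")


def build_table(backends, size=251):
    """Populate a Maglev-style lookup table; size must be prime and > len(backends)."""
    if not backends:
        raise ValueError("need at least one backend")
    offsets = [_h(b.encode(), b"offset") % size for b in backends]
    skips = [_h(b.encode(), b"skip") % (size - 1) + 1 for b in backends]
    next_idx = [0] * len(backends)
    table = [None] * size
    filled = 0
    while True:
        # Backends take turns claiming the next free slot in their own permutation.
        for i in range(len(backends)):
            slot = (offsets[i] + next_idx[i] * skips[i]) % size
            while table[slot] is not None:
                next_idx[i] += 1
                slot = (offsets[i] + next_idx[i] * skips[i]) % size
            table[slot] = backends[i]
            next_idx[i] += 1
            filled += 1
            if filled == size:
                return table


def pick_backend(table, connection_id: bytes):
    """Route on the connection ID only, so the client's address and port can change."""
    return table[_h(connection_id, b"lookup") % len(table)]


if __name__ == "__main__":
    backends = ["be-1", "be-2", "be-3"]
    # Two load balancer instances build the same table independently.
    lb_a, lb_b = build_table(backends), build_table(backends)
    cid = bytes.fromhex("a1b2c3d4e5f60718")
    print(pick_backend(lb_a, cid), pick_backend(lb_b, cid))
```

Because the table depends only on the backend list, any load balancer instance that ECMP happens to pick sends a given connection ID to the same backend, whatever 5-tuple the packet arrived with.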
> if a path segment has high loss/corruption or is congested, MPTCP could help if you had an extra connection that hit a different path.
Incidentally, we also have a family of internal mechanisms that do this, although we don't rely on MPTCP. (We instead twiddle some other bits in the packet that we make sure our routers use for hashing, at least for RPCs between prod machines.) This inspired some of the connection migration work in our QUIC implementation [1], wherein we can migrate to a different ephemeral port if we detect issues with the current path. This works shockingly often for routing around network problems.
> You might also be interested in SCTP[1] from the year 2000, which also hasn't gotten any traction so far.
Probably partly because middleware boxes (e.g., firewalls) either didn't/don't support it and/or rules were written to only support "TCP" (as opposed to 'stream') or "UDP" (as opposed to 'dgram'; see also "DCCP").
Certainly that's a part, but it didn't help that SCTP has some fundamental low-level flaws.
Given that TCP also has at least one unfixable flaw, the only recommendation I can make is to use something UDP-based - which, to make sure you don't stomp on everybody else's traffic, means use the only popular one: QUIC (the layer beneath HTTP/3).
The protocol is specified by a byte in the IP packet; how many middleware boxes block everything except for ICMP, TCP, and UDP? What is the probability that a packet with that byte set to something unexpected actually gets from source to destination?
The “funny” thing is that http3 really really looks like a transport protocol encapsulated into… UDP. Exactly because many middle boxes block anything that’s not a very well known protocol.
I was excited about it because we were working on delivery robots and I wanted a good solution for instant failover given 2 cellular modems.
We ended up going with PepLink's SpeedFusion to save engineering time. But the license was costly. I really hope for a free solution in the future for 2 cellular networks and <50ms failover.
Multipath UDP + OpenVPN would also probably be a viable solution.
what about something like this? two minipcie slots which i suppose you could put two cellular modems into. not sure what OS it runs though but presumably some flavor of linux.
I see it as depressing that this is gaining traction it doesn't deserve. TCP doesn't need one hack at a time, leaving us to choose combinations that sort of work in half the use cases of the modern world; it needs to be replaced with SCTP.
I don't know which makes me sadder-- IPv4 only having a 32-bit address space or TCP using the source and destination IP addresses in the connection tuple. That's one of those "if I had a time machine" kind of things-- I'd go back and have Cerf and Kahn change both of those items.
If TCP had a protocol-specific identifier for connections (a couple of 32-bit values, for example-- a client nonce and server nonce) rather than using the source/destination IP addresses, multi-homed hosts and seamless transition between different networks would become native features of the protocol. A client could roam between two different IP networks and TCP connections would "survive", for example. (I'm oversimplifying nearly to the point of hyperbole, to be sure...)
(Another fun future would have been one where SCTP got widespread adoption.)
> a client nonce and server nonce) rather than using the source/destination IP addresses multi-homed hosts and seamless transition between different networks would become native features of the protocol. A client could roam between two different IP networks and TCP connections would "survive", for example.
This is mostly how Mosh [1] works: it allows for IP roaming, changing IPs, etc. without losing one's SSH session. The connection can even be interrupted for a prolonged period of time and resume on its own on a new IP seamlessly.
How would routing be done without source/destination? When the device changes networks, how does the origin and all routers along the way know that this device is on a new network?
> How would routing be done without source/destination?
There is still a source/destination address. Routing still works. But those addresses are allowed to change without disrupting the connection because the connection isn't based on the values of these addresses.
> When the device changes networks, how does the origin and all routers along the way know
The routers don't need to "know" these things.
MPQUIC does this. To the network it's just UDP packets moving around. Connection state is dealt with at higher levels and doesn't rely on IP addresses.
> how does the origin and all routers along the way
It's just the origin that needs to know what address(es) it should be using as the destination at layer 3.
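A toy illustration of the difference, in Python: one demultiplexer keyed on the address/port pair the way TCP does it, and one keyed on a connection identifier (the hypothetical client/server nonce pair). The class names, the nonce values, and the addresses are invented for the example, and there is no authentication of the address change, which a real protocol would need.

```python
from dataclasses import dataclass, field


@dataclass
class Conn:
    conn_id: tuple      # (client_nonce, server_nonce), a hypothetical identifier
    peer_addr: tuple    # (ip, port) the origin currently uses as the L3/L4 destination
    received: list = field(default_factory=list)


class TupleDemux:
    """TCP-style lookup: the connection *is* the (source, destination) pair."""

    def __init__(self):
        self.conns = {}

    def open(self, src, dst):
        self.conns[(src, dst)] = []

    def deliver(self, src, dst, payload):
        buf = self.conns.get((src, dst))
        if buf is None:
            return False  # unknown tuple: the segment does not match any connection
        buf.append(payload)
        return True


class IdDemux:
    """Lookup by connection ID; addresses only say where to send the next packet."""

    def __init__(self):
        self.conns = {}

    def open(self, conn_id, src):
        self.conns[conn_id] = Conn(conn_id, src)

    def deliver(self, conn_id, src, payload):
        conn = self.conns.get(conn_id)
        if conn is None:
            return False
        conn.peer_addr = src  # the origin learns the new address from the packet itself
        conn.received.append(payload)
        return True


if __name__ == "__main__":
    server = ("198.51.100.1", 443)
    wifi, lte = ("192.0.2.10", 50000), ("203.0.113.7", 41000)

    tcp_like = TupleDemux()
    tcp_like.open(wifi, server)
    print(tcp_like.deliver(wifi, server, b"hello"))  # True
    print(tcp_like.deliver(lte, server, b"again"))   # False: the roamed client is a stranger

    id_based = IdDemux()
    id_based.open((0x1234, 0xABCD), wifi)
    print(id_based.deliver((0x1234, 0xABCD), lte, b"again"))  # True
    print(id_based.conns[(0x1234, 0xABCD)].peer_addr)         # ('203.0.113.7', 41000)
```

The routers in between never see any of this; they just forward packets by destination address as before.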
The big problem with this is that it depends upon things that weren't really feasible in the early 80's -- bigger packet headers, a bit more state on each side of the connection, potential need for cryptographic authentication.
* Where to send a frame to get to the other side of the connection
* Whose connection this is.
TCP combined the two, because we didn't have mobile clients or a lot of multihomed systems that would benefit from distinguishing them. Also, every octet in the header counted.
In practice, this means we have to keep building a lot of infrastructure on top of TCP (or parallel to it, in datagram protocols) to handle retries and splitting flows well. In turn, these things are completely opaque to the network and it's difficult to write rules about them.
Whereas if we had different packet fields for "where am I sending this packet right now?" and "whose flow does this belong to?", we could write better firewall rules, have less infrastructure built on top of TCP, and have better typical application performance.
But the stuff that carries TCP is IP. That's why TCP can work seamlessly, because it uses identification from a previous layer. Suppose I bind a server to an ID rather than IP:port: the operating system running it must still know how to communicate that via IP, so there will be a correlation map somewhere, and that map needs to be synchronized between all peers that wish to host the roaming server.
Otherwise you're just swapping the 16-bit port value for an arbitrary 32-bit identifier.
But... it doesn't? TCP has no notion of IP address in the protocol, only the port.
TCP with changing IPs can work e.g. on top of an ip-ip tunnel with applications not being aware at all.
> TCP has no notion of IP address in the protocol,
RFC793:
To allow for many processes within a single Host to use TCP
communication facilities simultaneously, the TCP provides a set of
addresses or ports within each host. **Concatenated with the network
and host addresses from the internet communication layer,** this forms
a socket. A pair of sockets uniquely identifies each connection.
That is, a socket may be simultaneously used in multiple
connections.
TCP uses the combination of L3 source address, L3 destination address, L4 destination port, L4 source port to identify what connection a frame is on. We're discussing how using that L3 information isn't necessarily ideal for today's world.
> TCP with changing IPs can work e.g. on top of an ip-ip tunnel with applications not being aware at all.
That's just because the IPs have not changed from its point of view: it receives the same frame with the same destination/source IP addresses the entire time.
Part of the reason why we need things like IP-IP tunnels is because L4 connections can't "move" with TCP. In scenarios where we're using tunneling for this, we're accepting worse performance than if we could just directly send TCP to its true destination and have it processed.
So you want to implement persistent connections on L4 without implementing persistent addresses on L3 first?
This doesn't make much sense to me. The hardest problem here is not assigning uuids to pipes, it's the routing/mapping of the "true destination".
- If you manage to solve it on L3, ip-ip tunnels or not — you have it, TCP works unmodified and so does UDP and everything else including quic and http/3.
- If you didn't solve it, then support for persistent connections in TCP is useless.
In other words, I don't see what a "transmission control protocol" has to do with it. It's very reasonable to assume that addresses are already figured out when designing transmission control, and that's exactly what TCP did.
> - If you didn't solve it, then support for persistent connections in TCP is useless.
SCTP and multipath TCP (which is what we're talking about) already do pretty much this. Assuming that endpoints to a stream connection have single, unchanging network addresses isn't a reasonable assumption anymore. But we're stuck with the assumption that hosts won't move in one of our most common protocols.
In the OSI model, you got similar functionality up at layer 5, but TCP only handles the connection/disconnection aspect of the session layer. In the internet world, we have a bunch of haphazard sets of retries, session balancing, multihoming and reconnecting behavior that are protocol specific (and completely missing from many well-used protocols) kludged on top. (Actually arguably MP-TCP is a session layer on top of TCP).
The only way you solve this on layer 3 is to build some kind of messy overlay network, because addresses have no real relation to where things are anymore. And we know that overlay networks are suboptimal and inefficient. Solving it at layer 4 doesn't have to be (but it's too late for that now).
The protocol would have to handle binding the network to the transport. MPTCP and SCTP both handle that by registering and unregistering network-layer endpoints. This parallel-universe TCP would be the same in that regard.
The problem is that the TCP/IP model stops at level 4, and if we consider TCP a protocol of transport, it shouldn't do that.
In the OSI model, what you are talking about is level 5, that is, session. But in TCP/IP there is no such level, so it must be handled by the application (e.g. through a session cookie, in HTTP).
Slavish adherence to theoretical models is a recipe for failure. Even worse, the OSI model was developed in the 1970s before successful internetworks existed so it's not informed by experience; it's mostly made up.
I recently bought a property where I cannot get a full fibre connection, but I can get 150-400 Mbps using 5G. I've been thinking about using dual 5G connections and tunneling my traffic via mptcp to a VPS to aggregate the connections.
I got fiber run to my neighborhood, and for a while, had a 1gb coax connection and a 1gb fiber connection. I used openmptcprouter to aggregate my connections through a droplet and I effectively had a 2 gigabit internet connection. I would have stuck with it, but having a datacenter IP for your home network really doesn’t work.
Except TCP is just a bad protocol to start with for tunnelling, because packetized data has to be delivered in-order, and head of line blocking messes up congestion control algorithms in the tunnelled data.
Why does this require explicit opt in by applications if there’s transparent fallback? Wouldn’t it make most sense for the kernel to do it transparently for every TCP connection so that it can make more global decisions about path aggregation / link preference?
My understanding is that it was basically a condition enforced by the maintainers of the Linux TCP / networking subsystems. If you look at the initial upstreaming discussions[1], this was set up as a ground rule.
If you look at the older multipath TCP implementation, prior to the upstreaming, it was intended to be fully transparent to the application, which I think makes more sense for the intent of the protocol. Sure, in many cases MPTCP may be better with application-guided logic, but having a standard system approach (e.g. establish sub-flows on an LTE connection for automatic failover, but don't send any data along those sub-flows) would have worked for 95% of cases.
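For reference, the opt-in on upstream Linux is a protocol number passed at socket creation rather than something applied to an existing TCP socket. Here is a minimal Python sketch of what that looks like from the application side, with a fallback to plain TCP. `open_stream_socket` is a made-up helper name; `IPPROTO_MPTCP` is 262 on Linux and is exposed as `socket.IPPROTO_MPTCP` in Python 3.10+ on that platform.

```python
import socket

# Linux protocol number for MPTCP; use the stdlib constant when it exists.
IPPROTO_MPTCP = getattr(socket, "IPPROTO_MPTCP", 262)


def open_stream_socket(family=socket.AF_INET):
    """Return (sock, is_mptcp): try an MPTCP socket first, fall back to plain TCP."""
    try:
        sock = socket.socket(family, socket.SOCK_STREAM, IPPROTO_MPTCP)
        return sock, True
    except OSError:
        # Kernel built without MPTCP, MPTCP disabled via sysctl, or a non-Linux OS.
        sock = socket.socket(family, socket.SOCK_STREAM, socket.IPPROTO_TCP)
        return sock, False


if __name__ == "__main__":
    sock, is_mptcp = open_stream_socket()
    print("MPTCP socket" if is_mptcp else "plain TCP socket")
    sock.close()
```

Getting an MPTCP socket only means the kernel will offer MPTCP in the handshake. Whether the connection actually ends up multipath still depends on the peer and on any middleboxes, and the rest of the socket API (connect, send, recv) is used exactly as with TCP.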
Using this implies that there are multiple IPs per endpoint associated with a single TCP connection. That is going to need explicit support/awareness by the application in many cases.
The only practical use of MPTCP for me is to use mobile and Wi-Fi network together to boost the speed. iOS and WeChat both support this. However, I always turn them off because my mobile network is metered. So in the end, MPTCP is useless for me *personally*.
I work supporting, debugging, fixing the Linux network stack and drivers. I am amazed how little adoption this has seen.
Like everything which came along and tried to supplant regular TCP, such as SCTP, it seems MPTCP has also been confined to a niche of application developers who will use it forever while the rest of the world forgets about it.
Similarly, I promise you everything you've ever done over a 4G or 5G phone has used SCTP inside the phone provider. That doesn't mean the general developer population knows about it. I bet most developers have never even heard of it.
I worked on IMS before it was widely rolled out and at least back then it was just SIP over TCP and UDP (RTP/RTCP) for media. SCTP was widely used for eNodeB - MME comms tho iirc
I found [1] which describes the architectural difference between MPTCP and QUIC, and also introduces the authors' proposed MPQUIC protocol:
> QUIC multiplexes application streams on a single UDP flow, whereas MPTCP splits a single stream on multiple TCP subflows. MPQUIC combines both features by multiplexing application streams on multiple UDP subflows.
Note that MPQUIC is still being discussed at the IETF. At the last IETF meeting, more changes were discussed. Unfortunately, that slows down its adoption. https://lwn.net/Articles/964377/
But both try to achieve the same goal. Technically, you can get very similar behaviour. MPTCP is implemented in the Linux kernel, while QUIC lives in userspace.
Before we finalized the specs we did a lot of testing to make sure that enabling MPTCP wouldn't break connectivity. Either it would be passed correctly, or it would fall back to single-path TCP safely. Generally, so long as a middlebox passes through unknown options unmodified, and does not try to enforce that the TCP sequence space it sees is contiguous, MPTCP should work through that middlebox.
If you're interested, we wrote a couple of papers on this:
For example, the Great Firewall of China: if you can split your traffic across multiple uplink channels, will the firewall have a hard time putting it back together for enforcement?
The examples given on the page seem to focus on multipath to get to a device over the internet, but I can see this being more likely to work properly, without needing to fall back, on home networks.
At home/LAN we use LACP, VRRP... I mean, link aggregation and HA needs were solved a long time ago.
With multiple ISPs, or on a complex enough LAN, we can use multiple routing tables + weights too.
Also, if the ISP at home can do 10Gbps, 1Gbps, 300 Mbps whatever... I want to be able to use them with a single path, so there is no gain using multiple paths. And when I have cable+wifi connected at the same time, I usually force one of the two; I cannot see a reason to prefer using both at the same time.
Maybe the latency thing? Never had that issue at home, but I could understand the use case "just use the network segment with less latency to reach $thing".
> Also, if the ISP at home can do 10Gbps, 1Gbps, 300 Mbps whatever... I want to be able to use them with a single path, so there is no gain using multiple paths. And when I have cable+wifi connected at the same time, I usually force one of the two; I cannot see a reason to prefer using both at the same time.
I don't understand why you would want to be able to use them with a single path. The gain would be being able to aggregate them and have individual TCP streams faster than any one IP connection could handle.
Though personally I think the resilience is more appealing. Not having to have a hard cutover when wifi degrades as I walk away would be nice
If my ISP gives me 10 Gbps, I want my PC to have (at least) a 10Gbps single path to the router.
So if I already have a 10 Gbps path to the router, I don't want a flaky 300 Mbps wireless path added on my way to the router.
In the context of the parent (home networks), I think most people have two paths: WiFi or RJ45/UTP. And with that multipath setup (WiFi + RJ45; I don't get why other comments are talking about cellular networks "at home") it is not usual to walk away. Sure, you could walk as far as the RJ45 cable is long, but...
To keep HA on WiFi when walking around, there are other technologies more battle tested than MPTCP.
For a long time enterprise firewalls (and more recently SD-WAN) allowed load balancing between different links, but unlike MPTCP the traffic of a single TCP connection is not split up. This is in line with the established network admin wisdom saying that reordering packets of a TCP connection hurts performance.
Some ISPs in Europe are using MPTCP for people who are too far from the street cabinets, typically people in the countryside with < 50 Mbps. Thanks to a transparent proxy installed in the home gateway, and servers in the ISP's network, they can combine both the fixed and cellular networks, and prioritize the fixed one.
MPTCP can also be very interesting for mobility use-cases, even when one network is used at a time, e.g. switching from WiFi to cellular, or different cellular networks in the train, etc.
We found that most proxies/firewalls (90%+ ? I forget) didn't tamper with it. The largest hurdle was working with load balancer vendors to implement it.