The point of moving it to user space is that you can offload a lot of the jobs of the TCP/IP stack directly to hardware. In some cases this dramatically speeds up your I/O.
Consider, for example, that the dominant costs are things like demultiplexing and security checks. If you choose to implement multiplexing with virtual network cards then you get true zero-copy multiplexing, which is much faster than the software equivalent. And many of the security checks can be eliminated by using some combination of packet filters and logical disks. (The security, BTW, seems to be one big difference from RDMA, which might be an alternative, but I'm not really an expert.)
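To make the copy vs. zero-copy distinction concrete, here is a toy sketch. It is not from the paper: the 2-byte "destination port" header and both function names are made up for illustration, and a Python memoryview only stands in for what a virtual NIC would do by placing data directly into the owning application's buffers.

```python
import struct
from collections import defaultdict

# Toy frame format (an assumption for this sketch): 2-byte big-endian
# destination port followed by the payload.
HEADER = struct.Struct("!H")

def software_demux(frames):
    """Software-style demux: read the header, then copy the payload
    into the queue of whoever owns that port."""
    queues = defaultdict(list)
    for frame in frames:
        (port,) = HEADER.unpack_from(frame, 0)
        queues[port].append(bytes(frame[HEADER.size:]))  # per-packet copy
    return queues

def zero_copy_demux(frames):
    """Stand-in for a virtual NIC: the owner gets a view onto the
    receive buffer itself, so no payload bytes are copied."""
    queues = defaultdict(list)
    for frame in frames:
        (port,) = HEADER.unpack_from(frame, 0)
        queues[port].append(memoryview(frame)[HEADER.size:])
    return queues

frames = [
    bytearray(HEADER.pack(80) + b"GET /"),
    bytearray(HEADER.pack(6379) + b"PING"),
]
copied = software_demux(frames)
views = zero_copy_demux(frames)
```

The views share memory with the receive buffers, which is where the speed comes from and also why the security checks have to move somewhere else (e.g. into packet filters).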
Some things can't be offloaded to the hardware, like naming and access control. But that's fine.
(NB, I'm not arguing for this paper's position necessarily, I just thought it was interesting, and the motivation was good enough to start me thinking about how I might get around the kernel.)
The life of a data frame has become pretty complicated. Along the way it may pass through one or more layers of virtualization, one or more layers of network-specific mangling like iptables doing filtering and NAT, and a combo of the two in OS-based virtualized networks connecting VMs and containers both intra- and inter-host.
Pushing more of the stack into hardware is probably a good idea for single-tenant datacenters that can deploy a lot of e.g. Redis appliances, but those of us just renting capacity in the cloud are going to suffer from Amdahl's Law if only the part of the system adjacent to real hardware NICs can be accelerated.
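For a rough feel of the Amdahl's Law point, here is a minimal sketch. The fractions and the 10x factor are made-up numbers, not measurements: the assumption is that on bare metal most of the per-packet time sits next to the NIC, while in a virtualized cloud path only a small slice does.

```python
def amdahl_speedup(accelerated_fraction: float, factor: float) -> float:
    """Overall speedup when only `accelerated_fraction` of the total time
    gets `factor` times faster and the rest is unchanged (Amdahl's Law)."""
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if factor <= 0:
        raise ValueError("factor must be positive")
    return 1.0 / ((1.0 - accelerated_fraction) + accelerated_fraction / factor)

# Hypothetical scenarios: share of per-packet time that the hardware
# offload can actually touch.
scenarios = {"single-tenant appliance": 0.9, "rented cloud VM": 0.3}
for name, fraction in scenarios.items():
    print(f"{name}: {amdahl_speedup(fraction, 10.0):.2f}x overall")
```

With those made-up numbers, a 10x faster NIC-adjacent path gives about 5.3x overall in the first case but only about 1.4x in the second, and no amount of hardware gets the second case past 1/0.7, roughly 1.43x.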