I really need to get my head around some of these things. I've also been meaning to look into https://www.kernel.org/doc/html/latest/networking/tproxy.htm... (with regard to udp(quic)/ztunnel-esque behaviour). This will serve as a great reference, thanks.
Come on HN. Do better. Silly conversations about the number of lines? How about any C experts and networking pros weigh in on this? Has anyone built it and tested it?
Among the many projects I've touched, one was a database engine that we planned to move to io_uring as soon as it became available to us (it wasn't distributed outside the company, so it was only a question of internal infrastructure).
PREEMPT_RT also includes a lot of possible low-latency work.
io_uring also provides ridiculous performance benefits for normal server code, too.
Context switches are expensive, often more expensive than operating on the data itself. io_uring provides a major reduction, or even near elimination, of context switches for I/O. It is a big win in a lot of server contexts in terms of packets, IOPS and throughput.
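To put rough numbers on the syscall side of this, here is a back-of-the-envelope model, not a benchmark. The function names and the counting rules are my own assumptions: a readiness loop pays one poll() per `ready_per_poll` ready sockets plus one recv() or send() per operation, while a submission queue pays one submit-and-wait call (io_uring_enter in io_uring's case) per batch of queued operations. Syscalls are not the same thing as context switches, so treat this only as a proxy for user/kernel transitions.

```python
import math


def syscalls_readiness(n_ops: int, ready_per_poll: int = 1) -> int:
    """Rough syscall count for a poll()-style loop.

    Assumed model: each poll() call reports `ready_per_poll` ready sockets,
    and every I/O operation then costs one recv() or send() call.
    """
    polls = math.ceil(n_ops / ready_per_poll)
    return polls + n_ops


def syscalls_batched(n_ops: int, batch: int = 32) -> int:
    """Rough syscall count when operations are queued and submitted in batches.

    Assumed model: one submit-and-wait call per `batch` queued operations.
    """
    return math.ceil(n_ops / batch)
```

Under those assumptions, `syscalls_readiness(2000)` gives 4000 and `syscalls_batched(2000, 32)` gives 63. With IORING_SETUP_SQPOLL a kernel thread polls the submission queue, so even the submit calls can mostly disappear. Real gains depend heavily on the workload.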
Initial motivation was lack of non-blocking SPI I/O in Linux user space. Then it made sense to use it for everything, it’s a straightforward mapping to Swift’s structured concurrency, so why not?
libblkio, used by qemu for some block storage, uses it. Independently, qemu also uses io_uring for some Linux file operations.
My experience with io_uring is that it requires a fairly fundamental change in the way you architect software, which makes it hard to retrofit to existing code (e.g. code using poll). If you have a greenfield project that requires very high performance and will only or mainly run on Linux, then you should strongly consider it. I'd actually love to hear people's thoughts on whether there is a good way to retrofit code.
Were any of these vulnerabilities disclosed? I saw that Google blocked it in some places, but without fully understanding what these vulnerabilities are, it's hard to make a good judgement on whether it is usable or not in a given application.
There are practical backward compatibility and security issues currently that prevent widespread deployment. I’ve been leaning into io_uring going forward because it is clearly the future and brings a lot to the table. It still can’t be used in many environments but I expect it to be mainstream in 2-3 years. Niche, but not for much longer.
Most software doesn’t properly use the old io_submit facility and that has been around for decades, despite the large performance improvements possible. Even when io_uring is widely deployable, most software won’t use it.
This only applies to Linux distro packages though, right? Servers, embedded, mobile, etc. only target specific kernels, so they can avoid the backwards compatibility issues and in many cases have all of the latest security patches.
Am I wrong? Just noticed the downvote. That is my understanding of the old AIO kernel API anyway. The original io_uring paper describes some of the deficiencies of AIO. And the fact that the API is broken on non-O_DIRECT regular file descriptions is well known.
io_uring has produced many security problems. Google has found it necessary to either completely forgo io_uring or severely limit its use to trusted code[1]. This policy includes Chrome OS, Android and Google Cloud. Docker has also blocked io_uring syscalls (by default) from being called by containers for the same reason and, although they've flip-flopped on this at least once, I believe io_uring is currently blocked.
This is highly disappointing to me, but also not surprising given the nature of io_uring. Solving it will require more design rigor, probably some performance compromises and possibly some degree of formal validation.
You’ve written the same message, almost verbatim multiple times in this thread.
Google also writes applications in golang, maintains their own version of the Linux kernel and runs a browser monopoly. What works/doesn’t work for them shouldn’t be treated as the only standard worth considering. FWIW, fb makes plentiful use of it as I understand.
Docker doesn’t block IO_URING syscalls, you just have to be able to lock memory.
Clearly you have an axe to grind about io_uring, which is fine, I guess? But you’re really going out of your way to make it seem like it’s the worst thing ever, which is a bit extra.
> You’ve written the same message, almost verbatim multiple times in this thread.
Until now I had written exactly one message in this thread. If others have made similar statements or cited the same link it's down to the fact that some of us care a great deal and are enthusiastic about io_uring, so this relatively recent information is top of mind.
> Docker doesn’t block IO_URING syscalls
"Update RuntimeDefault seccomp profile to disallow io_uring related syscalls"
from containerd-2.0.0 beta2 release[1], 3 weeks ago... (bear in mind there is considerable history here.)
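If you want to check what your own runtime actually does rather than argue about it, something like this should tell you. It's a sketch with a few assumptions: Linux only, syscall number 425 for io_uring_setup (true on x86-64 and arm64; check your architecture), and the helper names `classify` and `probe_io_uring` are mine. It calls io_uring_setup with a NULL params pointer on purpose, so no ring is ever created. As far as I know, a reachable syscall answers EFAULT, while a seccomp profile or the kernel.io_uring_disabled sysctl typically answers EPERM.

```python
import ctypes
import errno
import os

# Assumption: io_uring_setup is syscall 425 (x86-64, arm64).
NR_IO_URING_SETUP = 425


def classify(err: int) -> str:
    """Map the errno from a deliberately invalid io_uring_setup call to a verdict."""
    if err == errno.EFAULT:
        return "reachable"    # the kernel got far enough to read our NULL pointer
    if err == errno.EPERM:
        return "blocked"      # typically a seccomp filter or kernel.io_uring_disabled
    if err == errno.ENOSYS:
        return "unavailable"  # kernel too old, or a filter that answers ENOSYS
    return "unknown (" + errno.errorcode.get(err, str(err)) + ")"


def probe_io_uring() -> str:
    """Issue io_uring_setup(1, NULL) through libc's syscall() and classify the result."""
    libc = ctypes.CDLL(None, use_errno=True)
    libc.syscall.restype = ctypes.c_long
    ret = libc.syscall(
        ctypes.c_long(NR_IO_URING_SETUP),
        ctypes.c_uint(1),          # entries
        ctypes.c_void_p(None),     # params = NULL, so setup cannot succeed
    )
    if ret >= 0:                   # not expected with NULL params; close just in case
        os.close(ret)
        return "reachable"
    return classify(ctypes.get_errno())
```

Run `print(probe_io_uring())` on the host and again inside the container. If the two answers differ, the container profile is what's filtering the call.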
The security problems for io_uring or for any other system calls exist only when an adversary has access to those system calls.
Whenever you write an application for internal use, where there is no possibility for an adversary to interfere and control in any way the arguments of the system calls, there is no reason to avoid io_uring or other risky system calls.
Defense in depth through sandboxing is there to make sure that even an attacker who gets some amount of undue privilege will not be able to do harm. At this point we've learned that all existing systems will have vulnerabilities of some sort, so defense in depth is the only credible security posture.
“Faster than possible” is a phrase I picked up an awfully long time ago. I hope someone finds a reasonable middle ground that’s still several times faster but not Swiss cheese.
There's something super weird going on with those line numbers. They don't align for me either (Firefox on Windows 11). In the web inspector, the line numbers are rendering with Courier New as the font, whereas the code itself is `monospace`.
The weird thing is both are controlled by a single CSS rule,
Great sleuthing, the missing piece of the puzzle is that the file contents are inside a <code> element while the line numbers are not, and <code> elements have a default font so they don't inherit the font from their parent element. Changing the selector to the following fixes the issue:
div#cgit pre, div#cgit code { ... }
(The buggy CSS is not present in the official cgit repository, so I assume the owner of kernel.dk is running a patched version of cgit.)
and yeah, this is why you don't abuse side-by-side divs ... actually, that's a table ... and instead just make your line numbers unselectable (there are several ways to do this).
Also, you most likely are a liar. Just the matrix tables you need for MPEG-1 are 500+ lines.
Unless of course you used 1100 lines to call ffmpeg. Then you don't need the tables.
But that being said: Yes, writing a TCP proxy in 200 lines is indeed possible. But it will use standard socket calls. You might want to read up about what io_uring is all about, and why it's worth going the extra mile.
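For reference, the "standard socket calls" version of the core relay loop looks roughly like this. It's a minimal sketch, not anyone's actual proxy: the function name `relay` is mine, the sockets are assumed to be already connected and blocking, and it stops at the first EOF instead of handling half-close properly. Every forwarded chunk costs a select, a recv and a send, which is exactly the per-operation overhead io_uring is meant to cut down.

```python
import selectors
import socket


def relay(a: socket.socket, b: socket.socket, bufsize: int = 65536) -> int:
    """Copy bytes in both directions between two connected sockets.

    Returns the total number of bytes forwarded once either side closes.
    """
    sel = selectors.DefaultSelector()
    sel.register(a, selectors.EVENT_READ, b)  # data = peer to forward to
    sel.register(b, selectors.EVENT_READ, a)
    total = 0
    try:
        while True:
            for key, _ in sel.select():       # readiness notification
                data = key.fileobj.recv(bufsize)
                if not data:                  # EOF from one side: stop relaying
                    return total
                key.data.sendall(data)        # blocks if the peer is slow
                total += len(data)
    finally:
        sel.close()
```

A real proxy would add listen/accept, non-blocking writes with buffering, and error handling, which is where the other 150 or so lines go.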
So your 200-line proxy... I wonder how it performs compared to this. Since LOC is a silly measure of anything, let's use it further: how many bytes/s/sloc does yours achieve compared to the io_uring one?