Basic proxy implementation using io_uring

bjconlan · 2024-02-18T01:09:54 1708218594

I really need to get my head across some of these things. I've been meaning to also look into https://www.kernel.org/doc/html/latest/networking/tproxy.htm... (in regards to udp(quic)/ztunnel-esque behaviour). This will serve as a great reference, thanks.

gigatexal · 2024-02-18T06:26:24 1708237584

Come on HN. Do better. Silly conversations about the number of lines? How about any C experts and networking pros weigh in on this? Has anyone built it and tested it?

SPascareli13 · 2024-02-18T04:05:31 1708229131

So, is io_uring used anywhere yet?

lukeh · 2024-02-18T07:16:58 1708240618

I’m using it for all I/O (file, socket, SPI and UART) in an embedded application written in Swift. Wrappers here:

https://github.com/PADL/IORingSwift

MrBuddyCasino · 2024-02-18T11:35:46 1708256146

What type of app can afford to run on Linux (not bare metal) but still needs such low level performance optimizations?

surajrmal · 2024-02-18T12:34:32 1708259672

Lowering CPU usage can save battery. Just because you are running Linux doesn't mean you don't have a power budget.

p_l · 2024-02-18T12:50:06 1708260606

Among many other projects I touched, we had a database engine which planned to move to io_uring as soon as it could be made available (it wasn't distributed outside company, so it was a question of internal infrastructure)

PREEMPT_RT also includes a lot of possible low-latency work.

io_uring also provides ridiculous performance benefits for normal server code, too.

sophacles · 2024-02-18T20:00:39 1708286439

Context switches are expensive, often more expensive than operating on the data. io_uring provides a major reduction, or even near elimination, of context switches for I/O. It provides a big win in a lot of server contexts in terms of packets/IOPS/throughput.
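
To make the batching idea concrete, here is a toy model in C. It is only a sketch of the general shape (requests are queued in a ring in user space, then handed over in one go) and is not the real io_uring ABI; `toy_ring`, `toy_queue`, and `toy_submit` are hypothetical names invented for illustration, and the "kernel crossing" is just a counter.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: NOT the real io_uring ABI. All names are hypothetical. */
#define RING_SIZE 8u /* power of two, so index wrap-around stays consistent */

struct toy_request {
    int fd;
    size_t len;
};

struct toy_ring {
    struct toy_request slots[RING_SIZE];
    unsigned head;           /* next slot the "kernel" side will consume */
    unsigned tail;           /* next slot the application will fill */
    unsigned kernel_entries; /* how many simulated kernel crossings happened */
};

/* Queue one request purely in user space; no crossing is counted here.
 * Returns 0 on success, -1 if the ring is full. */
int toy_queue(struct toy_ring *r, int fd, size_t len)
{
    assert(r != NULL);
    if (r->tail - r->head == RING_SIZE)
        return -1;
    r->slots[r->tail % RING_SIZE] = (struct toy_request){ fd, len };
    r->tail++;
    return 0;
}

/* Stand-in for a single submit call: consumes everything queued so far
 * and counts exactly one crossing for the whole batch. */
unsigned toy_submit(struct toy_ring *r)
{
    assert(r != NULL);
    unsigned consumed = r->tail - r->head;
    r->head = r->tail;
    r->kernel_entries++;
    return consumed;
}
```

Queueing three requests and calling `toy_submit` once hands over all three with a single simulated crossing, whereas a read/write loop would pay at least one crossing per request.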

lukeh · 2024-02-19T08:13:01 1708330381

Initial motivation was lack of non-blocking SPI I/O in Linux user space. Then it made sense to use it for everything, it’s a straightforward mapping to Swift’s structured concurrency, so why not?

dankebitte · 2024-02-18T04:40:01 1708231201

Anything using glommio [1].

[1] https://crates.io/crates/glommio/reverse_dependencies

rwmj · 2024-02-18T11:31:35 1708255895

libblkio, used by qemu for some block storage, uses it. Independently, qemu also uses io_uring for some Linux file operations.

My experience with io_uring is that it requires a fairly fundamental change in the way you architect software, which makes it hard to retrofit it to existing code (e.g. code using poll). If you have a greenfield project that requires very high performance and will only/mainly run on Linux then you should strongly consider it. I'd actually love to hear people's thoughts on whether there is a good way to retrofit code.

sbstp · 2024-02-18T06:16:39 1708236999

I think there's still a lot of concerns related to security https://www.phoronix.com/news/Google-Restricting-IO_uring

SPascareli13 · 2024-02-18T20:55:37 1708289737

Were any of these vulnerabilities disclosed? I saw that Google blocked it in some places, but without fully understanding what these vulnerabilities are, it's hard to make a good judgement on whether it is usable or not in some application.

jandrewrogers · 2024-02-18T07:13:13 1708240393

There are practical backward compatibility and security issues currently that prevent widespread deployment. I’ve been leaning into io_uring going forward because it is clearly the future and brings a lot to the table. It still can’t be used in many environments but I expect it to be mainstream in 2-3 years. Niche, but not for much longer.

Most software doesn’t properly use the old io_submit facility and that has been around for decades, despite the large performance improvements possible. Even when io_uring is widely deployable, most software won’t use it.

surajrmal · 2024-02-18T12:39:41 1708259981

This only applies to Linux distro packages though, right? Servers, embedded, mobile, etc. only target specific kernels and so can avoid the backwards compatibility issues and in many cases have all of the latest security patches.

p_l · 2024-02-18T12:52:20 1708260740

Yes.

But sometimes it means that what you have is software that can't run on recent enough kernel to have io_uring (^_^;;)

IAmLiterallyAB · 2024-02-18T15:40:32 1708270832

The old aio API is very inefficient and has awful quirks, such as not working on regular files opened without O_DIRECT. Not surprised no one used it

IAmLiterallyAB · 2024-02-19T00:40:54 1708303254

Am I wrong? Just noticed the downvote. That is my understanding of the old AIO kernel API anyway. The original io_uring paper describes some of the deficiencies of AIO. And the fact that the API is broken on non-O_DIRECT regular file descriptions is well known.

loeg · 2024-02-19T16:53:41 1708361621

You're not wrong.

jakewins · 2024-02-18T11:27:48 1708255668

What are the open security issues?

i_am_a_peasant · 2024-02-18T05:01:26 1708232486

I used it for some high datarate record applications at work.

Does the job nicely without even half the boilerplate of aio if you use liburing.

loeg · 2024-02-18T09:11:30 1708247490

My team uses it as part of a high performance storage system at work (but only for disk IO).

lukeh · 2024-02-18T02:14:55 1708222495

io_uring is wonderful.

topspin · 2024-02-18T07:30:42 1708241442

io_uring has produced many security problems. Google has found it necessary to either completely forego io_uring or severely limit its use to trusted code[1]. This policy includes Chrome OS, Android and Google Cloud. Docker has also blocked io_uring syscalls (by default) from being called by containers for the same reason and, although they've flip flopped on this at least once, I believe io_uring is currently blocked.

This is highly disappointing to me, but also not surprising given the nature of io_uring. Solving it will require more design rigor, probably some performance compromises and possibly some degree of formal validation.

[1] https://www.phoronix.com/news/Google-Restricting-IO_uring

FridgeSeal · 2024-02-18T10:40:52 1708252852

You’ve written the same message, almost verbatim multiple times in this thread.

Google also writes applications in golang, maintains their own version of the Linux kernel and runs a browser monopoly. What works/doesn’t work for them shouldn’t be treated as the only standard worth considering. FWIW, fb makes plentiful use of it as I understand.

Docker doesn’t block IO_URING syscalls, you just have to be able to lock memory.

Clearly you have an axe to grind about io_uring, which is fine, I guess? But you’re really going out of your way to make it seem like it’s the worst thing ever, which is a bit extra.

topspin · 2024-02-18T12:08:32 1708258112

> You’ve written the same message, almost verbatim multiple times in this thread.

Until now I had written exactly one message in this thread. If others have made similar statements or cited the same link it's down to the fact that some of us care a great deal and are enthusiastic about io_uring, so this relatively recent information is top of mind.

> Docker doesn’t block IO_URING syscalls

"Update RuntimeDefault seccomp profile to disallow io_uring related syscalls" from containerd-2.0.0 beta2 release[1], 3 weeks ago... (bear in mind there is considerable history here.)

What a bizarre reply.

[1] https://github.com/containerd/containerd/releases/tag/v2.0.0...

mikestew · 2024-02-18T18:34:52 1708281292

> You’ve written the same message, almost verbatim multiple times in this thread

Comment history says otherwise. If you’re going to make ad hominems, at least make ‘em accurate.

surajrmal · 2024-02-18T12:44:21 1708260261

Does Facebook have the same threat model? They don't run a public cloud nor host enterprise data.

adrian_b · 2024-02-18T10:24:45 1708251885

The security problems for io_uring or for any other system calls exist only when an adversary has access to those system calls.

Whenever you write an application for internal use, where there is no possibility for an adversary to interfere and control in any way the arguments of the system calls, there is no reason to avoid io_uring or other risky system calls.

littlestymaar · 2024-02-18T11:18:52 1708255132

Defense in depth through sandboxing is there to make sure that even an attacker that gets some amount of undue privilege will not be able to do harm. At this point we've learned that all existing systems will have vulnerabilities of some sort, so defense in depth is the only credible security posture.

hinkley · 2024-02-18T08:11:47 1708243907

“Faster than possible” is a phrase I picked up an awfully long time ago. I hope someone finds a reasonable middle ground that’s still several times faster but not Swiss cheese.

secondcoming · 2024-02-18T13:05:29 1708261529

I just checked one of our GCP VMs, and it seems it's supported there:

    # grep -i uring /boot/config-$(uname -r)
    CONFIG_IO_URING=y
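
The config check above only shows that the kernel was built with io_uring. Whether a given process may actually use it can still depend on a seccomp profile or on administrative policy (recent kernels also have a `kernel.io_uring_disabled` sysctl). A rough runtime probe, as a sketch: call `io_uring_setup` with deliberately invalid arguments and look at the error code. The mapping from errno to verdict below is an assumption about common behaviour, not a guarantee (a seccomp filter can return whatever code it likes), and the enum and function names are made up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef __NR_io_uring_setup
/* Fallback for old headers: 425 on x86-64 and arm64; other architectures
 * may differ, so treat this value as an assumption. */
#define __NR_io_uring_setup 425
#endif

enum uring_status { URING_AVAILABLE, URING_MISSING, URING_BLOCKED, URING_UNKNOWN };

/* Map the errno of a deliberately invalid io_uring_setup() call to a verdict. */
enum uring_status classify_uring_errno(int err)
{
    switch (err) {
    case EFAULT: /* kernel got as far as reading the arguments */
    case EINVAL:
        return URING_AVAILABLE;
    case ENOSYS: /* no such syscall: old kernel, CONFIG_IO_URING unset,
                    or a filter that chooses to answer ENOSYS */
        return URING_MISSING;
    case EPERM:  /* typical of a seccomp profile or a disabling sysctl */
    case EACCES:
        return URING_BLOCKED;
    default:
        return URING_UNKNOWN;
    }
}

enum uring_status probe_uring(void)
{
    /* entries = 0 with a NULL params pointer should never succeed, so no
     * ring is created; only the error code is of interest. */
    long rc = syscall(__NR_io_uring_setup, 0u, NULL);
    if (rc >= 0) { /* not expected; close the descriptor defensively */
        close((int)rc);
        return URING_AVAILABLE;
    }
    assert(errno != 0);
    return classify_uring_errno(errno);
}
```

Calling `probe_uring()` once at startup and falling back to poll/epoll on anything other than `URING_AVAILABLE` is one way to keep a single binary working both inside and outside restricted containers.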

topspin · 2024-02-18T13:51:06 1708264266

(message #3 attributable to me, since someone appears to be counting...)

"Google production servers" is what I should have written. io_uring on your GCP VMs has no security implications for Google itself.

CoolCold · 2024-02-18T03:50:13 1708228213

It would be interesting to see some performance benchmarks. I mostly care about TCP streams for now, the cases I currently cover with HAProxy/Nginx in my systems.

38 · 2024-02-18T01:16:56 1708219016

> basic

1300 lines...

also, the line numbers don't actually align with the lines:

https://git.kernel.dk/cgit/liburing/tree/examples/proxy.c#n1...

rwiggins · 2024-02-18T06:16:53 1708237013

There's something super weird going on with those line numbers. They don't align for me either (Firefox on Windows 11). In the web inspector, the line numbers are rendering with Courier New as the font, whereas the code itself is `monospace`.

The weird thing is both are controlled by a single CSS rule,

    div#cgit pre {
      font-family: "Source Code Pro", "Courier New", monospace;
    }

(Changing to just `monospace` fixes it.)

I'm not sure how two elements could have different fonts with that. I'm possibly missing something obscure, but it really feels like a browser bug.

hddqsb · 2024-02-23T14:10:42 1708697442

Great sleuthing, the missing piece of the puzzle is that the file contents are inside a <code> element while the line numbers are not, and <code> elements have a default font so they don't inherit the font from their parent element. Changing the selector to the following fixes the issue:

  div#cgit pre, div#cgit code { ... }

(The buggy CSS is not present in the official cgit repository, so I assume the owner of kernel.dk is running a patched version of cgit.)

o11c · 2024-02-18T01:49:37 1708220977

FWIW it's about 800 lines excluding comments.

and yeah, this is why you don't abuse side-by-side divs ... actually, that's a table ... and instead just make your number unselectable (there are several ways to do this).

packetlost · 2024-02-18T01:58:06 1708221486

That's pretty small for a complete C application...

userbinator · 2024-02-18T05:57:00 1708235820

[flagged]

Fischgericht · 2024-02-18T15:04:41 1708268681

Or people have no respect for arrogant boasting.

Also, you most likely are a liar. Just the matrix tables you need for MPEG-1 are 500+ lines.

Unless of course you used 1100 lines to call ffmpeg. Then you don't need the tables.

But that being said: Yes, writing a TCP proxy in 200 lines is indeed possible. But it will use standard socket calls. You might want to read up about what io_uring is all about, and why it's worth going the extra mile.

Or, as an alternative, you could STFU :)

rogers12 · 2024-02-18T07:50:09 1708242609

You should go back and add error handling to your C programs

userbinator · 2024-02-18T07:56:24 1708242984

Are you implying I don't already have error handling?

Personal insults are not acceptable here.

znpy · 2024-02-18T13:07:36 1708261656

> Are you implying I don't already have error handling?

With baseless claims, no code, no other consideration? I would not imply that, I would assume that.

verticalscaler · 2024-02-18T10:28:43 1708252123

My dude, how would he know better? It is a perfectly fair assumption given the average C code base. Those numbers are unusually succinct.

You sound like you really know what you're doing, if you can, please share some of this for our edification and don't take things personally.

unmole · 2024-02-18T11:42:06 1708256526

Talk is cheap, show me the code.

znpy · 2024-02-18T13:06:32 1708261592

came to write the same thing. take my upvote.

secondcoming · 2024-02-18T13:08:03 1708261683

Your 200 line TCP proxy is likely to be very inefficient.

This is what things like DPDK and io_uring attempt to solve.
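
For context, the inner loop of a "standard socket calls" proxy might look roughly like the sketch below. This is not code from anyone in the thread or from the linked proxy.c; `relay_chunk` is a hypothetical helper. The point is that every chunk costs at least one `read` and one `write`, each a separate trip into the kernel, which is the overhead the batched approaches try to avoid.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Copy one chunk from `from` to `to` using plain blocking calls.
 * Returns the number of bytes relayed, 0 on end-of-file, -1 on error. */
ssize_t relay_chunk(int from, int to)
{
    char buf[4096];
    ssize_t n;

    assert(from >= 0 && to >= 0);

    do {
        n = read(from, buf, sizeof buf); /* kernel crossing #1 */
    } while (n < 0 && errno == EINTR);
    if (n <= 0)
        return n;

    ssize_t off = 0;
    while (off < n) {
        /* kernel crossing #2 (more if the write is short) */
        ssize_t w = write(to, buf + off, (size_t)(n - off));
        if (w < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        off += w;
    }
    return n;
}
```

A full proxy would call this in both directions for every connection, typically driven by poll or epoll, so the per-chunk syscall count adds up quickly under load.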

sophacles · 2024-02-18T14:35:30 1708266930

So your 200 line proxy... I wonder how it performs compared to this. Since LOC is a silly measure of anything, let's use it further: How many bytes/s/sloc does yours achieve compared to the io_uring one?

lionkor · 2024-02-18T09:23:00 1708248180

post your code

littlestymaar · 2024-02-18T11:20:12 1708255212

How is a codec a complete application?

jeroenhd · 2024-02-18T04:32:51 1708230771

It aligns perfectly for me. Must've been mostly tested on Firefox.

marwis · 2024-02-18T06:15:42 1708236942

Doesn't work on Chrome either.

Needs .linenumbers { line-height: 12pt } but I think vertical line spacing within text node cannot be precisely controlled and is browser dependent.

38 · 2024-02-18T05:01:23 1708232483

I am on Firefox...