That particular feature seems to have been discussed in review quite a lot. The latest messages on the list suggest a different approach that avoids generated code, because the x86 maintainer hated it.
Has anyone actually posted benchmarks of the JIT and the other methods? I haven't seen any, and it seems weird how much this has been discussed without published numbers.
The general optimization is actually not that novel: A DBMS might do that for parts of a query. At least in a high performance database lecture this was taught as a possible optimization. Edit: I'd intuitively expect the improvements to be more than 5% though.
I have been JIT-ing query predicates in database-like systems for over a decade, and some commercial databases have supported it a lot longer than that. There are a couple relevant aspects that impact the performance benefit.
The performance gains are much higher if the database engine was designed for JIT-ed execution from day one. Grafting it onto an existing engine after the fact, as with Postgres, is going to yield substantially less benefit than is theoretically possible. Additionally, it mostly benefits databases where query predicates have a large amount of data to process; it doesn’t do much for OLTP. But in the right system and context, large integer factor performance improvements are routinely achievable.
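To make "JIT-ing query predicates" concrete, here is a minimal sketch in Python. It is a toy, not how any particular database does it: the tuple-based predicate format and the names `interpret`, `to_source` and `compile_predicate` are all made up for illustration. The baseline walks the predicate tree again for every row; the "compiled" version generates source once per query and turns it into a single function.

```python
import operator

OPS = {"<": operator.lt, ">": operator.gt, "==": operator.eq}

# Hypothetical predicate format: ("and", p, q), ("or", p, q),
# or a leaf (op, column_name, constant).

def interpret(pred, row):
    """Baseline: re-walk the predicate tree for every single row."""
    kind = pred[0]
    if kind == "and":
        return interpret(pred[1], row) and interpret(pred[2], row)
    if kind == "or":
        return interpret(pred[1], row) or interpret(pred[2], row)
    return OPS[kind](row[pred[1]], pred[2])

def to_source(pred):
    """Translate the predicate tree into a Python expression string."""
    kind = pred[0]
    if kind in ("and", "or"):
        return f"({to_source(pred[1])} {kind} {to_source(pred[2])})"
    return f"(row[{pred[1]!r}] {kind} {pred[2]!r})"

def compile_predicate(pred):
    """Generate code once per query; the tree walk disappears from the hot loop."""
    return eval("lambda row: " + to_source(pred))

pred = ("and", (">", "age", 30),
               ("or", ("==", "city", "Oslo"), ("<", "score", 5)))
rows = [
    {"age": 41, "city": "Oslo", "score": 9},
    {"age": 25, "city": "Oslo", "score": 1},
    {"age": 50, "city": "Rome", "score": 3},
    {"age": 50, "city": "Rome", "score": 7},
]
fast = compile_predicate(pred)
matches = [fast(r) for r in rows]
```

The one-time compile cost only pays off when the same predicate is applied to many rows, which fits the point above about large scans versus OLTP.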
Yes, this is a thing. But the key difference there is... that's being done ... in user space.
We live in a world where just last week Google turned off io_uring access on a pile of machines, and that's only "executing" restricted sets of operations. Executable code in kernel = big giant target painted on back.
Ah, yes, that's an excellent point. Because someone else talked about 5% performance I looked at it with my performance hat on, not the security hat. OTOH we have BPFilter with its VM.
FWIW, the IBM mainframes have channel control programs, which as I understand it, can dynamically generate small programs and send them to the channel controller to execute; can include branching / conditionals also. https://en.wikipedia.org/wiki/Execute_Channel_Program
Manually constructing machine code (as in the example) is not the best idea - it is error-prone, difficult to debug, and prevents testing with sanitizers. I'd not do it.
[...]
So, without intending any particular hostility:
<puts on maintainer hat>
bcachefs's x86 JIT is:
Nacked-by: Andy Lutomirski <luto@kernel.org> # for x86
<takes off maintainer hat>
This makes me sad, because I like bcachefs. But you can get it merged
without worrying about my NAK by removing the x86 part.
[...]
> No, I'm saying your concerns are baseless and too vague to address.
If you don't address them, the NAK will stand forever, or at least until a
different group of people take over x86 maintainership. That's fine with me.
I'm generally pretty happy about working with people to get their Linux
code right. But no one is obligated to listen to me.
>
>> text_poke() by itself is *not* the proper API, as discussed. It
>> doesn't serialize adequately, even on x86. We have text_poke_sync()
>> for that.
>
> Andy, I replied explaining the difference between text_poke() and
> text_poke_sync(). It's clear you have no idea what you're talking about,
> so I'm not going to be wasting my time on further communications with
> you.
No problem. Then your x86 code will not be merged upstream.
Best of luck with the actual filesystem parts!
--Andy
[...]
> >> Andy, I replied explaining the difference between text_poke() and
> >> text_poke_sync(). It's clear you have no idea what you're talking about,
> >> so I'm not going to be wasting my time on further communications with
> >> you.
>
> One more specific concern: This comment made me very uncomfortable and
> it read to me very much like a personal attack, something which is
> contrary to our code of conduct.
It's not; I prefer to be direct than passive
aggressive, and if I have to bow out of a discussion
that isn't going anywhere I feel I owe an explanation
of _why_. Too much conflict avoidance means things
don't get resolved.
And Andy and I are talking on IRC now, so things are
proceeding in a better direction.
Yikes, I guess I’m way too sensitive to ever be a kernel dev.
Kent seems way out of line. “Direct” is one thing, telling a maintainer they have no idea what they’re talking about is another. It’s totally uncalled for.
Are they trying to imitate Linus, or something? Sorry, you don’t get a license to be an asshole until you literally invent Linux.
Dunno, having read through the rest of the thread, and knowing some of the history, I can see why some frustration leaked out. Worth noting that they were able to pretty much immediately work together after this to come up with what appears to be a good solution to satisfy everyone.
Where I'm from, having a temper tantrum on a mailing list because you're not getting your way is what makes you the snowflake. If you don't want to follow the code of conduct, you can submit your patch to a kernel without one.
Yes, that's the new mindset, about "codes of conduct" and such. It's what happens after a project has been bootstrapped, successful, and established, and the later process-focused/bureaucracy/ass-saving/touchy-feely minded people come in.
For Linux that was 15+ years after it started, did fine, and conquered the world, without needing one.
Worth noting that it also worked fine in this instance, the two parties that were in conflict worked things out and found a better solution, and the snowflake who brought up the code-of-conduct was treated as network damage and routed around.
It's vital to pay attention to the sheer volume of comments being made that all needed to be substantively responded to by one developer. The sense I get is very much that the appropriate attitude is: don't join the dog-pile if you don't want to be snapped at.
I can't see any context in the form of a discussion, it's just code, so I can't see the anticipated trade-offs etc., but I'd expect the cost of accessing a btree of any decent size (that is, one overflowing the cache) and large fanout to be almost completely about RAM latency. Therefore I'd expect compiling just the tree-accessing code to be of little value.
I wonder how much this improves performance over not "JIT'ing" the calculation. I have done similar things in image/video codec code (where it resulted in substantial increases in speed) but this is the first time I've seen it for a filesystem.
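For anyone wondering what "JIT'ing the calculation" means in practice, here is a rough Python sketch of the general idea. It assumes the unpack step decodes fixed-width bit fields whose layout is only known at run time; the `(shift, bits, base)` format and every function name here are hypothetical and not taken from the bcachefs code. The generic version reads the format description on every call, while the specialized one bakes the shifts, masks and bases into generated code once.

```python
def pack(fmt, values):
    """Pack values into one integer according to (shift, bits, base) fields."""
    packed = 0
    for (shift, bits, base), value in zip(fmt, values):
        assert 0 <= value - base < (1 << bits), "value does not fit its field"
        packed |= (value - base) << shift
    return packed

def unpack_generic(fmt, packed):
    """Generic path: loop over the format description on every call."""
    out = []
    for shift, bits, base in fmt:
        out.append(((packed >> shift) & ((1 << bits) - 1)) + base)
    return tuple(out)

def make_unpack(fmt):
    """Specialized path: generate one expression per field with constants folded in."""
    exprs = [
        f"((p >> {shift}) & {(1 << bits) - 1}) + {base}, "
        for shift, bits, base in fmt
    ]
    return eval("lambda p: (" + "".join(exprs) + ")")

# Hypothetical layout: three fields of 12, 20 and 8 bits.
fmt = [(0, 12, 1000), (12, 20, 0), (32, 8, 5)]
values = (1234, 77777, 200)
packed = pack(fmt, values)
unpack_fast = make_unpack(fmt)
```

In Python the generated function is still interpreted, so this only shows the shape of the technique; the in-kernel version under discussion emits machine code, which is exactly what the maintainers objected to.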
[...]
testing random btree updates:
dynamically generated unpack:
rand_insert: 20.0 MiB with 1 threads in 33 sec, 1609 nsec per iter, 607 KiB per sec
old C unpack:
rand_insert: 20.0 MiB with 1 threads in 35 sec, 1672 nsec per iter, 584 KiB per sec
the Eric Biggers special:
rand_insert: 20.0 MiB with 1 threads in 35 sec, 1676 nsec per iter, 583 KiB per sec
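As a sanity check on the "5%" figure, the relative differences can be worked out from the numbers quoted above. This is a small Python sketch; only the three result lines come from the post, and the helper name `percent_less_time` is made up. It treats each line as a single measurement, since no variance or repeat runs were given.

```python
# nsec per iteration and KiB/s, copied from the quoted rand_insert results.
results = {
    "dynamically generated unpack": {"nsec_per_iter": 1609, "kib_per_sec": 607},
    "old C unpack": {"nsec_per_iter": 1672, "kib_per_sec": 584},
    "the Eric Biggers special": {"nsec_per_iter": 1676, "kib_per_sec": 583},
}

def percent_less_time(baseline_nsec, candidate_nsec):
    """How much less time per iteration the candidate takes, in percent."""
    return 100.0 * (baseline_nsec - candidate_nsec) / baseline_nsec

jit = results["dynamically generated unpack"]
savings = {
    name: percent_less_time(r["nsec_per_iter"], jit["nsec_per_iter"])
    for name, r in results.items()
    if r is not jit
}
throughput_gain = 100.0 * (jit["kib_per_sec"] / results["old C unpack"]["kib_per_sec"] - 1)

for name, pct in savings.items():
    print(f"generated unpack vs {name}: {pct:.1f}% less time per iteration")
print(f"throughput vs old C unpack: +{throughput_gain:.1f}%")
```

On these figures the generated unpack comes out roughly 4% ahead of either C version, so "5%" is a slightly generous rounding.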
[...]
Alternative: 5% gain in a specific issue is totally fine to defer until another patch set. It feels like fighting about it in the first submission is a waste of time for everyone involved.
Merging into the main Linux branch (Linus’). One maintainer has said they won’t merge until the JIT is removed. If it is never merged to Linux, it will forever be niche, requiring the same tricks and having the same limitations as zfs.ko. So it seems better to be 5% slower now and widely used, than to always be niche.
All in all, while a good Phoronix benchmark is what can make it supplant all other Linux filesystems, I appreciate the security concerns raised by maintainers, and I agree that a better approach would have been to use the default code at first, and seek advice on how to improve its performance. Thankfully, it looks like that is where it is going now.
Let's see if they will switch to that.
https://lore.kernel.org/linux-bcachefs/ZIuCFtmnFturKwex@mori...