More

jmillikin · 2025-04-06T16:33:52 1743957232

  > use SECCOMP_SET_MODE_STRICT to isolate the child process. But at that
  > point, what are you even doing? Probably nothing useful.

The classic example of a fully-seccomp'd subprocess is decoding / decompression. If you want to execute ffmpeg on untrusted user input then seccomp is a sandbox that allows full-power SIMD, and the code has no reason to perform syscalls other than read/write to its input/output stream.

On the client side there's font shaping, PDF rendering, image decoding -- historically rich hunting grounds for browser CVEs.

Animats · 2025-04-06T17:11:20 1743959480

The classic example of a fully-seccomp'd subprocess is decoding / decompression.

Yes. I've run JPEG 2000 decoders in a subprocess for that reason.

WesolyKubeczek · 2025-04-06T17:29:49 1743960589

Well, it seems that lately this kind of task wants to write/mmap to a GPU, and poke at font files and interpret them.

jmillikin · 2025-03-18T06:04:41 1742277881

I flagged this for being LLM-generated garbage; original comment below. Any readers interested in benchmarking programming language implementations should visit https://benchmarksgame-team.pages.debian.net/benchmarksgame/... instead.

---

The numbers in the table for C vs Rust don't make sense, and I wasn't able to reproduce them locally. For a benchmark like this I would expect to see nearly identical performance for those two languages.

Benchmark sources:

https://github.com/naveed125/rust-vs/blob/6db90fec706c875300...

Benchmark process and results:

  $ gcc --version
  gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
  $ gcc -O2 -static -o bench-c-gcc benchmark.c
  $ clang --version
  Ubuntu clang version 14.0.0-1ubuntu1.1
  $ clang -O2 -static -o bench-c-clang benchmark.c
  $ rustc --version
  rustc 1.81.0 (eeb90cda1 2024-09-04)
  $ rustc -C opt-level=2 --target x86_64-unknown-linux-musl -o bench-rs benchmark.rs

  $ taskset -c 1 hyperfine --warmup 1000 ./bench-c-gcc
  Benchmark 1: ./bench-c-gcc
    Time (mean ± σ):       3.2 ms ±   0.1 ms    [User: 2.7 ms, System: 0.6 ms]
    Range (min … max):     3.2 ms …   4.1 ms    770 runs

  $ taskset -c 1 hyperfine --warmup 1000 ./bench-c-clang
  Benchmark 1: ./bench-c-clang
    Time (mean ± σ):       3.5 ms ±   0.1 ms    [User: 3.0 ms, System: 0.6 ms]
    Range (min … max):     3.4 ms …   4.8 ms    721 runs

  $ taskset -c 1 hyperfine --warmup 1000 ./bench-rs
  Benchmark 1: ./bench-rs
    Time (mean ± σ):       5.1 ms ±   0.1 ms    [User: 2.9 ms, System: 2.2 ms]
    Range (min … max):     5.0 ms …   7.1 ms    507 runs

Those numbers also don't make sense, but in a different way. Why is the Rust version so much slower, and why does it spend the majority of its time in "system"?

Oh, it's because benchmark.rs is performing a dynamic memory allocation for each key. The C version uses a buffer on the stack, with fixed-width keys. Let's try doing the same in the Rust version:

  --- benchmark.rs
  +++ benchmark.rs
  @@ -38,22 +38,22 @@
   }
 
   // Generates a random 8-character string
  -fn generate_random_string(rng: &mut Xorshift) -> String {
  +fn generate_random_string(rng: &mut Xorshift) -> [u8; 8] {
       const CHARSET: &[u8] = b"0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
  -    let mut result = String::with_capacity(8);
  +    let mut result = [0u8; 8];
   
  -    for _ in 0..8 {
  +    for ii in 0..8 {
           let rand_index = (rng.next() % 62) as usize;
  -        result.push(CHARSET[rand_index] as char);
  +        result[ii] = CHARSET[rand_index];
       }
   
       result
   }
   
   // Generates `count` random strings and tracks their occurrences
  -fn generate_random_strings(count: usize) -> HashMap<String, u32> {
  +fn generate_random_strings(count: usize) -> HashMap<[u8; 8], u32> {
       let mut rng = Xorshift::new();
  -    let mut string_counts: HashMap<String, u32> = HashMap::new();
  +    let mut string_counts: HashMap<[u8; 8], u32> = HashMap::with_capacity(count);
   
       for _ in 0..count {
           let random_string = generate_random_string(&mut rng);

Now it's spending all its time in userspace again, which is good:

  $ taskset -c 1 hyperfine --warmup 1000 ./bench-rs
  Benchmark 1: ./bench-rs
    Time (mean ± σ):       1.5 ms ±   0.1 ms    [User: 1.3 ms, System: 0.2 ms]
    Range (min … max):     1.4 ms …   3.2 ms    1426 runs

... but why is it twice as fast as the C version?

---

I go to look in benchmark.c, and my eyes are immediately drawn to this weird bullshit:

  // Xorshift+ state variables (64-bit)
  uint64_t state0, state1;

  // Xorshift+ function for generating pseudo-random 64-bit numbers
  uint64_t xorshift_plus() {
      uint64_t s1 = state0;
      uint64_t s0 = state1;
      state0 = s0; 
      s1 ^= s1 << 23; 
      s1 ^= s1 >> 18; 
      s1 ^= s0; 
      s1 ^= s0 >> 5;
      state1 = s1; 
      return state1 + s0; 
  }

That's not simply a copy of the xorshift+ example code on Wikipedia. Is there any human in the world who is capable of writing xorshift+ but is also dumb enough to put its state into global variables? I smell an LLM.

A rough patch to put the state into something the compiler has a hope of optimizing:

  --- benchmark.c
  +++ benchmark.c
  @@ -18,25 +18,35 @@
   StringNode *hashTable[HASH_TABLE_SIZE]; // Hash table for storing unique strings
   
   // Xorshift+ state variables (64-bit)
  -uint64_t state0, state1;
  +struct xorshift_state {
  +       uint64_t state0, state1;
  +};
   
   // Xorshift+ function for generating pseudo-random 64-bit numbers
  -uint64_t xorshift_plus() {
  -    uint64_t s1 = state0;
  -    uint64_t s0 = state1;
  -    state0 = s0;
  +uint64_t xorshift_plus(struct xorshift_state *st) {
  +    uint64_t s1 = st->state0;
  +    uint64_t s0 = st->state1;
  +    st->state0 = s0;
       s1 ^= s1 << 23;
       s1 ^= s1 >> 18;
       s1 ^= s0;
       s1 ^= s0 >> 5;
  -    state1 = s1;
  -    return state1 + s0;
  +    st->state1 = s1;
  +    return s1 + s0;
   }
   
   // Function to generate an 8-character random string
   void generate_random_string(char *buffer) {
  +    uint64_t timestamp = (uint64_t)time(NULL) * 1000;
  +    uint64_t state0 = timestamp ^ 0xDEADBEEF;
  +    uint64_t state1 = (timestamp << 21) ^ 0x95419C24A637B12F;
  +    struct xorshift_state st = {
  +        .state0 = state0,
  +        .state1 = state1,
  +    };
  +
       for (int i = 0; i < STRING_LENGTH; i++) {
  -        uint64_t rand_value = xorshift_plus() % 62;
  +        uint64_t rand_value = xorshift_plus(&st) % 62;
   
           if (rand_value < 10) { // 0-9
               buffer[i] = '0' + rand_value;
  @@ -113,11 +123,6 @@
   }
   
   int main() {
  -    // Initialize random seed
  -    uint64_t timestamp = (uint64_t)time(NULL) * 1000;
  -    state0 = timestamp ^ 0xDEADBEEF; // Arbitrary constant
  -    state1 = (timestamp << 21) ^ 0x95419C24A637B12F; // Arbitrary constant
  -
       double total_time = 0.0;
   
       // Run 3 times and measure execution time

and the benchmarks now make slightly more sense:

  $ taskset -c 1 hyperfine --warmup 1000 ./bench-c-gcc
  Benchmark 1: ./bench-c-gcc
    Time (mean ± σ):       1.1 ms ±   0.1 ms    [User: 1.1 ms, System: 0.1 ms]
    Range (min … max):     1.0 ms …   1.8 ms    1725 runs
  
  $ taskset -c 1 hyperfine --warmup 1000 ./bench-c-clang
  Benchmark 1: ./bench-c-clang
    Time (mean ± σ):       1.0 ms ±   0.1 ms    [User: 0.9 ms, System: 0.1 ms]
    Range (min … max):     0.9 ms …   1.4 ms    1863 runs

But I'm going to stop trying to improve this garbage, because on re-reading the article, I saw this:

  > Yes, I absolutely used ChatGPT to polish my code. If you’re judging me for this,
  > I’m going to assume you still churn butter by hand and refuse to use calculators.
  > [...]
  > I then embarked on the linguistic equivalent of “Google Translate for code,”

Ok so it's LLM-generated bullshit, translated into other languages either by another LLM, or by a human who doesn't know those languages well enough to notice when the output doesn't make any sense.

mubou · 2025-03-18T06:19:04 1742278744

> my eyes are immediately drawn to this weird bullshit

Gave me a good chuckle there :)

Appreciate this write up; I'd even say your comment deserves its own article, tbh. Reading your thought process and how you addressed the issues was interesting. A lot of people don't know how to identify or investigate weird bullshit like this.

naveed125 · 2025-03-18T17:50:50 1742320250

So glad I had read the 2nd agreement by Don Miguel Ruiz lol.

jmillikin · 2025-03-12T16:45:02 1741797902

The "dsb nsh; isb" sequence after "svc 0" is part of OpenBSD's mitigations for Spectre.

https://github.com/openbsd/src/commit/bbeaada4689520859307d5...

https://github.com/openbsd/src/commit/0c401ffc2a2550c32105ce...

https://github.com/openbsd/src/commit/5ecc9681133f1894e81c38...

If I'm reading the commits correctly, the OpenBSD kernel will skip two instructions after a "svc 0" when returning to userspace, on the assumption that any syscall comes from libc and therefore has "dsb nsh; isb" after it.

jmillikin · 2025-03-10T08:41:49 1741596109

  > This is a very good example of how C is not "close to the machine" or
  > "portable assembly",

C is very much "portable assembly" from the perspective of other systems programming languages of the 80s-90s era. The C expression `a += 1` can be trusted to increment a numeric value, but the same expression in C++ might allocate memory or unwind the call stack or do who knows what. Similarly, `a = "a"` is a simple pointer assignment in C, but in C++ it might allocate memory or [... etc].

The phrase "C is portable assembly" isn't a claim that each statement gets compiled directly to equivalent machine code.

kryptiskt · 2025-03-10T09:32:02 1741599122

When the code has hit the IR in clang or gcc, there is no 'a' (we know that with certainty, since SSA form doesn't mutate but assigns to fresh variables). We don't know if there will be an increment of 1, the additions could be coalesced (or elided if the result can be inferred another way). The number can even decrease, say if things have been handled in chunks of 16, and needs to be adjusted down in the last chunk. Or the code may be auto-vectorized and completely rewritten, so that none of the variables at the C level are reflected on the assembler level.

jmillikin · 2025-03-10T10:04:22 1741601062

From a high-level academic view, yes, the compiler is allowed to perform any legal transformation. But in practice C compilers are pretty conservative about what they emit, especially when code is compiled without -march= .

You don't have to take my word for it. Go find a moderately complex open-source library written in C, compile it, then open up the result in Hexrays/Ghidra/radare2/whatever. Compare the compiled functions with their original source and you'll see there's not that much magic going on.

hun3 · 2025-03-10T10:27:52 1741602472

-O3 does autovectorization: turning your loops into a bunch of SIMD instructions, sometimes even drastically changing performance profile.

If autovectorization is "not that much magic" then idk what else it is.

gpderetta · 2025-03-10T17:21:33 1741627293

Any optimization you are familiar with is trivial and expected. Everything else is broken compilers optimizing UB to win benchmarks.

lifthrasiir · 2025-03-10T12:49:16 1741610956

Nowadays it's -O2. I was also surprised when I first learned this.

cogman10 · 2025-03-10T13:46:37 1741614397

They are as aggressive as they can be.

Here's an example of a C compiler completely eliminating a loop because it has figure out how to transform the loop into a constant calculation.

https://godbolt.org/z/cfndqMj4j

The place where C compilers are conservative is when dealing with arrays and pointers. That's because it's impossible for C to know if a pointer is to an element of an array or something completely different. Pointer math further complicates what a pointer could actually reference.

titzer · 2025-03-10T19:54:32 1741636472

Saying that something "is like XY" when you really mean "is like XY, at least in comparison to C++" isn't what most people mean.

C is not a portable assembler.

In C, "a += 1" could overflow, and signed overflow is undefined behavior--even though every individual ISA has completely defined semantics for overflow, and nearly all of them these days do two's complement wraparound arithmetic. With C's notion of undefined behavior, it doesn't even give you the same wraparound in different places in the same program. In fact, wraparound is so undefined that the program could do absolutely anything, and the compiler is not required to even tell you about it. Even without all the C++ abstraction madness, a C compiler can give you absolutely wild results due to optimizations, e.g. by evaluating "a += 1" at compile time and using a different overflow behavior than the target machine. Compile-time evaluation not matching runtime evaluation is one of a huge number of dumb things that C gives you.

Another is that "a += 1" may not even increment the variable. If this occurs as an expression, and not as a statement, e.g. "f(a += 1, a += 1)", you might only get one increment due to sequence points[1]--not to mention that the order of evaluation might be different depending on the target.

C is not a portable assembler.

C is a low-level language where vague machine-like programs get compiled to machine code that may or may not work, depending on whether it violates UB rules or not, and there are precious few diagnostics to tell if that happened, either statically or dynamically.

[1] https://en.wikipedia.org/wiki/Sequence_point

pjc50 · 2025-03-10T10:13:41 1741601621

> The phrase "C is portable assembly" isn't a claim that each statement gets compiled directly to equivalent machine code.

Weasel words. Like a "self driving car" that requires a human driver with constant attention willing to take over within a few hundred milliseconds.

People advocate for C and use it in a way that implies they think it can achieve specific machine outcomes, and it usually does .. except when it doesn't. If people want a portable assembler they should build one.

jmillikin · 2025-03-10T11:20:18 1741605618

As a general rule if you're reading a technical discussion and every single participant is using a particular phrase in a way that doesn't make sense to you then you should probably do a quick double-check to make sure you're on the same page.

For example, in this discussion about whether C is "portable assembly", you might be tempted to think back to the days of structured programming in assembly using macros. I no longer remember the exact syntax, but programs could be written to look like this:

  .include "some-macro-system.s"
  .include "posix-sym.s"

  .func _start(argc, argv) {
    .asciz message "Hello, world!"
    .call3 _write STDOUT message (.len message)
    .call1 _exit 0
  }

Assembly? Definitely! Portable? Eh, sort of! If you're willing to restrict yourself to DOS + POSIX and write an I/O abstraction layer then it'll probably run on i386/SPARC/Alpha/PA-RISC.

But that's not really what people are discussing, is it?

When someone says "C is portable assembly" they don't mean you can take C code and run it through a platform-specific macro expander. They don't mean it's literally a portable dialect of assembly. They expect the C compiler to perform some transformations -- maybe propagate some constants, maybe inline a small function here and there. Maybe you'd like to have named mutable local variables, which requires a register allocator. Reasonable people can disagree about exactly what transformations are legal, but at that point it's a matter of negotiation.

Anyway, now you've got a language that is more portable than assembler macros but still compiles more-or-less directly to machine code -- not completely divorced from the underlying hardware like Lisp (RIP Symbolics). How would you describe it in a few words? "Like assembly but portable" doesn't seem unreasonable.

pjc50 · 2025-03-10T14:40:56 1741617656

> still compiles more-or-less directly to machine code

There's a lot hiding in "more or less". The same kind of example holds for e.g. C# : https://godbolt.org/noscript/csharp ; if you hit "Compile" it'll give you the native binary. If you write "x+1" it'll generate an add .. or be optimized away. Now does that mean it's portable assembler? Absolutely not.

Conversely there's a bunch of things that people expect to do in C, do in real code, but are not in the standard or are undefined or implementation-defined. As well as things that are present in assemblers for various platforms (things like the overflow flag) which aren't accessible from the C language.

What people actually seem to mean by "portable assembler" is "no guardrails". Memory unsafety as a feature.

> Reasonable people can disagree about exactly what transformations are legal, but at that point it's a matter of negotiation

And a matter of CVEs when you lose your negotiation with the compiler. Or less dramatic things like the performance fluctuations under discussion.

pjmlp · 2025-03-10T10:35:27 1741602927

Systems languages that predated C already could do that, that is the typical myth.

eru · 2025-03-10T08:46:34 1741596394

> The C expression `a += 1` can be trusted to increment a numeric value, [...]

Have you heard of undefined behaviour?

jmillikin · 2025-03-10T08:56:33 1741596993

Show me a C compiler that miscompiles the following code and I'll concede the point:

  uint32_t add_1(uint32_t a) {
    a += 1;
    return a;
  }

WJW · 2025-03-10T09:46:17 1741599977

C might be low level from the perspective of other systems languages, but that is like calling Apollo 11 simple from the perspective of modern spacecraft. C as written is not all that close to what actually gets executed.

For a small example, there are many compilers who would absolutely skip incrementing 'a' in the following code:

  uint32_t add_and_subtract_1(uint32_t a) {
    a += 1;
    a -= 1;
    return a;
  }

Even though that code contains `a += 1;` clear as day, the chances of any incrementing being done are quite small IMO. It gets even worse in bigger functions where out-of-order execution starts being a thing.

mkoubaa · 2025-03-10T12:07:00 1741608420

If you want the compiler to treat your code as literal portable assembly turn off optimizations.

eru · 2025-03-11T02:05:38 1741658738

That's a property of a particular compiler, not of the C language (at least as described in the standard).

johnisgood · 2025-03-10T13:50:37 1741614637

Exactly, pretty much what I said, or enable / disable the optimizations you want.

johnisgood · 2025-03-10T12:00:35 1741608035

Why would you want it to increment 1 if we decrement 1 from the same variable? That would be a waste of cycles and a good compiler knows how to optimize it out, or what am I misunderstanding here? What do you expect "it" to do and what does it really do?

See: https://news.ycombinator.com/item?id=43320495

vikramkr · 2025-03-10T12:57:03 1741611423

I'm pretty sure that's replying directly to the comment about how c is close to assembly and that if you add that line of code somewhere you know there's a variable getting incremented. Doesn't really matter whether or not it's useful, the point is that the behavior isn't exactly what you wrote

jmillikin · 2025-03-10T13:25:24 1741613124

To reiterate, claiming that C can be described as "portable assembly" is not a claim that it is literally a package of assembler macros that emit deterministic machine code for each individual source expression.

I linked these in another comment, but here's some examples of straightforward-looking integer addition emitting more complex compiler output for other languages that compile to native code:

Haskell: https://godbolt.org/z/vdeMKMETT

C++: https://godbolt.org/z/dedcof9x5

eru · 2025-03-11T02:03:59 1741658639

The Haskell version deals with both laziness and also detects overflow.

Your C++ example has a lot more code than the C example, I'm not sure why you'd expect it to produce the same output?

johnisgood · 2025-03-10T13:40:20 1741614020

https://godbolt.org/z/r39jK1ddv

It increments, then decrements with -O0 though.

I do not see the issue still, as the behavior is expected with -O0; increments then decrements.

eru · 2025-03-11T01:59:31 1741658371

There's nothing in the C standard that enforces the observed -O0 behaviour. Your compiler might change tomorrow.

johnisgood · 2025-03-11T07:16:15 1741677375

How likely is that to happen, and in which languages can you either optimize or not AND where the compiler might not change tomorrow though?

David Hume said that we cannot know if the sun is going to rise tomorrow just because it has always did before. See "problem of induction", https://philosophynow.org/issues/160/Humes_Problem_of_Induct....

eru · 2025-03-12T12:59:06 1741784346

The C standard guarantees certain behaviours that will not change, even if your C compiler changes. That's the whole point of the standard. And it has nothing to do with the problem of induction.

But the standard does not guarantee that specific assembly instructions will be used.

johnisgood · 2025-03-12T15:31:13 1741793473

Sure, but what programming language or its standard guarantees it (and its compilers in practice, of course), then?

You said "Your compiler might change tomorrow.", but does it not apply to EVERY programming language's compiler?

eru · 2025-03-13T01:28:10 1741829290

> You said "Your compiler might change tomorrow.", but does it not apply to EVERY programming language's compiler?

Yes. I wasn't the one trying to argue that C is special in this regard. Just the opposite.

acdha · 2025-03-10T12:51:46 1741611106

That’s a contrived example but in a serious program there would often be code in between or some level of indirection (e.g. one of those values is a lookup, a macro express, or the result of another function).

Nothing about that is cheating, it just says that even C programmers cannot expect to look at the compiled code and see a direct mapping from their source code. Your ability to reason about what’s actually executing requires you to internalize how the compiler works in addition to your understanding of the underlying hardware and your application.

ses1984 · 2025-03-11T02:58:26 1741661906

What optimizer would remove the increment/decrement if the value was accessed in between? That seems like something that would be really easy to detect.

I’ve never studied compilers though.

remexre · 2025-03-11T04:18:24 1741666704

It would be very normal for a compiler to do an increment (or merge it into a later instruction), but never do the decrement, and instead use the old copy of the value.

WJW · 2025-03-11T08:56:14 1741683374

Then in the next step, it would see that the result of the increment is never used and thus the increment instruction is dead code and can also be removed.

johnisgood · 2025-03-10T13:37:44 1741613864

In what languages can you do that that is not assembly though? The higher level the language is, the "worse" or difficult it gets, perhaps I am not following the thread right.

acdha · 2025-03-10T14:28:42 1741616922

Yes, this is normal for languages. The only pushback here is against the term “portable assembler” being applied to C, where it’s incomplete often enough that many people feel it’s no longer a helpful label.

I think it’s also reflecting the maturity and growth of the industry. A turn of the century programmer could relatively easily find areas where dropping down to assembly was useful, but over the subsequent decades that’s become not only uncommon but often actively harmful: your code hand-optimized for a particular processor is likely slower on newer processors than what a modern compiler emits and is definitely a barrier to portability in an era where not only are ARM and potentially RISC-V of interest but also where code is being run on SIMD units or GPUs. This makes the low-level “portable assembler” idea less useful because there’s less code written in that middle ground when you want either a higher-level representation which gives compilers more flexibility or precise control. For example, cryptography implementers want not just high performance but also rigid control of the emitted code to avoid a compiler optimizing their careful constant-time implementation into a vulnerability.

chasd00 · 2025-03-10T13:28:18 1741613298

I'm not an embedded expert but a friend of mine has complained about compiler optimizations breaking things in his programs. I could see incrementing by one being used to set some bits in a memory location for a cycle that may mean something to some peripheral and then decrementing by one to set some other bits that may mean something else. In that case, the compiler removing those two lines would cause a very hard to debug issue.

johnisgood · 2025-03-10T13:45:08 1741614308

I understand compiler optimizations having unintended consequences, e.g. https://godbolt.org/z/r39jK1ddv but there are a lot of options he may use to enable or disable optimizations (assuming GCC here): https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html

This is even more difficult to do with higher-level languages.

lifthrasiir · 2025-03-10T12:50:09 1741611009

It is unlikely as is, but it frequently arises from macro expansions and inlining.

eru · 2025-03-10T10:15:04 1741601704

> It gets even worse in bigger functions where out-of-order execution starts being a thing.

In addition, add that your processor isn't actually executing x86 (nor ARM etc) instructions, but interprets/compiles them to something more fundamental.

So there's an additional layer of out-of-order instructions and general shenanigans happening. Especially with branch prediction in the mix.

gpderetta · 2025-03-10T10:43:12 1741603392

If my misocompile, you mean that it fails the test that a "C expression `a += 1` can be trusted to increment a numeric value", then it is trivial: https://godbolt.org/z/G5dP9dM5q

Retr0id · 2025-03-10T13:49:59 1741614599

Here's another one, just for fun https://godbolt.org/z/TM1Ke4d5E

jmillikin · 2025-03-10T13:58:15 1741615095

I was being somewhat terse.

The (implied) claim is that the C standard has enough sources of undefined behavior that even a simple integer addition can't be relied upon to actually perform integer addition.

But the sources of undefined behavior for integer addition in C are well-known and very clear, and any instruction set that isn't an insane science project is going to have an instruction to add integers.

Thus my comment. Show me a C compiler that takes that code and miscompiles it. I don't care if it returns a constant, spits out an infinite loop, jumps to 0x0000, calls malloc, whatever. Show me a C compiler that takes those four lines of C code and emits something other than an integer addition instruction.

Retr0id · 2025-03-10T20:18:50 1741637930

Why are you talking about miscompilation? While the LLVM regression in the featured article makes the code slower, it is not a miscompilation. It is "correct" according to the contract of the C language.

tux3 · 2025-03-10T09:24:49 1741598689

You show one example where C doesn't have problems, but that's a much weaker claim than it sounds. "Here's one situation where this here gun won't blow your foot off!"

For what it's worth, C++ also passes your test here. You picked an example so simple that it's not very interesting.

jmillikin · 2025-03-10T10:15:39 1741601739

'eru implied `a += 1` has undefined behavior; I provided a trivial counter-example. If you'd like longer examples of C code that performs unsigned integer addition then the internet has many on offer.

I'm not claiming that C (or C++) is without problems. I wrote code in them for ~20 years and that was more than enough; there's a reason I use Rust for all my new low-level projects. In this case, writing C without undefined behavior requires lots of third-party static analysis tooling that is unnecessary for Rust (due to being built in to the compiler).

But if you're going to be writing C as "portable assembly", then the competition isn't Rust (or Zig, or Fortran), it's actual assembly. And it's silly to object to C having undefined behavior for signed integer addition, when the alternative is to write your VM loop (or whatever) five or six times in platform-specific assembly.

eru · 2025-03-11T02:08:08 1741658888

Yes, 'a += 1' can have undefined behaviour in C when you use signed integers. (And perhaps also with floats? I don't remember.)

Your original comment didn't specify that you want to talk about unsigned integers only.

eru · 2025-03-10T10:25:25 1741602325

Forth might be a better competition for 'portable assembly', though.

eru · 2025-03-10T09:26:13 1741598773

Actually even here, C has some problems (and C++), too:

I don't think the standard says much about how to handle stack overflows?

eru · 2025-03-10T09:14:27 1741598067

I know that for 'int a' the statement 'a += 1' can give rather surprising results.

And you made a universal statement that 'a += 1' can be trusted. Not just that it can sometimes be trusted. In C++ the code you gave above can also be trusted as far as I can tell. At least as much as the C version.

jmillikin · 2025-03-10T09:40:58 1741599658

I'll expand my point to be clearer.

In C there is no operator overloading, so an expression like `a += 1` is easy to understand as incrementing a numeric value by 1, where that value's type is one of a small set of built-in types.

You'd need to look further up in the function (and maybe chase down some typedefs) to see what that type is, but the set of possible types generally boils down to "signed int, unsigned int, float, pointer". Each of those types has well-defined rules for what `+= 1` means.

That means if you see `int a = some_fn(); assert(a < 100); a += 1` in the C code, you can expect something like `ADD EAX,1` somewhere in the compiler output for that function. Or going the other direction, when you're in a GDB prompt and you disassemble the current EIP and you see `ADD EAX,1` then you can pretty much just look at the C code and figure out where you are.

---

Neither of those is true in C++. The combination of completely ad-hoc operator overloading, function overloading, and implicit type conversion via constructors means that it can be really difficult to map between the original source and the machine code.

You'll have a core dump where EIP is somewhere in the middle of a function like this:

  std::string some_fn() {
    some_ns::unsigned<int> a = 1;
    helper_fn(a, "hello");
    a += 1;
    return true;
  }

and the disassembly is just dozens of function calls for no reason you can discern, and you're staring at the return type of `std::string` and the returned value of `true`, and in that moment you'll long for the happy days when undefined behavior on signed integer overflow was the worst you had to worry about.

acdha · 2025-03-10T13:11:11 1741612271

> That means if you see `int a = some_fn(); assert(a < 100); a += 1` in the C code, you can expect something like `ADD EAX,1` somewhere in the compiler output for that function.

I completely agree that C++ is orders of magnitude worse but I’ve seen at least a couple counter-examples with code almost that simple. A researcher I used to support compared each release against a set of reference results, and got a surprise when they didn’t match but his program was working. This turned out to be a new compiler release being smart enough to inline and reorder his code to use a fused multiply-add instruction, which had greater internal precision and so the result was very slightly different from his saved referenced set. GCC has -fexcess-precision=standard for this but you have to understand the problem first.

MaulingMonkey · 2025-03-10T10:34:39 1741602879

    error: could not convert 'true' from 'bool' to 'std::string' {aka 'std::__cxx11::basic_string<char>'}

I don't think anyone's claiming C nor C++'s dumpster fires have signed integer overflow at the top of the pile of problems, but when the optimizer starts deleting security or bounds checks and other fine things - because of signed integer overflow, or one of the million other causes of undefined behavior - I will pray for something as straightforward as a core dump, no matter where EIP has gone.

Signed integer overflow UB is the kind of UB that has a nasty habit of causing subtle heisenbugfuckery when triggered. The kind you might, hopefully, make shallow with ubsan and good test suite coverage. In other words, the kind you won't make shallow.

jmillikin · 2025-03-10T11:33:14 1741606394

For context, I did not pick that type signature at random. It was in actual code that was shipping to customers. If I remember correctly there was some sort of bool -> int -> char -> std::string path via `operator()` conversions and constructors that allowed it to compile, though I can't remember what the value was (probably "\x01").

---

My experience with the C/C++ optimizer is that it's fairly timid, and only misbehaves when the input code is really bad. Pretty much all of the (many, many) bugs I've encountered and/or written in C would have also existed if I'd written directly in assembly.

I know there are libraries out there with build instructions like "compile with -O0 or the results will be wrong", but aside from the Linux kernel I've never encountered developers who put the blame on the compiler.

MaulingMonkey · 2025-03-10T12:47:40 1741610860

> but aside from the Linux kernel I've never encountered developers who put the blame on the compiler.

I encounter them frequently.

99.99% of the time it's undefined behavior and they're "wrong".

Frequently novices who have been failed by their teachers and documentation (see previous rant using atoi as an example of the poor quality of documentation about UB: https://news.ycombinator.com/item?id=14861917 .)

Less frequently, it's experienced devs half joking out of a need for catharsis.

Rarely, experienced devs finally getting to the end of their rope, and are finally beginning to seriously consider if they've got a codegen bug. They don't, but they're considering it. They know they were wrong the last 10 times they considered it, but they're considering it again damnit!

The linux kernel devs aren't quite unique in "just because you can, doesn't mean you should"ing their way into blaming the compiler for what could be argued to be defects in the standard or fundamental design of the language (the defect being making UB so common), but that's probably among the rarest slice of the pie of people blaming the compiler for UB. Few have the will to tilt at that windmill and voice their opinions when the compiler devs can easily just blame the standard - better to keep such unproductive rants close to heart instead, or switch to another language. Something actually productive.

0.01% of the time, it's a legitimate codegen bug on well-defined behavior code. Last one I tracked down to a bug tracker, was MSVC miscompiling 4x4 matrix multiplications by failing to spill a 17th value to stack when it only had 16 SSE register to work with. Caught by unit tests, but not by CI, since people updated compiler versions at their own random pace, and who runs `math_tests` on their personal machines when they're not touching `math`?

eru · 2025-03-10T09:46:19 1741599979

I heartily agree that C++ is a lot more annoying here than C, yes.

I'm just saying that C is already plenty annoying enough by itself, thanks eg to undefined behaviour.

> That means if you see `int a = some_fn(); assert(a < 100); a += 1` in the C code, you can expect something like `ADD EAX,1` somewhere in the compiler output for that function. Or going the other direction, when you're in a GDB prompt and you disassemble the current EIP and you see `ADD EAX,1` then you can pretty much just look at the C code and figure out where you are.

No, there's no guarantee of that. C compilers are allowed to do all kinds of interesting things. However you are often right enough in practice, especially if you run with -O0, ie turn off the optimiser.

See eg https://godbolt.org/z/YY69Ezxnv and tell me where the ADD instruction shows up in the compiler output.

gpderetta · 2025-03-10T10:46:45 1741603605

Counterexample: https://godbolt.org/z/z3jqjPT6o

jmillikin · 2025-03-10T11:46:13 1741607173

I'm not sure that's a counter-example -- what assembly do you think should be emitted for floating-point math on an AVR microcontroller?

gpderetta · 2025-03-10T12:39:30 1741610370

It means that "a += 1` is easy to understand as incrementing a numeric value by 1" is not true and instead "it can be really difficult to map between the original source and the machine code".

More examples of non-trivial mapping from C code to generated code: https://godbolt.org/z/jab6vh6dM

jmillikin · 2025-03-10T13:12:10 1741612330

All of those look pretty straightforward to me -- again, what assembly would you expect to be emitted in those cases?

For contrast, here's the assembly generated for Haskell for integer addition: https://godbolt.org/z/vdeMKMETT

And here's assembly for C++: https://godbolt.org/z/dedcof9x5

gpderetta · 2025-03-10T13:26:36 1741613196

> All of those look pretty straightforward to me -- again, what assembly would you expect to be emitted in those cases?

It is very straightforward indeed, but it is still not mapping primitive operations to direct machine code, but it is forwarding to out-of-line code. Same as operator overloading in other languages.

> And here's assembly for C++: https://godbolt.org/z/dedcof9x5

That's just a symptom of allowing the compiler to inline the add code, otherwise the generated code is as straightforward:

   addOne(Int):
    push   rax
    mov    esi,0x1
    call   4010c0 <add_safe(int, int)>

Ref: https://godbolt.org/z/xo1es9TcW

jmillikin · 2025-03-10T13:38:44 1741613924

  > It is very straightforward indeed, but it is still not mapping primitive
  > operations to direct machine code, but it is forwarding to out-of-line code.
  > Same as operator overloading in other languages.

I am not claiming that C is a collection of assembler macros. There is no expectation that a C compiler emit machine code that has exact 1:1 correspondence with the input source code.

  > Same as operator overloading in other languages.

The lack of operator overloading, and other hidden complex control flow, is the reason that someone can read C code and have a pretty good idea of what it compiles to.

  > That's just a symptom of allowing the compiler to inline the add code,
  > otherwise the generated code is as straightforward:

No, that's just moving the instructions around. You've still got dynamic allocation and stack-unwinding being generated for a line that doesn't have any sign of entering a complex control flow graph.

pjmlp · 2025-03-10T14:29:39 1741616979

> ... and other hidden complex control flow,....

Until someone calls longjmp() or a signal() is triggered. Extra bonus of fun if it happens to be multithreaded application, or in the middle of a non-rentrant call.

gpderetta · 2025-03-11T12:04:18 1741694658

Or your emulated float or atomic relies on non-local state, like a control word or a spin-lock pool.

TinkersW · 2025-03-10T09:48:37 1741600117

a+=1 will not produce any surprising results, signed integer overflow is well defined on all platforms that matter.

And we all know about the looping behavior, it isn't surprising.

The only surprising part would be if the compiler decides to use inc vs add, not that it really matters to the result.

eru · 2025-03-10T10:06:50 1741601210

> a+=1 will not produce any surprising results, signed integer overflow is well defined on all platforms that matter.

I'm not sure what you are talking about?

There's a difference between how your processor behaves when given some specific instructions, and what shenanigans your C compiler gets up to.

See eg https://godbolt.org/z/YY69Ezxnv and tell me where the ADD instruction shows up in the compiler output. Feel free to pick a different compiler target than Risc-V.

jmillikin · 2025-03-10T10:30:40 1741602640

I don't think "dead-code elimination removes dead code" adds much to the discussion.

If you change the code so that the value of `a` is used, then the output is as expected: https://godbolt.org/z/78eYx37WG

xmcqdpt2 · 2025-03-10T11:28:05 1741606085

The parent example can be made clearer like this: https://godbolt.org/z/MKWbz9W16

Dead code elimination only works here because integer overflow is UB.

jmillikin · 2025-03-10T11:59:02 1741607942

Take a closer look at 'eru's example and my follow-up.

He wrote an example where the result of `a+1` isn't necessary, so the compiler doesn't emit an ADDI even though the literal text of the C source contains the substring "a += 1".

Your version has the same issue:

  unsigned int square2(unsigned int num) {
      unsigned int a = num;
      a += 1;
      if (num < a) return num * num;
      return num;
  }

The return value doesn't depend on `a+1`, so the compiler can optimize it to just a comparison.

If you change it to this:

  unsigned int square2(unsigned int num) {
      unsigned int a = num;
      a += 1;
      if (num < a) return num * a;
      return num;
  }

then the result of `a+1` is required to compute the result in the first branch, and therefore the ADDI instruction is emitted.

The (implied) disagreement is whether a language can be considered to be "portable assembly" if its compiler elides unnecessary operations from the output. I think that sort of optimization is allowed, but 'eru (presumably) thinks that it's diverging too far from the C source code.

gpderetta · 2025-03-10T13:29:07 1741613347

In what world the return value doesn't depends on 'a' in this code?

  if (num < a) return num * num;
  /*else*/    return num;

A control dependency is still a dependency

jmillikin · 2025-03-10T14:08:02 1741615682

`a = num; a += 1; if (num < a)` is the same as `if (num < (num + 1))`, which for unsigned integer addition can be rewritten as `if (num != UINT_MAX)`. So there's no need to actually compute `a+1`, the comparison is against a constant.

If the code returns `num * a` then the value of `a` is now necessary, and must be computed before the function returns.

For signed integer addition the compiler is allowed to assume that `(num < (num + 1))` is true, so the comparison can be removed entirely.

eru · 2025-03-11T02:14:36 1741659276

> For signed integer addition the compiler is allowed to assume that `(num < (num + 1))` is true, so the comparison can be removed entirely.

That's not directly what the compiler assumes. The direct problem is in 'a + 1' having undefined behaviour, and that transitively allows the assumption on the comparison that you mentioned.

This was an example where 'a + 1' doesn't compile to an add instruction.

xmcqdpt2 · 2025-03-11T12:26:42 1741696002

C compilers are just too smart IMO to make C portable assembly. Your example doesn’t always ADDI either, for example if it is inlined

https://godbolt.org/z/vv9rvKsxn

This isn’t qualitatively different from what the JVM JIT would do, but Java isn’t considered portable assembly.

I guess if you compile with optimizations completely off, you get something that is assembly-like, but I’ve never seen that in prod code.

eru · 2025-03-12T13:02:06 1741784526

> He wrote an example where the result of `a+1` isn't necessary, so the compiler doesn't emit an ADDI even though the literal text of the C source contains the substring "a += 1".

No, the result of the 'a+1' is necessary in my version. And if you change the type from 'int' to 'unsigned' you will see that the compiler no longer just omits the addition.

IshKebab · 2025-03-10T11:23:15 1741605795

That will be inline by any C compiler and then pretty much anything can happen to the 'a += 1'.

jmillikin · 2025-03-08T06:33:13 1741415593

  > I’m all for multiple backends but there should be only 1 frontend. That’s
  > why I hope gccrs remains forever a research project - it’s useful to help
  > the Rust language people find holes in the spec but if it ever escapes the
  > lab expect Rust to pick up C++ disease.

An important difference between Rust and C++ is that Rust maintains a distinction between stable and unstable features, with unstable features requiring a special toolchain and compiler pragma to use. The gccrs developers have said on record that they want to avoid creating a GNU dialect of Rust, so presumably their plan is to either have no gccrs-specific features at all, or to put such features behind an unstable #![feature] pragma.

  > Rust with a gcc backend is fine for when you want gcc platform support
  > - a duplicate frontend with its own quirks serves no purpose.

A GCC-based Rust frontend would reduce the friction needed to adopt Rust in existing large projects. The Linux kernel is a great example, many of the Linux kernel devs don't want a hard dependency on LLVM, so they're not willing to accept Rust into their part of the tree until GCC can compile it.

vlovich123 · 2025-03-08T18:27:41 1741458461

Dialects are created not just because of different feature sets, but also because of different interpretations of the spec / bugs. Similarly, if Rust adds a feature, it’ll take time for gccrs to port that feature - that’s a dialect or Rust becomes a negotiation of getting gccrs to adopt the feature unless you really think gccrs will follow the Rust compiler with the same set of features implemented in a version (ie tightly coupled release cycles). It’s irrelevant of the intentions - that’s going to be the outcome.

> A GCC-based Rust frontend would reduce the friction needed to adopt Rust in existing large projects. The Linux kernel is a great example, many of the Linux kernel devs don't want a hard dependency on LLVM, so they're not willing to accept Rust into their part of the tree until GCC can compile it.

How is that use case not addressed by rust_codegen_gcc? That seems like a much more useful effort for the broader community to focus on that delivers the benefits of gcc without bifurcating the frontend.

jmillikin · 2025-03-08T06:24:48 1741415088

Note that becoming an international standard (via ISO, ECMA, IETF, or whatever) isn't necessary or sufficient to avoid dialects.

If the Rust language specification is precise enough to avoid disagreements about intended behavior, then multiple compilers can be written against that spec and they can all be expected to correctly compile Rust source code to equivalent output. Even if no international standards body has signed off on it.

On the other hand, if the spec is incomplete or underspecified, then even an ANSI/ISO/IETF stamp of approval won't help bring different implementations into alignment. C/C++ has been an ISO standard for >30 years and it's still difficult to write non-trivial codebases that can compile without modification on MSVC, GCC, Clang, and ICC because the specified (= portable) part of the language is too small to use exclusively.

Or hell, look at JSON, it's tiny and been standardized by the IETF but good luck getting consistent parsing of numeric values.

jmillikin · 2025-03-08T02:39:53 1741401593

Rust's inline assembly syntax is part of the language, and in principle the same Rust source would compile on any conforming compiler (rustc, gccrs).

C/C++ doesn't have a standard syntax for inline assembly. Clang and GCC have extensions for it, with compiler-specific behavior and syntax.

wakawaka28 · 2025-03-08T05:20:23 1741411223

I mentioned somewhere else but I might as well mention here too: there is no standard assembler that everyone uses. Each one may have a slightly different syntax, even for the same arch, and at least some C++ compilers allow you to customize the assembler used during compilation. Therefore, one would assume that inline assembly can't be uniform in general, without picking a single assembler (even assembler version) for each arch.

jmillikin · 2025-03-08T05:59:46 1741413586

You're talking about the syntax of the assembly code itself. In practice small variations between assemblers isn't much of a problem for inline assembly in the same way it would be for standalone .s sources, because inline assembly rarely has implementation-specific directives and macros and such. It's not like the MASM vs NASM split.

This thread is about the compiler-specific syntax used to indicate the boundary between C and assembly and the ABI of the assembly block (register ins/outs/clobbers). Take a look at the documentation for MSVC vs GCC:

https://learn.microsoft.com/en-us/cpp/assembler/inline/asm?v...

https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html

Rust specifies the inline assembly syntax at https://doc.rust-lang.org/reference/inline-assembly.html in great detail. It's not a rustc extension, it's part of the Rust language spec.

wakawaka28 · 2025-03-08T19:21:16 1741461676

>This thread is about the compiler-specific syntax used to indicate the boundary between C and assembly and the ABI of the assembly block (register ins/outs/clobbers).

I see... Nevertheless, this is a really weird issue to get bent out of shape over. How many people are really writing so much inline assembly and also needing to support multiple compilers with incompatible syntax?

jmillikin · 2025-03-09T00:31:08 1741480268

Biggest category of libraries that need inline assembly with compiler portability are compression/decompression codecs (like the linked article) -- think of images (PNG, JPEG), audio (MP3, Opus, FLAC), video (MPEG4, H.264, AV1).

Also important is cryptography, where inline assembly provides more deterministic performance than compiler-generated instructions.

Compiler intrinsics can get you pretty far, but sometimes dropping down to assembly is the only solution. In those times, inline assembly can be more ergonomic than separate .s source files.

vlovich123 · 2025-03-08T05:48:10 1741412890

Exactly. It picks a single assembler:

> Currently, all supported targets follow the assembly code syntax used by LLVM’s internal assembler which usually corresponds to that of the GNU assembler (GAS)

Uniformity like that is a good thing when you need to ensure that your code compiles consistently in a supported manner forever. Swapping out assemblers isn’t helpful for inline assembly.

jmillikin · 2025-03-08T06:13:29 1741414409

The quoted statement is weaker than what you're reading it as, I think. It's not a statement that emitted assembly code is guaranteed to conform to LLVM syntax, it's just noting that (1) at present, (2) for supported targets of the rustc implementation, the emitted assembly uses LLVM syntax.

Non-LLVM compilers like gccrs could support platforms that LLVM doesn't, which means the assembly syntax they emit would definitionally be non-LLVM. And even for platforms supported by both backends, gccrs might choose to emit GNU syntax.

Note also that using a non-builtin assembler is sometimes necessary for niche platforms, like if you've got a target CPU that is "MIPS plus custom SIMD instructions" or whatever.

estebank · 2025-03-08T14:37:55 1741444675

I didn't follow up the stabilization process very closely, but I believe you're wrong. What you're describing is what used to be asm! and is now llvm_asm!. The current stable asm! syntax actually parses its own assembly instead of passing it through to the backend unchanged. This was done explicitly to allow for non-llvm backends to work, and for alternative front-ends to be able to be compatible. I saw multiple statements on this thread about alternative compilers or backends causing trouble here, and that's just not the case given the design was delayed for ages until those issues could be addressed.

Given that not all platforms that are supported by rust have currently support for asm!, I believe your last paragraph does still apply.

https://rust-lang.github.io/rfcs/2873-inline-asm.html

jmillikin · 2025-03-08T16:52:42 1741452762

This sentence from the Reference is important:

  > The exact assembly code syntax is target-specific and opaque to the compiler
  > except for the way operands are substituted into the template string to form
  > the code passed to the assembler.

You can verify that rustc doesn't validate the contents of asm!() by telling it to emit the raw LLVM IR:

  % cat bogus.rs
  #![no_std]
  pub unsafe fn bogus_fn() {
   core::arch::asm!(".bogus");
   core::arch::asm!("bogus");
  }
  % rustc --crate-type=lib -C panic=abort --emit=llvm-ir -o bogus.ll bogus.rs
  % cat bogus.ll
  [...]
  ; bogus::bogus_fn
  ; Function Attrs: nounwind
  define void @_ZN5bogus8bogus_fn17h0e38c0ae539c227fE() unnamed_addr #0 {
  start:
    call void asm sideeffect alignstack ".bogus", "~{cc},~{memory}"(), !srcloc !2
    call void asm sideeffect alignstack "bogus", "~{cc},~{memory}"(), !srcloc !3
    ret void
  }

That IR is going to get passed to llvm-as and possibly onward to an external assembler, which is where the actual validation of instruction mnemonics and assembler directives happens.

---

The difference between llvm_asm!() and asm!() is in the syntax of the stuff outside of the instructions/directives -- LLVM's "~{cc},~{memory}" is what llvm_asm!() accepts more-or-less directly, and asm!() generates from backend-independent syntax.

I have an example on my blog of calling Linux syscalls via inline assembly in C, LLVM IR, and Rust. Reading it might help clarify the boundary: https://john-millikin.com/unix-syscalls#inline-assembly

vlovich123 · 2025-03-08T18:29:53 1741458593

Assembly by definition is platform specific. The issue isn’t that it’s the same syntax on every platform but that it’s a single standardized syntax on each platform.

jmillikin · 2025-02-25T17:11:47 1740503507

Chicory seems like it'll be pretty useful. Java doesn't have easy access to the platform-specific security mechanisms (seccomp, etc) that are used by native tools to sandbox their plugins, so it's nice to have WebAssembly's well-designed security model in a pure-JVM library.

I've used it to experiment with using WebAssembly to extend the Bazel build system (which is written in Java). Currently there are several Bazel rulesets that need platform-specific helper binaries for things like parsing lock files or Cargo configs, and that's exactly the kind of logic that could happily move into a WebAssembly blob.

https://github.com/jmillikin/upstream__bazel/commits/repo-ru...

https://github.com/bazelbuild/bazel/discussions/23487

blacklion · 2025-02-25T21:59:58 1740520798

I don't understand logic and layers of abstraction here.

Chicory runs on JVM. Bazel runs on JVM. How inserting WebAssembly layer will help to eliminate platform-specific helper binaries? These binaries compiled to WebAssembly will be run, effectively, on JVM (through one additional layer of APIs provided by Chicory), right? Why you cannot write these helpers directly in JVM language, Java, Kotlin, Clojure, anything? Why do you need additional layer of Chicory?

andreaTP · 2025-02-25T22:15:10 1740521710

You don't, just, easily rewrite everything. Being able to just re-use is the trick!

ncruces · 2025-02-25T23:01:35 1740524495

Exactly.

Why would you rewrite (parts of) Cargo from Rust to something that runs on the JVM, when you can use Wasm as basically an intermediate target to compile the Rust down to JVM bytecode?

Or how about running something like Shellcheck (written in Haskell) on the JVM as part of a build process?

You can see the same idea for the Go ecosystem (taking advantage of the Go build system) on the many repos of this org: https://github.com/wasilibs

jcmfernandes · 2025-02-26T10:10:37 1740564637

This is great. The future is made of libraries packaged as WASM Components.

throwaway894345 · 2025-02-26T15:17:06 1740583026

Aren't WASM Components pretty constrained? My (very fuzzy) understanding is that they must basically manage all of their own memory, and they can only interact by passing around integer handles corresponding to objects they manage internally.

lukevp · 2025-02-26T16:10:44 1740586244

Part of the component model is codegen to build object structures in each language so that you can pass by reference objects that have an agreed upon shape.

Yes they each have their own linear memory, that’s one of the advantages of component model. It provides isolation at the library level and you don’t have to implicitly agree that each library gets the level of access your application does. It provides security against supply chain side attacks.

Having said that, component model isn’t supported by all runtimes and since its binding and code gen are static at compile time, it’s not useful for every situation. Think of it like a C FFI more than a web API receiving JSON, for example. Upgrading the library version would mean upgrading your bindings and rebuilding your app binary too, the two must move in lock-step.

blacklion · 2025-02-26T11:05:37 1740567937

Oh, these tools are written in languages which can be directly compiled to WebAssembly without any changes? Yes, then it make sense, thank you for clarification.

jmillikin · 2025-02-26T11:54:57 1740570897

Yeah, pretty much all of them are written in either Go or Rust. The Go tools pull in the Go standard library's Go parser to do things like compute dependencies via package imports, and the Rust ones use the Cargo libraries to parse Cargo.toml files.

From the perspective of a Bazel ruleset maintainer, precompiled helper tools are much easier to provide if your language has easy cross-compilation. So maybe one day Zig will start to make an appearance too.

pjmlp · 2025-02-26T08:36:40 1740559000

Java already has plenty of FFI variants for that.

jcmfernandes · 2025-02-26T10:07:07 1740564427

Yes, but WASM gives you more, especially WASM Components. E.g., FFI doesn't offer sandboxing, and unloading symbols is tricky. The WIT (WebAssembly Interface Types) IDL (+ bindings codegen) makes objects' exports explicit, but more importantly, their imports too (i.e., dependencies).

pjmlp · 2025-02-26T10:53:33 1740567213

Basically CORBA, DCOM, PDO, RMI, Jini and .NET Remoting for a new generation.

jmillikin · 2025-02-26T11:35:28 1740569728

None of what 'jcmfernandes lists are part of WebAssembly. At best they can be considered related technologies, like the relationship between the JVM and JavaBeans.

And in terms of design, they're closer to COM or OLE. The modern replacement for CORBA/DCOM/etc is HTTP+JSON (or gRPC), which doesn't try to abstract away the network.

pjmlp · 2025-02-26T12:08:45 1740571725

They are certainly not much different from WIT (WebAssembly Interface Types) IDL (+ bindings codegen).

jmillikin · 2025-02-26T13:01:42 1740574902

I've had the misfortune of working professionally with CORBA, and I've spent some time trying to keep up with WIT/WASI/that whole situation. Whatever WIT is going to be, I can assure you it's very different from CORBA.

The best way I think to describe WIT is that it seems to be an attempt to design a new ABI, similar to the System V ABI but capable of representing the full set of typesystems found in every modern language. Then they want to use that ABI to define a bunch of POSIX-ish syscalls, and then have WebAssembly as the instruction set for their architecture-independent executable format.

The good news is that WIT/WASI/etc is an independent project from WebAssembly, so whether it succeeds or fails doesn't have much impact on the use of WebAssembly as a plugin mechanism.

jcmfernandes · 2025-02-26T14:50:51 1740581451

Correct, they are a part of WASI. Indeed, different things, but well, tightly related. Made sense to talk about them given the chat on bridging gaps in bazel using WASM.

jcmfernandes · 2025-02-26T11:27:07 1740569227

Yes, the concept is old. I may be wrong, but to me, this really seems like it, the one that will succeed. With that said, I'm sure many said the same about the technologies you enumerated... so let's see!

gf000 · 2025-02-25T18:20:57 1740507657

I really don't want to sound flamewar-y, but how is WebAssmebly's security model well-designed compared to a pure Java implementation of a brainfuck interpreter? Similarly, java byte code is 100% safe if you just don't plug in filesystem/OS capabilities.

It's trivial to be secure when you are completely sealed off from everything. The "art of the deal" is making it safe while having many capabilities. If you add WASI to the picture it doesn't look all that safe, but I might just not be too knowledgeable about it.

kannanvijayan · 2025-02-25T19:11:17 1740510677

It's really difficult to compare the JVM and wasm because they are such different beasts with such different use cases.

What wasm brings to the table is that the core tech focuses on one problem: abstract sandboxed computation. The main advantage it brings is that it _doesn't_ carry all the baggage of a full fledged runtime environment with lots of implicit plumbing that touches the system.

This makes it flexible and applicable to situations where java never could be - incorporating pluggable bits of logic into high-frequency glue code.

Wasm + some DB API is a pure stored procedure compute abstraction that's client-specifiable and safe.

Wasm + a simple file API that assumes a single underlying file + a stream API that assumes a single outgoing stream, that's a beautiful piece of plumbing for an S3 like service that lets you dynamically process files on the server before downloading the post-processed data.

There are a ton of use cases where "X + pluggable sandboxed compute" is power-multiplier for the underlying X.

I don't think the future of wasm is going to be in the use case where we plumb a very classical system API onto it (although that use case will exist). The real applicability and reach of wasm is the fact that entire software architectures can be built around the notion of mobile code where the signature (i.e. external API that it requires to run) of the mobile code can be allowed to vary on a use-case basis.

wahern · 2025-02-25T20:50:39 1740516639

> What wasm brings to the table is that the core tech focuses on one problem: abstract sandboxed computation. The main advantage it brings is that it _doesn't_ carry all the baggage of a full fledged runtime environment with lots of implicit plumbing that touches the system.

Originally, but that's rapidly changing as people demand more performant host application interfacing. Sophisticated interfacing + GC + multithreading means WASM could (likely will) fall into the same trap as the JVM. For those too young to remember, Java Applet security failed not because the model was broken, but because the rich semantics and host interfacing opened the door to a parade of implementation bugs. "Memory safe" languages like Rust can't really help here, certainly not once you add JIT into the equation. There are ways to build JIT'd VMs that are amenable to correctness proofs, but it would require quite alot of effort and the most popular and performant VMs just aren't written with that architectural model in mind. The original premise behind WASM was to define VM semantics simple enough that that approach wouldn't be necessary to achieve correctness and security in practice; in particular, while leveraging existing JavaScript VM engines.

kannanvijayan · 2025-02-25T23:58:24 1740527904

The thing is, sophisticated interfacing, GC, and multithreading - assuming they're developed and deployed in a particular way - only apply in the cases where you're applying it to use cases that need those things. The core compute abstraction is still there and doesn't diminish in value.

I'm personally a bit skeptical of the approach to GC that's being taken in the official spec. It's very design-heavy and tries to bring in a structured heap model. When I was originally thinking of how GC would be approached on wasm, I imagined that it would be a few small hooks to allow the wasm runtime to track rooted pointers on the heap, and then some API to extract them when the VM decides to collect. The rest can be implemented "in userspace" as it were.

But that's the nice thing about wasm. The "roots-tracker" API can be built on plain wasm just fine. Or you can write your VM to use a shadow stack and handle everything internally.

The bigger issue isn't GC, but the ability to generate and inject wasm code that links into the existing program across efficient call paths - needed for efficient JIT compilation. That's harder to expose as a simple API because it involves introducing new control flow linkages to existing code.

hinkley · 2025-02-25T18:57:43 1740509863

The bespoke capability model in Java has always been so fiddly it has made me question the concept of capability models. There’s was for a long time a constant stream of new privilege escalations mostly caused by new functions being added that didn’t necessarily break the model themselves, but they returned objects that contained references to objects that contained references to data that the code shouldn’t have been able to see. Nobody to my recollection ever made an obvious back door but nonobvious ones were fairly common.

I don’t know where things are today because I don’t use Java anymore, but if you want to give some code access to a single file then you’re in good hands. If you want to keep them from exfiltrating data you might find yourself in an Eternal Vigilance situation, in which case you’ll have to keep on top of security fixes.

We did a whole RBAC system as a thin layer on top of JAAS. Once I figured out a better way to organize the config it wasn’t half bad. I still got too many questions about it, which is usually a sign of ergonomic problems that people aren’t knowledgeable enough to call you out on. But it was a shorter conversation with fewer frowns than the PoC my coworker left for me to productize.

bhelx · 2025-02-25T18:23:10 1740507790

WASI does open up some holes you should be considerate of. But it's still much safer than other implementations. We don't allow you direct access to the FS we use jimfs: https://github.com/google/jimfs

I typically recommend people don't allow wasm plugins to talk to the filesystem though, unless they really need to read some things from disk like a python interpreter. You don't usually need to.

worthless-trash · 2025-02-26T02:39:56 1740537596

I wouldn't say 100% safe. I was able to abuse the JVM to use spectre gadgets to find secret memory contents (aka private keys) on the JVM. It was tough but lets not overexagerate about JVM safety.

pjmlp · 2025-02-26T08:38:41 1740559121

You can have some fun with WebAssembly as well regarding spectre.

> Unfortunately, Spectre attacks can bypass Wasm's isolation guarantees. Swivel hardens Wasm against this class of attacks by ensuring that potentially malicious code can neither use Spectre attacks to break out of the Wasm sandbox nor coerce victim code—another Wasm client or the embedding process—to leak secret data.

https://www.usenix.org/conference/usenixsecurity21/presentat...

People have to stop putting WebAssembly in some pedestral of bytecode formats.

jmillikin · 2025-02-26T09:29:21 1740562161

WebAssembly doesn't have access to the high-resolution timers needed for Spectre attacks unless the host process intentionally grants that capability to the sandboxed code.

See this quote from the paper you linked:

""" Our attacks extend Google’s Safeside [24] suite and, like the Safeside POCs, rely on three low-level instructions: The rdtsc instruction to measure execution time, the clflush instruction to evict a particular cache line, and the mfence instruction to wait for pending memory operations to complete. While these instructions are not exposed to Wasm code by default, we expose these instructions to simplify our POCs. """

The security requirements of shared-core hosting that want to provide a full POSIX-style API are unrelated to the standard use of WebAssembly as an architecture-independent intermediate bytecode for application-specific plugins.

'gf000 correctly notes that WebAssembly's security properties are basically identical to any other interpreter, and there's many options for bytecodes (or scripting languages) that can do some sort of computation without any risk of a sandbox escape. WebAssembly is distinguished by being a good generic compilation target and being easy to write efficient interpreters/JITs for.

pjmlp · 2025-02-26T10:56:09 1740567369

WebAssembly doesn't exist in isolation, it needs host process to actually execute.

So whatever security considerations are to be taken from bytecode semantics, they are useless in practice, which keeps being forgotten by its advocates.

As they, and you point out, "WebAssembly's security properties are basically identical to any other interpreter,..."

The implementation makes all the difference.

jmillikin · 2025-02-26T11:41:49 1740570109

The WebAssembly bytecode semantics are important to security because they make it possible to (1) be a compilation target for low-level languages, and (2) implement small secure interpreters (or JITs) that run fast enough to be useful. That's why WebAssembly is being so widely implemented.

Java was on a path to do what WebAssembly is doing now, back in the '90s. Every machine had a JRE installed, every browser could run Java applets. But Java is so slow (and its sandboxing design so poor) that the world gave up on Java being able to deliver "compile once run anywhere".

If you want to take a second crack at Sun's vision, then you can go write your own embedded JVM and try to convince people to write an LLVM backend for it. The rest of us gave up on that idea when applets were removed from browsers for being a security risk.

pjmlp · 2025-02-26T12:07:46 1740571666

People talk all the time about Java, while forgeting such king of polyglot bytecodes exist since 1958, there are others that would be quite educating to learn about instead of always using Java as an example.

jmillikin · 2025-02-26T12:52:20 1740574340

Ok, show me a bytecode from the 60s (or 90s!) to which I can compile Rust or Go and then execute with near-native performance with a VM embedded in a standard native binary.

The old bytecodes of the 20th century were designed to be a compilation target for a single language (or family of closely-related languages). The bytecode for Erlang is different from that of Smalltalk is different from that of Pascal, and that's before you start getting into the more esoteric cases like embedded Forth.

The closest historical equivalent to today's JVM/CLR/WebAssembly I can think of is IBM's hardware-independent instruction set, which I don't think could be embedded and definitely wasn't portable to microcomputer architectures.

pjmlp · 2025-02-26T13:06:39 1740575199

The extent of how each bytecode was used doesn't invalidate their existence.

Any bytecode can be embedded, it is a matter of implementation.

> The Architecture Neutral Distribution Format (ANDF) in computing is a technology allowing common "shrink wrapped" binary application programs to be distributed for use on conformant Unix systems, translated to run on different underlying hardware platforms. ANDF was defined by the Open Software Foundation and was expected to be a "truly revolutionary technology that will significantly advance the cause of portability and open systems",[1] but it was never widely adopted.

https://en.wikipedia.org/wiki/Architecture_Neutral_Distribut...

> The ACK's notability stems from the fact that in the early 1980s it was one of the first portable compilation systems designed to support multiple source languages and target platforms

https://en.wikipedia.org/wiki/Amsterdam_Compiler_Kit

> More than 20 programming tools vendors offer some 26 programming languages — including C++, Perl, Python, Java, COBOL, RPG and Haskell — on .NET.

https://news.microsoft.com/2001/10/22/massive-industry-and-d...

Plenty more examples available to anyone that cares to dig what happened after UNCOL idea came to be in 1958.

Naturally one can always advocate that since 60 years of history have not provided that very special feature XYZ, we should now celebrate WebAssembly as the be all end all of bytecode, as startups with VC money repurpose old ideas newly wrapped.

jmillikin · 2025-02-26T14:04:11 1740578651

  > The extent of how each bytecode was used doesn't invalidate their existence.

It does, because uptake is the proof of suitability to purpose. There's no credit to just being first to think of an idea, only in being first to implement it well enough that everyone wants to use it.

  > Any bytecode can be embedded, it is a matter of implementation.

Empty sophistry is a poor substitute for thought. Are you going to post any evidence of your earlier claim, or just let it waft around like a fart in an elevator?

In particular, your reference to ANDF is absurd and makes me think you're having this discussion in bad faith. I remember ANDF, and TenDRA -- I lost a lot of hours fighting the TenDRA C compiler. Nobody with any familiarity with ANDF would put it in the same category as WebAssembly, or for that matter any other reasonable bytecode.

For anyone who's reading this thread, check out the patent (https://patents.google.com/patent/EP0464526A2/en) and you'll understand quickly that ANDF is closer to a blend of LLVM IR and Haskell's Cmm. It's designed to be used as part of a multi-stage compiler, where part of the compiler frontend runs on the developer system (emitting ANDF) and the rest of the frontend + the whole backend + the linker runs on the target system. No relationship to WebAssembly, JVM bytecode, or any other form of bytecode designed to be executed as-is with predictable platform-independent semantics.

  > More than 20 programming tools vendors offer some 26 programming languages
  > — including C++, Perl, Python, Java, COBOL, RPG and Haskell — on .NET.

I want to see you explain why you think the CLR pre-dates the JVM. Or explain why you think C++/CLI is the same as compiling actual standard C/C++ to WebAssembly.

  > Naturally one can always advocate that since 60 years of history have not 
  > provided that very special feature XYZ, we should now celebrate WebAssembly
  > as the be all end all of bytecode, as startups with VC money repurpose old
  > ideas newly wrapped.

Yes, it is in fact normal to celebrate when advances in compiler implementation, security research and hardware performance enable a new technology that solves many problems without any of the downsides that affected previous attempts in the same topic.

If you reflexively dislike any technology that is adopted by startups, and then start confabulating nonsense to justify your position despite all evidence, then the technology isn't the problem.

pjmlp · 2025-02-26T16:18:34 1740586714

> It does, because uptake is the proof of suitability to purpose. There's no credit to just being first to think of an idea, only in being first to implement it well enough that everyone wants to use it.

Depends on how the sales pitch of those selling the new stack goes.

> Empty sophistry is a poor substitute for thought. Are you going to post any evidence of your earlier claim, or just let it waft around like a fart in an elevator?

Creative writing, some USENET flavour, loving it.

> In particular, your reference to ANDF is absurd and makes me think you're having this discussion in bad faith. I remember ANDF, and TenDRA -- I lost a lot of hours fighting the TenDRA C compiler. Nobody with any familiarity with ANDF would put it in the same category as WebAssembly, or for that matter any other reasonable bytecode.

It is a matter of prior art, not what they achieved in practice.

> I want to see you explain why you think the CLR pre-dates the JVM. Or explain why you think C++/CLI is the same as compiling actual standard C/C++ to WebAssembly.

I never written that the CLR predates the JVM, where is that can you please point us out?

C++/CLI is as standard C and C++, as using emscripten clang extensions for WebAssembly integration with JavaScript.

But I tend to forget at the eyes of FOSS folks, clang and GCC language extensions are considered regular C and C++, as if defined by ISO themselves.

> Yes, it is in fact normal to celebrate when advances in compiler implementation, security research and hardware performance enable a new technology that solves many problems without any of the downsides that affected previous attempts in the same topic.

Naturally, when folks are honest about the actual capabilities and the past they build upon.

I love WebAssembly Kubernetes clusters reinventing application servers, by the way, what a cool idea!

worthless-trash · 2025-03-01T11:15:16 1740827716

I love that i can get this kind of depth of conversation in hn.

pjmlp · 2025-02-25T18:31:34 1740508294

Pssst, it is the usual WebAssembly sales pitch.

Linear memory accesses aren't bound checked inside the linear memory segment, thus data can still be corrupted, even if it doesn't leave the sandbox.

Also just like many other bytecode based implementations, it is as safe as the implementations, that can be equally attacked.

https://webassembly.org/docs/security/

https://www.usenix.org/conference/usenixsecurity20/presentat...

https://www.usenix.org/conference/usenixsecurity21/presentat...

https://www.usenix.org/conference/usenixsecurity22/presentat...

AgentME · 2025-02-25T21:58:11 1740520691

WebAssembly being described as a sandbox is perfectly valid. Applications with embedded sandboxes for plugins use the sandbox to protect the application from the plugin, not to protect the plugin from itself. The plugin author can protect the plugin from itself by using a memory-safe language that compiles to WebAssembly; that's on them and not on the embedding application.

pjmlp · 2025-02-26T06:30:01 1740551401

Except the tiny detail that the whole application is responsible for everything it does, including the behaviour of plugins it decides to use, so if the plugin can be exposed to faulty behaviour on its outputs, that will influence the expected behaviour from the host with logic building on those outputs, someone will be very happy and write a blog post with a funny name.

andreaTP · 2025-02-25T18:45:58 1740509158

Looking forward to seeing more Chicory in Bazel, is a great use-case! Thanks for spearheading it!

throwaway894345 · 2025-02-26T15:13:21 1740582801

> Java doesn't have easy access to the platform-specific security mechanisms (seccomp, etc) that are used by native tools to sandbox their plugins, so it's nice to have WebAssembly's well-designed security model in a pure-JVM library.

I thought Java had all of this sandboxing stuff baked in? Wasn't that a big selling point for the JVM once upon a time? Every other WASM thread has someone talking about how WASM is unnecessary because JVM exists, so the idea that JVM actually needs WASM to do sandboxing seems pretty surprising!

jmillikin · 2025-02-26T23:30:45 1740612645

The JVM was designed with the intention of being a secure sandbox, and a lot of its early adoption was as Java applets that ran untrusted code in a browser context. It was a serious attempt by smart people to achieve a goal very similar to that of WebAssembly.

Unfortunately Java was designed in the 1990s, when there was much less knowledge about software security -- especially sandboxing of untrusted code. So even though the goal was the same, Java's design had some flaws that made it difficult to write a secure JVM.

The biggest flaw (IMO) was that the sandbox layer was internal to the VM: in modern thought the VM is the security boundary, but the JVM allows trusted and untrusted code to execute in the same VM, with java.lang.SecurityManager[0] and friends as the security mechanism. So the attack surface isn't the bytecode interpreter or JIT, it's the entire Java standard library plus every third-party module that's linked in or loaded.

During the 2000s and 2010s there were a lot of Java sandbox escape CVEs. A representative example is <https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2013-0422>. Basically the Java security model was broken, but fixing it would break backwards compatibility in a major way.

--

Around the same time (early-mid 2010s) there more thought being put into sandboxing native code, and the general consensus was:

- Sandboxing code within the same process space requires an extremely restricted API. The original seccomp only allowed read(), write(), exit(), and sigreturn() -- it could be used for distributed computation, but compiling existing libraries into a seccomp-compatible dylib was basically impossible.

- The newly-developed virtualization instructions in modern hardware made it practical to run a virtual x86 machine for each untrusted process. The security properties of VMs are great, but the x86 instruction set has some properties that make it difficult to verify and JIT-compile, so actually sitting down and writing a secure VM was still a major work of engineering (see: QEMU, VMWare, VirtualBox, and Firecracker).

Smartphones were the first widespread adoption of non-x86 architectures among consumers since PowerPC, and every smartphone had a modern web browser built in. There was increasing desire to have something better than JavaScript for writing complex web applications executing in a power-constrained device. Java would have been the obvious choice (this was pre-Oracle), except for the sandbox escape problem.

WebAssembly combines architecture-independent bytecode (like JVM) with the security model of VMs (flat memory space, all code in VM untrusted). So you can take a whole blob of legacy C code, compile it to WebAssembly, and run it in a VM that runs with reasonable performance on any architecture (x86, ARM, RISC-V, MIPS, ...).

throwaway894345 · 2025-02-27T20:01:04 1740686464

What a thorough, excellent response. I learned a lot, thank you for taking the time to write this up!

jmillikin · 2025-02-18T09:11:58 1739869918

> the rationalist community grew around publications by Eliezer Yudkowsky

Rationalism (in its current form) has been around since long before someone on the internet became famous for their epic-length Harry Potter fanfiction, and it will continue to exist long after LessWrong has become a domain parking page.

hollerith · 2025-02-18T13:26:18 1739885178

Sure, but currently we are discussing (inaccurate portrayals of) the community that grew starting in 2006 around Eliezer's writings. I regret that there is no better name for this community. (The community has tried to acquire a more descriptive name, but none have stuck.)

lupusreal · 2025-02-18T11:22:29 1739877749

The LessWrong affiliated "Rationalists" are to lower-case rationalism as the "People's Democratic Republic of Korea" are to democracy.

defrost · 2025-02-18T11:34:58 1739878498

Please refrain from belittling the PDRoK with such comparisons, they do produce some extraordinary accordian players after all.

lupusreal · 2025-02-18T11:58:00 1739879880

I love the Moranbong Band.

jmillikin · 2025-02-18T08:58:41 1739869121

The trouble with PG&E is that it's trying to serve two incompatible goals.

The shareholders want it to provide electric service for a profit in the locales where doing so is economically sensible (= urban/suburban), slowly grow its value, and throw off a stable stream of dividends. This is the basic value proposition of all for-profit utilities: low growth, low volatility, stable income.

The state government -- and a not insubstantial proportion of the state population -- want PG&E to be a non-profit that provides electricity at cost to everyone in its coverage area, which is to include huge swaths of forest-covered hillsides and dry rural scrubland. Every time it gets mentioned on HN (not exactly a hotbed of communism!) there's a bunch of comments about how it should be illegal for an electric utility to have any profit at all.

PG&E can't have it both ways. It hasn't paid a non-trivial dividend since 2017 and its share price is ~half of what it was 20 years ago, which makes it an astonishingly poor investment -- compare to Southern Company (SO) or Duke Energy (DUK). But at the same time it is legally mandated to absorb the costs of operating high-voltage lines in brushfire territory, and half its customers think it shouldn't be allowed to exist.