
> I wonder where we’d be if the idea of CPU-independent bytecode had ever really taken off [...]. You could have an AOT compiler in the system firmware which the OS invokes [...].

You could say modern CPUs are kind of like tracing JITs. On one hand, a normal tracing JIT has much more memory to save its work than a CPU’s trace cache, but on the other, the superscalar reordering and renaming stuff is even more aggressive than a trace recorder about looking at how the code actually executes and deriving assumptions from that instead of attempting to prove them statically.

Why not AOT instead? In part because they can’t, of course—a tracing JIT requires about the least amount of heavyweight compiler tech out of all the possibilities, which is an advantage if you’re trying to fit the compiler into silicon. (That’s not to say a tracing JIT is easy—the cost of a simple compiler is that you need to make it hella fast for the result to be any good.)

But in part I suspect it’s because a standard assembly-level bytecode kind of sucks to compile ahead of time. About the most useful assumptions such a compiler can make are which things don’t interfere with others, usually memory operations, or perhaps which writes can be forwarded to reads. A tracing JIT can see some of this, a superscalar even more so; an AOT or function-at-a-time JIT, in the absence of any aliasing information or even knowing when one object ends and another begins (boo WebAssembly), can’t.
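
To make the aliasing point concrete, here is a toy sketch in Python (nothing like a real compiler; the function names and the dict-as-memory model are made up for illustration). The "program" is store [a] <- 1; store [b] <- 2; load [a]. Forwarding the first store to the load is only correct when a and b don't alias, which an AOT compiler looking at flat addresses can't prove, while a trace-style version can simply guard on what it observed:

```python
# Toy model: "memory" is a dict from address to value.
# All names here are hypothetical, for illustration only.

def run_reference(mem, a, b):
    """Unoptimised semantics: store [a] <- 1; store [b] <- 2; load [a]."""
    mem[a] = 1
    mem[b] = 2
    return mem[a]

def run_aot_forwarded(mem, a, b):
    """Ahead-of-time 'optimisation' that forwards the first store to the load.
    Unsound: it silently assumes a and b never alias."""
    mem[a] = 1
    mem[b] = 2
    return 1  # forwarded value, wrong whenever a == b

def run_traced(mem, a, b):
    """Trace-style version: guard on the non-aliasing that was observed,
    and fall back to the reference path when the guard fails."""
    if a != b:  # guard checked at run time
        mem[a] = 1
        mem[b] = 2
        return 1  # forwarding is safe under the guard
    return run_reference(mem, a, b)

if __name__ == "__main__":
    for a, b in [(0, 8), (0, 0)]:
        print(a, b,
              run_reference({}, a, b),
              run_aot_forwarded({}, a, b),
              run_traced({}, a, b))
```

With distinct addresses all three agree; with a == b the reference and guarded versions return 2 and the blindly forwarded one returns 1. Per-object bounds or aliasing information would be what lets the AOT version drop the guard safely.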

Ironically, memory segmentation as in the Intel 432 or 286 (or the IBM dinosaurs) feels like it could help with that (or are we calling this idea “capability-based” once again?). Does anyone who isn’t just a speculating dilettante (unlike me) think that’s a reasonable thought?

(Wait, is a selector table just a Smalltalk-style object table with a fake moustache?)

Of course, even then we’d still have the problem that VLIW microcode wide enough to require no decoding and engage the entirety of a modern CPU’s physical register file and execution units would be cripplingly slow to fetch from DRAM, and the “legacy” ISAs partly serve as a compression format.


