On Apple silicon, there's a fun new twist: the instruction (I) and data (D) caches are not kept coherent, so you have to remember to call sys_icache_invalidate [1] to keep them in sync!
If you don't handle this correctly, then you'll have a Very Fun time [2].
What's especially Fun is that if you halt your program in a debugger and read back the contents of the JIT buffer, it will look perfectly correct – because those reads are coming through RAM / DCache, not ICache!
Arm, rather, and many other architectures besides. Even x86 does not guarantee complete I/D cache coherency (although its guarantees are much stronger than arm's).
x86? Yes, coherence is basically at the level of traces. If you modify some code and then jump to it, you're guaranteed to get the new code; but if you modify some code within a currently running trace, the results are unspecified.
The term "dynamic compilation" was popular in the Smalltalk community due to the 1984 paper by Deutsch and Schiffman that described what became the ParcPlace Smalltalk.
In Self 1 and 2 we also used dynamic compilation, and we added adaptive compilation in Self 3. The technology moved from Sun to Animorphic for the Strongtalk project; Animorphic was then bought by Sun, and the technology became Java's HotSpot.
JIT was very popular among Japanese car manufacturers and the term was borrowed for Java. Note that Java only had simple interpreters in its first few years so we are talking the late 1990s here.
A bit off topic, but Eli Bendersky's entire blog is absolutely amazing. I started writing a technical blog inspired by his. Most of his recent posts seem to be about Go, which I don't know much about, but I found myself going back to some of his older posts many times.
> Whenever a program, while running, creates and runs some new executable code which was not part of the program when it was stored on disk, it’s a JIT.
By this definition, the Emacs native compiler would be a JIT, even though it is considered an AOT compiler.
I think the definition needs another caveat: a JIT uses some runtime information to generate the machine code.
It makes me wonder if there are any JIT compilers that generate code based on runtime data, but not at runtime? Similar to profile guided optimizations in static languages. Basically collect your traces and runtime info, then generate the code offline. This would let you take more time to optimize, and you could save the new code to disk and reuse it.
By your definition, WebAssembly doesn't use a JIT, because it doesn't use any runtime information.
In practice, the JIT/AOT distinction boils down to whether compilation time counts as part of execution time, or not. If cycles you spend optimizing your code are cycles you don't spend running it, you make different design decisions.
(Note also that the Emacs native compiler is built using libgccjit.)
I believe GraalVM has the ability to take an optimization snapshot at runtime, both to build a pre-optimized boot image and, I think, to optimize its native images. I'm not entirely sure whether GraalVM native images run a JIT.
I first learned about this in high school from "A Basic Just-In-Time Compiler"[1]. It was mind-blowing that you could just cast some data storing machine code to a function pointer and execute it. Trying the program now, it seems GCC no longer accepts it with -pedantic-errors, because casts between object and function pointers aren't defined by the standard (they're only a common extension).
It is a silly term. It is “on-demand compilation”, with the latency that entails. “Just-in-time compilation” would be finished right when you need to execute the code.
One underrated way to JIT is to write some code, spawn a compiler, then link to the resulting shared object. It takes two minutes to write and sidesteps a lot of complexity.
An example is Ruby's MJIT: https://blog.heroku.com/ruby-mjit. I'm not a big fan of this strategy - as a matter of fact, Ruby's future (reference) JIT is a conventional JIT.
So, the article is actually about running the code you've JIT-compiled and have in memory - not about the actual JIT-compilation. But ok, that (easier) part is useful to go over too.
My dream is to create an efficient multithreaded runtime that JITs. I came across this article for creating an AST in C and simple machine code generation.
Erlang is multithreaded and JITs on some platforms, if you're not going to No-True-Scotsman the Erlang JIT for not using runtime information. The main goal of the Erlang JIT is to eliminate interpretation overhead, rather than any of the deeper analysis goals of some other popular JITs.
Libjit is fun to play around with, I wrote a python wrapper around it, but the more you play the more you realize how much you need to know to truly use it in anger.
Those were the days before wasm, and I didn't really have any concrete plans (some cockamamie Blender-related idea, I'm sure), so I never did anything with it. It would probably be fairly trivial to make a wasm front end for it, though.
Worth noting that the 2-phase split isn’t really there in JITs that have been optimized to the limit because they tend to do some amount of self-modifying code and to make that great, they interleave the compiler IR with runtime data structures. Not saying it has to be that way, only that it often is, for good reasons.
I never quite understood what the advantage of a JIT runtime is. A regex that gets compiled to machine code but is specified at runtime makes sense, but as a general language paradigm, why is this useful compared to ahead-of-time compilation of code that is known ahead of time? Is there some efficiency to be had from doing JIT vs. not, some heuristic you can deduce to find hot loops and optimize them as the program runs?
JITs are generally used for dynamic languages, where every value's type needs to be checked and untagged. A JIT compiler can observe the actual types used at runtime and specialize the code for them, which allows these languages to perform closer to statically typed ones.
[1] https://developer.apple.com/library/archive/documentation/Sy... [2] https://twitter.com/impraxical/status/1579586601117548544