When folks talk about the performance of Julia's JIT, they're generally talking about the compile-time overhead (and not the runtime speed). Type inference is one part of the compile-time overhead that enables improved runtime speed... but I think you're actually curious about runtime speed.
Julia aggressively compiles multiple specialized versions of almost all the methods it encounters — one specialization for each unique combination of the argument types you pass. Even if you only define a single method `f(x,y) = 3x+2y`, Julia will compile a specialized floating point implementation when you call `f(2.5, 3.5)`, an integer implementation when you call `f(1, 2)`, and so on. It's this very aggressive compilation of everything many times over that makes Julia infamously slow to start and fast to run.
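A quick way to see this for yourself is to time the first and second calls for each argument-type combination in a fresh session (just a minimal sketch; the exact numbers will vary by machine and Julia version):

```julia
f(x, y) = 3x + 2y

@time f(2.5, 3.5)   # first (Float64, Float64) call: includes JIT compilation
@time f(2.5, 3.5)   # same types again: reuses the cached specialization
@time f(1, 2)       # new (Int64, Int64) combination: compiles a second specialization
@time f(1, 2)       # cached again: essentially free
```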
Inside each of these specializations, Julia concretely knows the types of `x` and `y`. So, while compiling, it can look up exactly which multiplication methods it should use to compute `3x` and `2y`, inline them, infer the types of the results, look up which `+` method to call, and inline that too.
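You can inspect the result of that process with the standard reflection macros (available by default in the REPL, or via `using InteractiveUtils` in a script). For the `f` above, the typed IR of each specialization should boil down to a few type-stable intrinsics, though the exact printout depends on your Julia version:

```julia
using InteractiveUtils   # provides @code_typed and @code_llvm

f(x, y) = 3x + 2y

# The Float64 specialization: * and + resolve to their Float64 methods and
# inline down to floating-point intrinsics, with an inferred Float64 return.
@code_typed f(2.5, 3.5)

# The Int64 specialization does the same, just with integer intrinsics.
@code_typed f(1, 2)
```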
Even if you end up calling bigger functions that don't inline, Julia can hard-code the pointer to the exact specialization of every function you call because it's in a context where it knows the exact types of everything.
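Here's a small sketch of that (my example, with `@noinline` standing in for "too big to inline"): in the LLVM IR for `h(::Float64)` you should see direct calls to the `Float64` specialization of `g`, not a generic runtime dispatch, although the exact IR varies by version.

```julia
using InteractiveUtils

@noinline g(x) = x^2 + 1      # pretend this is too big to inline
h(x) = g(x) + g(x + 1)

# The compiled h(::Float64) calls the Float64 specialization of g directly
# (a hard-coded call target), rather than dispatching at runtime.
@code_llvm h(2.0)
```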
So it's a bit of a chicken-and-egg question. Julia's JIT (typically) compiles specializations of methods with precisely known argument types. That does make inference easier: every function call is a fresh start! And if at any point inference loses the trail, that's ok: Julia just compiles that code pessimistically to handle any type, looking up the exact method/specialization needed for each call on demand (potentially compiling it, and getting a new fresh start, if needed), and then you're back on the happy, well-inferred, super-specialized path.
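As an illustration (my example, not from the question): a container with an abstract element type makes inference lose the trail, so each call out of the loop below is a runtime lookup of the right specialization, and inside that specialization everything is fully inferred again.

```julia
f(x, y) = 3x + 2y

function total(xs)
    s = 0.0
    for x in xs
        # With xs::Vector{Any}, the type of x is unknown here, so this call is a
        # dynamic dispatch: Julia looks up (and compiles, if needed) the exact
        # specialization of f for typeof(x), and inference starts fresh inside it.
        s += f(x, x)
    end
    return s
end

total(Any[1, 2.5, 3])   # mixes Int64 and Float64; each dispatch picks the right f
```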