The issue is not AOT vs JIT, but one of language semantics and levels of abstraction.

You could prove this point in a trivial way by JIT compiling C with LLVM. The program would have an increased startup cost, but would otherwise run identically to an AOT-compiled C program. It might even run faster, since the LLVM JIT could target the host CPU directly, which is the equivalent of passing -march=native to an AOT compiler. (In an AOT environment, you can only do this if you can guarantee that the binary will only run on the machine it was compiled on, or if you otherwise know the exact specs of the target machine.)
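To make the -march=native point concrete, here is a minimal sketch in C. The function name `saxpy` and the file name in the comments are my own illustration; the flags are standard GCC/Clang options, and whether the tuned build is actually faster depends on the CPU and the compiler version.

```c
#include <stddef.h>

/* y[i] = a * x[i] + y[i]: a simple loop that compilers can usually
 * auto-vectorize.
 *
 * Portable AOT build (baseline instruction set for the architecture):
 *     cc -O3 -c saxpy.c
 * Build tuned to the machine doing the compiling (may use wider vector
 * instructions if that CPU has them):
 *     cc -O3 -march=native -c saxpy.c
 *
 * The second object file may fault with an illegal instruction on a CPU
 * that lacks those extensions. A JIT sidesteps this because it always
 * compiles on the machine the code will run on.
 */
void saxpy(size_t n, float a, const float *x, float *y)
{
    for (size_t i = 0; i < n; i++) {
        y[i] = a * x[i] + y[i];
    }
}
```

The source is identical in both builds; only the instruction selection differs, which is the sense in which the AOT/JIT distinction is not about the language.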

A much more interesting example is Terra[1], a language with C-like semantics embedded inside Lua. The Terra-Lua combination allows you to use Lua as a sort of replacement for C++ templates, giving you much better metaprogramming than traditional C/C++. This allows you to do cool things like implement a version of matrix-multiply which is specialized to the target hardware, allowing you to match or almost match the performance of ATLAS and Intel MKL. Mind you, ATLAS and MKL are not your usual C/C++ programs: they are mostly written in hand-tuned (or auto-tuned) assembly.

What we're really looking at here is a difference in abstraction levels. It's not that the JVM JIT is incapable of getting C-like performance. Obviously it can, in certain cases. The issue is that the Java semantics prevent the JVM from getting C-like performance in a predictable and dependable way from idiomatic Java programs. This should not really be a surprise, given how far Java is from C.

What we really need to do is take a step back and reconsider our assumptions. Our modern programming languages lock us into certain assumptions about language design and semantics that can be blinding at times. Consider that even C doesn't get optimal performance in all cases; this is why Fortran still exists. What would it take to get to a programming language with nice semantics and truly optimal performance? I don't have a closed-form answer, but I'm confident that the solution will require revisiting some of the assumptions baked into not just our compiler infrastructures, but also our languages.
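As an aside on the Fortran point: one commonly cited reason is aliasing. Fortran's rules let the compiler assume that an array argument being written to does not overlap the other arguments, while a C compiler must assume two pointers might refer to the same memory unless told otherwise. A minimal sketch using C99's `restrict` (the function names are hypothetical, and how much the generated code differs depends on the compiler):

```c
#include <stddef.h>

/* dst and src may overlap, so the compiler has to preserve the exact
 * element-by-element semantics, or emit a runtime overlap check before
 * taking a vectorized path. */
void scale_may_alias(size_t n, float *dst, const float *src, float k)
{
    for (size_t i = 0; i < n; i++) {
        dst[i] = k * src[i];
    }
}

/* restrict is a promise from the caller that dst and src do not overlap,
 * which is roughly what Fortran assumes by default. Breaking the promise
 * is undefined behavior. */
void scale_no_alias(size_t n, float *restrict dst,
                    const float *restrict src, float k)
{
    for (size_t i = 0; i < n; i++) {
        dst[i] = k * src[i];
    }
}
```

Both functions compute the same result on non-overlapping arrays; the difference is in what the language semantics allow the optimizer to assume, not in whether the code is compiled ahead of time or just in time.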

[1]: http://terralang.org/