This is simply not true. You're massively underestimating the overhead of a bytecode VM. Even the most heavily optimized bytecode VMs are easily beaten by the simplest of JITs, written in 100 lines of C: https://arxiv.org/pdf/1604.01290.pdf
Code doesn't even need to be "hot" to make it worth it. WebKit switches from the interpreter to cheap baseline compilation, without any speculative optimizations or type information, after only 6 calls of a function: https://webkit.org/blog/3362/introducing-the-webkit-ftl-jit/
This article literally describes how baseline JIT is worth it simply to remove the bytecode VM dispatch overhead.
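To make the dispatch-overhead point concrete, here is a rough sketch of the idea in Python. Everything in it (the toy opcodes, `interpret`, `baseline_compile`, the `Function` wrapper) is made up for illustration and is not how WebKit implements its tiers; only the 6-call threshold comes from the comment above. The "baseline compiler" just pastes one fixed snippet per opcode into straight-line code, with no optimization and no type information, so the fetch/decode/branch work happens once at compile time instead of on every executed instruction. Since the generated code still runs on CPython, this shows the structure only and says nothing about real speedups.

```python
# Toy stack bytecode: a list of (opcode, operand) pairs. All names are hypothetical.
PUSH, LOAD_ARG, ADD, MUL, RET = range(5)

def interpret(code, arg):
    """Bytecode interpreter: every instruction pays for a fetch, a decode, and a branch."""
    stack, pc = [], 0
    while True:
        op, operand = code[pc]      # fetch
        pc += 1
        if op == PUSH:              # decode + dispatch, repeated on every execution
            stack.append(operand)
        elif op == LOAD_ARG:
            stack.append(arg)
        elif op == ADD:
            b = stack.pop(); a = stack.pop()
            stack.append(a + b)
        elif op == MUL:
            b = stack.pop(); a = stack.pop()
            stack.append(a * b)
        elif op == RET:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {op}")

def baseline_compile(code):
    """Template-style translation: one fixed snippet per opcode, no optimization.

    The opcode dispatch happens once here, not each time the function runs.
    """
    lines = ["def compiled(arg):", "    stack = []"]
    for op, operand in code:
        if op == PUSH:
            lines.append(f"    stack.append({operand!r})")
        elif op == LOAD_ARG:
            lines.append("    stack.append(arg)")
        elif op == ADD:
            lines.append("    b = stack.pop(); a = stack.pop(); stack.append(a + b)")
        elif op == MUL:
            lines.append("    b = stack.pop(); a = stack.pop(); stack.append(a * b)")
        elif op == RET:
            lines.append("    return stack.pop()")
        else:
            raise ValueError(f"unknown opcode {op}")
    namespace = {}
    exec("\n".join(lines), namespace)
    return namespace["compiled"]

TIER_UP_CALLS = 6  # threshold quoted above; the policy around it is a simplification

class Function:
    """Runs in the interpreter until it has been called TIER_UP_CALLS times."""
    def __init__(self, code):
        self.code = code
        self.calls = 0
        self.compiled = None

    def __call__(self, arg):
        if self.compiled is not None:
            return self.compiled(arg)       # no per-instruction dispatch any more
        self.calls += 1
        if self.calls >= TIER_UP_CALLS:     # not "hot", just called a handful of times
            self.compiled = baseline_compile(self.code)
        return interpret(self.code, arg)

# Computes arg * 3 + 1.
PROGRAM = [(LOAD_ARG, None), (PUSH, 3), (MUL, None), (PUSH, 1), (ADD, None), (RET, None)]
```

Calling `Function(PROGRAM)` ten times runs the first six calls through `interpret` and the rest through the generated straight-line function, with identical results either way.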
It's not clear to me why you think that the linked article shows that naive JITs perform better across-the-board than well-written bytecode interpreters. The reported speed improvement from JIT itself is a mere 2.3x, and the listed benchmarks mostly involve numerical code, where JIT tends to be effective.
A simple JIT can get you to the point where you reliably outperform a bytecode interpreter for certain types of code. What takes a lot more engineering effort is reliably performing at least as fast as a bytecode VM for all types of code.
MJIT is completely different from PyPy. PyPy's problems are simply not relevant to MJIT. MJIT is already just as fast as standard CRuby when executing complex Rails code, minus the small overhead for JIT compilation: http://engineering.appfolio.com/appfolio-engineering/2018/3/...
PyPy is a much more ambitious design, completely replacing CPython, and using an unusual JIT scheme of tracing the interpreter itself and trying to produce an interpreter optimized for particular traces of your code.
It was much harder for that approach to reach the same level of general performance as it seems to have been for CRuby & MJIT.