It's interesting to read the OP's response to partial evaluation (employed by Truffle) and why he switched to meta-tracing in RPython. Basically, the inlining/specialization decisions in PE-based JITs were harder to control than just letting the VM observe loop iterations.
Here's a paper regarding a pair of language implementors' experience using both techniques:
http://stefan-marr.de/papers/oopsla-marr-ducasse-meta-tracin...
By changing the abstract syntax tree during execution, type-generic nodes in the AST can be replaced with type-specialized nodes. This carries more overhead than tracing through the interpreter. It also requires more explicit work from the user to achieve optimized results.
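To make that concrete, here's a toy sketch in Python (the node classes are made up for illustration; this is not how Truffle actually implements it): a generic Add node observes its operand types on first execution and rewrites itself into a specialized version, with a fallback if the assumption later breaks:

    class AddNode:
        # Type-generic node: checks operand types on every execution.
        def __init__(self, left, right):
            self.left, self.right = left, right

        def execute(self):
            a, b = self.left.execute(), self.right.execute()
            if isinstance(a, int) and isinstance(b, int):
                # Specialize: rewrite this node in place so future
                # executions start on the int-only fast path.
                self.__class__ = IntAddNode
            return a + b

    class IntAddNode(AddNode):
        # Type-specialized node: assumes ints, de-specializes otherwise.
        def execute(self):
            a, b = self.left.execute(), self.right.execute()
            if not (isinstance(a, int) and isinstance(b, int)):
                self.__class__ = AddNode  # assumption broken: go generic again
            return a + b

    class Const:
        def __init__(self, value):
            self.value = value
        def execute(self):
            return self.value

    tree = AddNode(Const(1), Const(2))
    tree.execute()                       # first run sees int operands...
    assert isinstance(tree, IntAddNode)  # ...and the node has specialized itself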
The authors of the above study conclude that the resulting performance was similar, but that meta-tracing was an easier technique to use than PE. This agrees with the OP's assessment.
Not entirely related, but it's wild to see projects that I remember the announcement of (I was working in a Python shop at the time) turning 15+ years old. I'm not sure I'm even past thinking of PyPy as kinda newfangled (that's just the bucket it got dumped into in my head: cool new Python compiler/VM thing). Probably time to re-categorize it in my brain as something more like "time-tested" or something.
I migrated some internal tools from Lua to Python because of the incredible ecosystem. However, every day I miss the speed that Lua gets from LuaJIT. I've already ported almost all of the critical code to Cython; it's on my todo list to undo that port and test with PyPy.
I feel that PyPy needs to receive much more love and attention than it actually gets.
I already donated, and I've put all my hopes on PyPy. Windows support is also quite neglected; the multiprocessing module is unusable there. I hope to get some time soon to help this incredible project and finish lib_pypy/_winapi.py.
The Julia compiler is a little bit different in the sense that it actually compiles ahead of time (it is not a tracing JIT). This takes place just before a method (with known runtime types for the arguments) is executed for the first time. So performance is predictable, but on the other hand there is a risk of over-specialization.
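A toy illustration of that model, sketched in Python (the decorator and cache are invented for this example): one specialization per tuple of concrete argument types, created just before the first call with those types; the over-specialization risk falls out naturally, since every new type combination costs another compile:

    _specializations = {}   # (function, argument types) -> "compiled" version

    def specialize_on_types(func):
        # Stand-in for the compile-before-first-call behavior: pick (or
        # create) a specialization keyed on the concrete runtime types.
        def wrapper(*args):
            key = (func, tuple(type(a) for a in args))
            if key not in _specializations:
                # A real compiler would generate type-specialized machine
                # code here, once, ahead of executing the method body.
                print(f"compiling {func.__name__} for {key[1]}")
                _specializations[key] = func
            return _specializations[key](*args)
        return wrapper

    @specialize_on_types
    def add(a, b):
        return a + b

    add(1, 2)       # "compiles" add for (int, int)
    add(1.0, 2.0)   # separate specialization for (float, float)
    add(3, 4)       # cache hit: no new compile, predictable performance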
I see Go (also) occupying that niche. Maybe that's because of my (cloud technology) bubble, but 90%+ of the interesting new projects are in Go (not just web services or container runtimes, also CLI tools).
That has always been my reason for not giving Cython a real chance. I'm afraid I would go through the trouble of porting everything, then hit a use case where I need to just run pure Python.
I've considered having a compat layer where you import dummy versions of all the classes and decorators if cython isn't present. Has anyone ever tried that?
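Something like the sketch below would be a starting point (the stub covers only a couple of names; extend it to whatever subset you use, and note that installing Cython already gives you a pure-Python "shadow" cython module, so the fallback only matters when Cython isn't installed at all):

    # Compat layer: use the real cython module when available,
    # otherwise fall back to no-op stand-ins.
    try:
        import cython
    except ImportError:
        class _CythonStub:
            compiled = False
            @staticmethod
            def cfunc(func):      # @cython.cfunc becomes a no-op
                return func
            @staticmethod
            def cclass(cls):      # @cython.cclass becomes a no-op
                return cls
            @staticmethod
            def declare(_type=None, value=None):
                return value
            int = int             # crude type stand-ins
            double = float
        cython = _CythonStub()

    @cython.cfunc
    def add(a, b):
        return a + b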
I don't know about your question, but Cython recently added [1] the option to use native Python type annotations (Python 3.6, PEP 484) instead of its previous C-like dialect. That could lead to a situation where a sensibly type-annotated Cython file can still be a valid Python file – that would help with portability.
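For illustration, a file in that style (it runs as plain Python, since the annotations are inert under CPython, and becomes typed C code when compiled with Cython):

    import cython

    # Valid Python *and* valid Cython: the annotations are ignored by
    # CPython but become C types when the file is cythonized.
    def f(x: cython.double) -> cython.double:
        return x ** 2 - x

    print(f(3.0))   # works identically interpreted or compiled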
Cython also supports writing type annotations in a separate .pxd file. It's a bit awkward because you're effectively duplicating all function + variable declarations, but it _is_ a valid way to keep the original Python source intact.
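Roughly like this, with hypothetical file names; the .py stays plain Python while the matching .pxd duplicates the signatures with C types:

    # mymodule.py -- untouched, runs fine under plain CPython
    def scale(x, factor):
        return x * factor

    # mymodule.pxd -- picked up automatically when mymodule.py is
    # compiled with Cython; the declaration duplicates the signature:
    #
    #     cpdef double scale(double x, double factor)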
(On a side note, using this feature led to a particularly memorable debugging session that set a personal record in terms of time spent vs amount of code changed to fix the issue: https://news.ycombinator.com/item?id=11115110 ).
It's so fascinating to read these sorts of retrospectives; they ground me in the realization that it takes time and lots of missteps to make great software. Thanks to the author, if he passes through here.
For some reason, Firefox reader view removes all section headings and some random paragraphs. Made for a confusing read until I went back and reread it.
On the other hand, I think it is somewhat detrimental to the success of the idea of meta-tracing that the only working production-grade implementation of it is RPython. RPython is very idiosyncratic, to say the least. I think a good implementation of meta-tracing on boring technology (say, C++) would greatly help popularize the idea.
I'm just sad that compatibility with data science and scientific computing tools isn't there, AFAIK: numpy, pandas, scipy, tensorflow, scikit-learn, and pytorch, to name a few.