It's interesting to read the OP's response to partial evaluation (employed by Truffle) and why he switched to meta-tracing in RPython. Basically, the inlining/specialization decisions in PE-based JITs were harder to control than just letting the VM observe loop iterations.
Here's a paper regarding a pair of language implementors' experience using both techniques:
http://stefan-marr.de/papers/oopsla-marr-ducasse-meta-tracin...
By changing the abstract syntax tree during execution, type-generic nodes in the AST can be replaced with type-specialized nodes. This carries more overhead than tracing through the interpreter. It also requires more explicit work from the user to achieve optimized results.
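To make that concrete, here's a toy sketch in Python (the node classes are made up for illustration; this is not how Truffle actually implements it): a generic Add node observes its operand types on first execution and rewrites itself into a specialized version, with a fallback if the assumption later breaks:

    class AddNode:
        # Type-generic node: checks operand types on every execution.
        def __init__(self, left, right):
            self.left, self.right = left, right

        def execute(self):
            a, b = self.left.execute(), self.right.execute()
            if isinstance(a, int) and isinstance(b, int):
                # Specialize: rewrite this node in place so future
                # executions start on the int-only fast path.
                self.__class__ = IntAddNode
            return a + b

    class IntAddNode(AddNode):
        # Type-specialized node: assumes ints, de-specializes otherwise.
        def execute(self):
            a, b = self.left.execute(), self.right.execute()
            if not (isinstance(a, int) and isinstance(b, int)):
                self.__class__ = AddNode  # assumption broken: go generic again
            return a + b

    class Const:
        def __init__(self, value):
            self.value = value
        def execute(self):
            return self.value

    tree = AddNode(Const(1), Const(2))
    tree.execute()                       # first run sees int operands...
    assert isinstance(tree, IntAddNode)  # ...and the node has specialized itself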
The authors of the above study conclude that the resulting performance was similar, but that meta-tracing was an easier technique to use than PE. This agrees with the OP's assessment.
Not entirely related, but it's wild to see projects that I remember the announcement of (I was working in a Python shop at the time) turning 15+ years old. I'm not sure I'm even past thinking of PyPy as kinda newfangled (that's just the bucket it got dumped into in my head: cool new Python compiler/VM thing). Probably time to re-categorize it in my brain as something more like "time-tested" or something.
I migrated some internal tools from Lua to Python because of the incredible ecosystem. However, every day I miss the speed that Lua gets from LuaJIT. I've already ported almost all of the critical code to Cython; it's on my todo list to undo that port and test with PyPy.
I feel that PyPy needs to receive much more love and attention than it actually gets.
I already donated, and I've put all my hopes on PyPy. Windows support is also quite neglected; the multiprocessing module is unusable there. I hope to get some time soon to help this incredible project and finish lib_pypy/_winapi.py.
The Julia compiler is a little bit different in the sense that it actually compiles ahead of time (it is not a tracing JIT). This takes place just before a method (with known runtime types for the arguments) is executed for the first time. So performance is predictable, but on the other hand there is a risk of over-specialization.
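A toy illustration of that model, sketched in Python (the decorator and cache are invented for this example): one specialization per tuple of concrete argument types, created just before the first call with those types; the over-specialization risk falls out naturally, since every new type combination costs another compile:

    _specializations = {}   # (function, argument types) -> "compiled" version

    def specialize_on_types(func):
        # Stand-in for the compile-before-first-call behavior: pick (or
        # create) a specialization keyed on the concrete runtime types.
        def wrapper(*args):
            key = (func, tuple(type(a) for a in args))
            if key not in _specializations:
                # A real compiler would generate type-specialized machine
                # code here, once, ahead of executing the method body.
                print(f"compiling {func.__name__} for {key[1]}")
                _specializations[key] = func
            return _specializations[key](*args)
        return wrapper

    @specialize_on_types
    def add(a, b):
        return a + b

    add(1, 2)       # "compiles" add for (int, int)
    add(1.0, 2.0)   # separate specialization for (float, float)
    add(3, 4)       # cache hit: no new compile, predictable performance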
I see Go (also) occupying that niche. Maybe that's because of my (cloud technology) bubble, but 90%+ of the interesting new projects are in Go (not just web services or container runtimes, also CLI tools).
That has always been my reason for not giving Cython a real chance. I'm afraid I would go through the trouble of porting everything, then hit a use case where I need to just run pure Python.
I've considered having a compat layer where you import dummy versions of all the classes and decorators if cython isn't present. Has anyone ever tried that?
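Something like the sketch below would be a starting point (the stub covers only a couple of names; extend it to whatever subset you use, and note that installing Cython already gives you a pure-Python "shadow" cython module, so the fallback only matters when Cython isn't installed at all):

    # Compat layer: use the real cython module when available,
    # otherwise fall back to no-op stand-ins.
    try:
        import cython
    except ImportError:
        class _CythonStub:
            compiled = False
            @staticmethod
            def cfunc(func):      # @cython.cfunc becomes a no-op
                return func
            @staticmethod
            def cclass(cls):      # @cython.cclass becomes a no-op
                return cls
            @staticmethod
            def declare(_type=None, value=None):
                return value
            int = int             # crude type stand-ins
            double = float
        cython = _CythonStub()

    @cython.cfunc
    def add(a, b):
        return a + b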
I don't know about your question, but Cython recently added [1] the option to use native Python type annotations (Python 3.6, PEP 484) instead of its previous C-like dialect. That could lead to a situation where a sensibly type-annotated Cython file can still be a valid Python file – that would help with portability.
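For illustration, a file in that style (it runs as plain Python, since the annotations are inert under CPython, and becomes typed C code when compiled with Cython):

    import cython

    # Valid Python *and* valid Cython: the annotations are ignored by
    # CPython but become C types when the file is cythonized.
    def f(x: cython.double) -> cython.double:
        return x ** 2 - x

    print(f(3.0))   # works identically interpreted or compiled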
Cython also supports writing type annotations in a separate .pxd file. It's a bit awkward because you're effectively duplicating all function + variable declarations, but it _is_ a valid way to keep the original Python source intact.
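Roughly like this, with hypothetical file names; the .py stays plain Python while the matching .pxd duplicates the signatures with C types:

    # mymodule.py -- untouched, runs fine under plain CPython
    def scale(x, factor):
        return x * factor

    # mymodule.pxd -- picked up automatically when mymodule.py is
    # compiled with Cython; the declaration duplicates the signature:
    #
    #     cpdef double scale(double x, double factor)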
(On a side note, using this feature led to a particularly memorable debugging session that set a personal record in terms of time spent vs amount of code changed to fix the issue: https://news.ycombinator.com/item?id=11115110 ).
It's so fascinating to read these sorts of retrospectives; they ground me in the realization that it takes time and lots of missteps to make great software. Thanks to the author, if he passes through here.
For some reason, Firefox reader view removes all section headings and some random paragraphs. Made for a confusing read until I went back and reread it.
On the other hand, I think it is somewhat detrimental to the success of the idea of meta-tracing that the only working production-grade implementation of it is RPython. RPython is very idiosyncratic, to say the least. I think a good implementation of meta-tracing on boring technology (say, C++) would greatly help popularize the idea.
I'm just sad that compatibility with data science and scientific computing tools isn't there, AFAIK: numpy, pandas, scipy, tensorflow, scikit-learn, and pytorch, to name a few.