The First 15 Years of PyPy – A Personal Retrospective (morepypy.blogspot.com)
183 points by gok on Sept 9, 2018 | 27 comments



It's interesting to read the OP's take on partial evaluation (as employed by Truffle) and why he went with meta-tracing in RPython instead. Basically, the inlining/specialization decisions in PE-based JITs were harder to control than simply letting the VM observe loop iterations.

Here's a paper regarding a pair of language implementors' experience using both techniques:

http://stefan-marr.de/papers/oopsla-marr-ducasse-meta-tracin...

In the partial-evaluation approach, the abstract syntax tree is rewritten during execution: type-generic nodes are replaced with type-specialized ones. This carries more overhead than tracing through the interpreter, and it requires more explicit work from the language implementer to achieve optimized results.

The authors of the above study conclude that the resulting performance was similar, but that meta-tracing was an easier technique to use than PE. This agrees with the OP's assessment.
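For a concrete sense of what meta-tracing asks of the implementer, here is a rough sketch (not from the paper or the post) of the hints RPython's documented JitDriver API expects around a toy interpreter's dispatch loop; the toy language and variable names are made up:

    from rpython.rlib.jit import JitDriver

    # 'greens' identify a position in the user's program; 'reds' are the
    # remaining interpreter loop state.
    jitdriver = JitDriver(greens=['pc', 'program'], reds=['tape', 'position'])

    def interpret(program):
        tape = [0] * 1024
        position = 0
        pc = 0
        while pc < len(program):
            # The tracer watches iterations at this merge point and compiles
            # hot loops of the *user* program, not of the interpreter.
            jitdriver.jit_merge_point(pc=pc, program=program,
                                      tape=tape, position=position)
            op = program[pc]
            if op == '+':
                tape[position] += 1
            elif op == '-':
                tape[position] -= 1
            elif op == '>':
                position += 1
            elif op == '<':
                position -= 1
            pc += 1
        return tape

That is essentially the whole annotation burden; the specialization decisions that PE-based JITs expose to the implementer are instead made by the tracer as it observes loop iterations.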


Not entirely related, but it's wild to see projects that I remember the announcement of (I was working in a Python shop at the time) turning 15+ years old. I'm not sure I'm even past thinking of PyPy as kinda newfangled (that's just the bucket it got dumped into in my head: cool new Python compiler/VM thing). Probably time to re-categorize it in my brain as something more like "time-tested" or something.


I migrated some internal tools from Lua to Python because of Python's incredible ecosystem. However, I miss the speed of LuaJIT every day. I have already ported almost all of the critical code to Cython; it is on my todo list to undo that port and test with PyPy.

I feel that PyPy needs much more love and attention than it actually gets. I have already donated, and I put all my hopes on PyPy. Windows support is also very neglected; the multiprocessing module is unusable there. I hope to get some time soon to help this incredible project and finish lib_pypy/_winapi.py.


I share the same perception of a lack of love from the community for JIT compilers, not only PyPy.

I see Julia and JavaScript eventually winning over those who can't be bothered to deal with C for the extra performance step.


The Julia compiler is a little bit different in the sense that it actually compiles ahead of time (it is not a tracing JIT). Compilation takes place just before a method (with known runtime types for the arguments) is executed for the first time. So performance is predictable, but on the other hand there is a risk of over-specialization.


I know, but that is not the point.

The point is the lack of adoption of JIT runtimes across the Python community at large.


I see Go (also) occupying that niche. Maybe that's because of my (cloud technology) bubble, but 90%+ of the interesting new projects are in Go (not just web services or container runtimes, also CLI tools).


Yes, there were a couple of talks at Go UK 2018 about that.


What type of performance issues do you run into with CPython?


Not the OP, but personally I'd like to be able to use Python for all the things that people say "just don't use Python for".

Including things like modifying every pixel of an image, and other things that need to be fast-ish.


Have you tried Numba yet?
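For reference, a minimal sketch of the kind of per-pixel loop Numba targets (the function is made up; it assumes numpy and numba are installed):

    import numpy as np
    from numba import njit

    @njit  # compiled to machine code on first call, specialized on the argument types
    def invert(img):
        out = np.empty_like(img)
        for y in range(img.shape[0]):
            for x in range(img.shape[1]):
                out[y, x] = 255 - img[y, x]
        return out

    # e.g. invert(np.zeros((480, 640), dtype=np.uint8))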


> It is on my todo list to undo this port

That has always been my reason for not giving Cython a real chance. I'm afraid I would go through the trouble of porting everything, then have a use case where I need to run pure Python.

I've considered having a compat layer where you import dummy versions of all the classes and decorators if Cython isn't present. Has anyone ever tried that?


I don't know about your question, but Cython recently added [1] the option to use native Python type annotations (Python 3.6, PEP 484) instead of its previous C-like dialect. That could lead to a situation where a sensibly type-annotated Cython file can still be a valid Python file – that would help with portability.

[1] https://github.com/cython/cython/issues/1672#issuecomment-33...
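As a rough sketch of what that can look like (the function is made up; `cython` below is Cython's shadow module, which also imports under plain CPython, so the file stays runnable without compiling):

    import cython

    def clamp(x: cython.double, lo: cython.double, hi: cython.double) -> cython.double:
        # Plain CPython treats these as ordinary, ignored annotations;
        # compiled with Cython they become typed C doubles.
        if x < lo:
            return lo
        if x > hi:
            return hi
        return x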


Cython also supports writing type annotations in a separate .pxd file. It's a bit awkward because you're effectively duplicating all function + variable declarations, but it _is_ a valid way to keep the original Python source intact.

(On a side note, using this feature led to a particularly memorable debugging session that set a personal record in terms of time spent vs amount of code changed to fix the issue: https://news.ycombinator.com/item?id=11115110 ).
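Roughly, the split looks like this (module and function names are hypothetical; the .pxd is the augmenting file Cython picks up at compile time, while the .py stays plain Python):

    # mymath.py -- stays plain, debuggable Python
    def square_sum(a, b):
        return a * a + b * b

    # mymath.pxd -- declarations only, duplicating the signature
    cpdef double square_sum(double a, double b)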


There's a built-in pattern that is at least similar to what you describe:

https://cython.readthedocs.io/en/latest/src/tutorial/pure.ht...

In the standard library, they maintain both Python and C versions of some modules:

https://www.python.org/dev/peps/pep-0399/
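The stdlib convention from PEP 399 is roughly this import dance at the end of the pure-Python module (names here are hypothetical):

    # mylib.py -- pure-Python implementation, always available
    def checksum(data):
        total = 0
        for b in data:
            total = (total + b) % 65536
        return total

    # transparently swap in the C accelerator when it exists
    try:
        from _mylib import checksum
    except ImportError:
        pass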


Yeah, using Cython feels dirty because of that. I miss being able to step into the code while debugging, and I have to use workarounds for profiling.


It's so fascinating to read these sorts of retrospectives; they ground me in the fact that it takes time and lots of missteps to make great software. Thanks to the author, if he passes through here.


Slightly off topic.

What is the accepted/common runtime for production Python code?

Is it CPython, IronPython? Is the plain Python interpreter fast enough?

(I am referring to cases where Python is not just a thin wrapper around C libs.)


For some reason, Firefox reader view removes all section headings and some random paragraphs. Made for a confusing read until I went back and reread it.


How do people use PyPy? Do you target it from the start of a project, or port existing projects to it?


Bunch of dead links to Bitbucket...

But now I have another rabbit hole to dive down, in the form of meta-tracing JITs.


Has Carl always had the "-Tereick" in his name or did he get married recently?


http://cfbolz.de/contact.html just mentions that he changed it last year but doesn't give a reason.


Yes, he got married a while ago.


Yes, meta-tracing really works!

On the other hand, I think it is somewhat detrimental to the success of the idea of meta-tracing that the only working production-grade implementation of the idea is RPython. RPython is very idiosyncratic, to say the least. I think a good implementation of meta-tracing on boring technology (say, C++) would greatly help popularize the idea.


I'm just sad that compatibility with data science and scientific computing tools isn't there, AFAIK: numpy, pandas, scipy, tensorflow, scikit-learn, pytorch, to name a few.


Reread it, then: PyPy has implemented an emulation layer for the CPython C API, which allows numpy etc. to be used.



