For people who care about performance, Cython, Numba, and PyPy are a vital part of the Python ecosystem. They let you create highly performant code while retaining Python's clarity, learnability, and rapid development capabilities.
These gains don't come easily; instead, they are the result of years of thinking carefully about what the machine actually does internally and coming up with more direct paths to accomplish the same goal.
This has been an issue in Python talk selection for several years. One year the PyPy folks didn't have a single accepted talk despite heroic accomplishments that will greatly affect Python's future. And last year, scientific and numeric talks were almost non-existent despite the booming growth of the PyData community and the wide adoption of Pandas.
Part of the reason is that there has been an effort to get new people on stage regardless of experience level, and to get substantially more women on stage as well.
The overall net effect has been positive, making the community more inclusive and giving more stage time to new, fresh talent. The downside is that there is less room for other players (for example, none of the proposed talks from Continuum were accepted).
I was thinking the other day that I'm surprised Cython (and Numba?) don't use a copy of the libpython source to let them inline calls back into Python-land. Yes, it's fraught with packaging/distribution difficulties, but possibly worth it for situations where the speedup is needed.
Er. Essentially impossible, because inlining only works in situations where you know a whole bunch of things at compile time. CPython knows essentially nothing at compile time.
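To make the "knows essentially nothing at compile time" point concrete, here is a minimal sketch. The names `Point`, `scale`, and `caller` are made up for illustration. The target of the call inside `caller` is only resolved when the call actually runs, so it can change between two calls, and a static inliner would have nothing stable to paste in.

```python
class Point:
    def __init__(self, x):
        self.x = x

    def scale(self, k):
        return self.x * k


def caller(p):
    # `p.scale` is looked up on the object (and its class) at call time,
    # so nothing here pins down which function will actually run.
    return p.scale(2)


p = Point(3)
before = caller(p)  # uses the original Point.scale

# Rebind the method at runtime; `caller` itself is untouched.
Point.scale = lambda self, k: self.x + k
after = caller(p)   # now dispatches to the replacement

print(before, after)
```

The same applies to plain module-level functions, which are looked up by name in the module's globals on each call.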
It's not impossible. It's "easily" done if you have a JIT, but even without a JIT, you can inline the call on the bytecode level using similar techniques - you have to be able to rebuild the chain if someone asks for it. One can (easily) argue that the complexity is unnecessary and the speedups are unclear. It really depends how hard you try :-)
PyPy achieves its speedups mostly by:
* avoiding allocations by escape analysis
* avoiding escape through frames by removing frames
* avoiding another level of allocations by inlining calls (and avoiding frames)
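As a rough illustration of the three points above, consider the sketch below. `Vec2`, `sum_with_objects`, and `sum_by_hand` are hypothetical names, and the second function is a hand-written approximation of the effect, not PyPy's actual output. In the first loop every iteration allocates two short-lived objects and makes a method call (with its own frame). None of those temporaries is visible outside the loop, so a JIT that inlines `add` and proves the objects don't escape can, in principle, get close to the second version.

```python
class Vec2:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def add(self, other):
        # Allocates a fresh Vec2 on every call.
        return Vec2(self.x + other.x, self.y + other.y)


def sum_with_objects(n):
    acc = Vec2(0, 0)
    for i in range(n):
        # Two temporaries per iteration: the argument and the result.
        # Neither is referenced once the next iteration replaces `acc`.
        acc = acc.add(Vec2(i, 2 * i))
    return acc.x, acc.y


def sum_by_hand(n):
    # Same arithmetic with the call inlined and the objects replaced
    # by plain local variables: no allocations, no extra frames.
    x = y = 0
    for i in range(n):
        x += i
        y += 2 * i
    return x, y


print(sum_with_objects(1000) == sum_by_hand(1000))
```

On CPython the two versions compute the same result, but the first pays for every allocation and call.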
Thank you for your work.