My 2 cents as an implementer of an alternate Python implementation -- I am skeptical about this approach. We tried the "provide a limited API" approach in the past and found the following things:
- People actually use the "hard to implement" parts of the C API
- Moving things from macros to function calls can often be quite detrimental to performance
- C extensions have been co-optimized with the C API, so changing the C API will make things less optimized
- These "check if runtime debugging is enabled" checks are not free
My guess is that this will end up in a tough middle ground where extension writers will be faced with a tradeoff along the lines of making their extension 10% slower for 99% of their users in order to make things 2x faster for 1% of their users.
What is the Python implementation you are involved with?
I agree with your concerns; HPy has tried to address them from the beginning. Basically, there are two distinct compilation modes:
- CPython ABI: in this mode, things like HPy_Dup and HPy_Close are translated directly into Py_INCREF and Py_DECREF. The overhead is zero both in theory and in practice: every benchmark we have run so far confirms this.
- HPy Universal ABI: this mode introduces the indirections that make features such as the debug mode possible. Our benchmarks indicate a 5-10% overhead, which is in line with what you (and we :)) expected.
So, if you are an extension writer, you will be able to distribute both CPython-optimized and universal binaries.
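To make the difference between the two modes concrete, here is a toy sketch in C. It is not the real HPy header code: ToyObject, ToyContext, the TOY_* macros and the toy_*_context helpers are all hypothetical names made up for illustration. The "direct" macros stand in for the CPython ABI mode, where a handle operation collapses into a plain refcount update, and the function-pointer table stands in for the universal mode, where the extra indirection is what lets a debug context be swapped in at runtime.

```c
#include <assert.h>
#include <stddef.h>

/* Toy object with a reference count; a stand-in for a real Python object. */
typedef struct {
    long refcnt;
} ToyObject;

/* "Direct" mode (assumption: analogous to the CPython ABI mode).
 * The handle operations are macros that expand to refcount updates,
 * so no function call is left after compilation. */
#define TOY_DUP_DIRECT(o)   ((o)->refcnt++, (o))
#define TOY_CLOSE_DIRECT(o) ((void)((o)->refcnt--))

/* "Universal" mode (assumption: analogous to the universal ABI mode).
 * Every operation goes through a table of function pointers, so the
 * same compiled extension can run against different contexts. */
typedef struct ToyContext {
    ToyObject *(*dup)(struct ToyContext *ctx, ToyObject *o);
    void (*close)(struct ToyContext *ctx, ToyObject *o);
    long open_handles; /* only maintained by the debug context */
} ToyContext;

static ToyObject *plain_dup(ToyContext *ctx, ToyObject *o)
{
    (void)ctx;
    o->refcnt++;
    return o;
}

static void plain_close(ToyContext *ctx, ToyObject *o)
{
    (void)ctx;
    assert(o->refcnt > 0);
    o->refcnt--;
}

/* Debug variants also count open handles, so leaks can be detected. */
static ToyObject *debug_dup(ToyContext *ctx, ToyObject *o)
{
    ctx->open_handles++;
    return plain_dup(ctx, o);
}

static void debug_close(ToyContext *ctx, ToyObject *o)
{
    ctx->open_handles--;
    plain_close(ctx, o);
}

ToyContext toy_plain_context(void)
{
    ToyContext ctx = { plain_dup, plain_close, 0 };
    return ctx;
}

ToyContext toy_debug_context(void)
{
    ToyContext ctx = { debug_dup, debug_close, 0 };
    return ctx;
}
```

Code written against the context, e.g. `ctx.dup(&ctx, obj)` followed by `ctx.close(&ctx, obj)`, behaves the same with either context; with the debug one, a nonzero `open_handles` at the end points to a leaked handle. The cost is an indirect call per operation, which is where an overhead like the one quoted above would come from in a design of this shape.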