I think there was some misunderstanding; you're arguing different points than the ones I made.
> Of course this approach produces a worse code than a full compiler by definition---stencils would be too rigid to be further optimized.
Yeah, but that's not what I meant by "worse code". I just meant that, even knowing this is a naive copy-and-patch JIT, my first impression was that the code was slightly worse than I expected. I don't expect the compiler to do any magic on a small code slice; I only claimed that there's "room to improve" in the currently generated code, though I may be totally wrong about whether that's actually achievable by "just convincing clang", without manually messing with the asm.
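(For readers unfamiliar with the technique: a "stencil" is a pre-compiled machine-code template with holes, and JIT compilation is just copying the template and patching the holes. Here's a toy model in Python of that mechanism; the names and byte values are made up for illustration, and a real JIT patches object code and relocations, not a bytes literal.)

```python
# Toy model of copy-and-patch. The "stencil compiler" (clang, ahead of
# time, in the real thing) leaves a recognizable placeholder where a
# runtime value will go; the JIT copies the stencil and fills the hole.
# No optimizer ever looks across the patched pieces again, which is
# exactly why the resulting code is rigid.

HOLE = b"\xde\xad\xbe\xef"  # placeholder left in the template

def make_stencil(body: bytes) -> bytes:
    # In reality: object code emitted ahead of time by the C compiler.
    return body

def copy_and_patch(stencil: bytes, value: bytes) -> bytes:
    assert len(value) == len(HOLE)
    return stencil.replace(HOLE, value)  # copy, then patch the hole

# x86 "mov eax, <imm32>" is opcode 0xB8 followed by a 4-byte immediate.
stencil = make_stencil(b"\xb8" + HOLE)
code = copy_and_patch(stencil, b"\x2a\x00\x00\x00")  # patch in 42

assert HOLE not in code
assert code == b"\xb8\x2a\x00\x00\x00"
```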
> But I think you are thinking too much about a possibility to make Python as fast as, say, C for at least some cases.
I never said this about CPython, quite the opposite.
> I believe that it won't happen at all
(FWIW, if we're talking long-term and about Python in general, it already did happen: PyPy (and modern JS runtimes) are good examples of this being possible in principle. But being able to make a language orders of magnitude faster (with some major asterisks) doesn't mean I expect the same from the CPython implementation.)
As for your example with integer adding, I totally agree with all you said, and that's exactly what I meant by "there’s only so much one can do without touching the data model".
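(For anyone following along, the data-model cost behind the integer-adding example is easy to see from Python itself: every CPython int is a heap-allocated object, so even `a + b` involves dereferences, refcounting, and allocating an object for the result. A quick illustration; the exact byte count varies by build:)

```python
import sys

# In CPython every int is a full heap object (a PyLongObject), not a
# bare machine word: it carries a refcount, a type pointer, and a size
# field on top of the digits themselves.
small = 1
print(sys.getsizeof(small))  # typically ~28 bytes on a 64-bit build

# Adding two ints must produce a *new* object (or fetch one from the
# small-int cache), so a JIT can't simply emit a single ADD instruction
# without changing this object model.
result = small + 1
assert sys.getsizeof(result) > 8  # far more than one machine word
```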
> In the best scenario, it will eventually hit the limit of what's possible with copy-and-patch and a full compiler will be required at that point. But until that point (which may never come as well), this approach allows for a long time of incremental improvements without disruption.
That's why in my initial message I said I wonder about the expected peak improvement. I won't be surprised if it (together with the theorized uop optimizations) barely exceeds single-digit percent perf gains, which would of course still be totally worth it. And if it's more, well, even better :) And in the worst case - which I hope won't happen - the point you mentioned is today, and copy-and-patch would never be worth enabling by itself.
> I just meant that, even knowing this is a naive copy-and-patch JIT, my first impression was that the code was slightly worse than I expected.
> "there’s only so much one can do without touching the data model"
You probably want to look at the other link in that PR, which demonstrated how well copy-and-patch can do for another dynamic language (Lua): [1]
Of course, whether or not CPython could eventually make it to that point (or even further) is a different story: they are under way tighter constraints than just developing something for academia. But copy-and-patch can do a lot even for dynamic languages :)
> That's why in my initial message I said I wonder about the expected peak improvement. I won't be surprised if it (together with the theorized uop optimizations) barely exceeds single-digit percent perf gains, which would of course still be totally worth it. And if it's more, well, even better :) And in the worst case - which I hope won't happen - the point you mentioned is today, and copy-and-patch would never be worth enabling by itself.
Ah, so you meant that even all of them, including the specializing interpreter and the copy-and-patch JIT, may not give a reasonable speedup. But I think you have missed the fact that the specializing interpreter already landed in 3.11 and provided a 10--60% speedup. So specialization really works, and the copy-and-patch JIT should allow finer-grained uops, which can have an enormous impact on performance.
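(As a toy model of what the specializing interpreter does: a generic operation caches the operand types it last saw and takes a guarded fast path while that guess keeps holding, deoptimizing and respecializing when it doesn't. The class and names below are invented for illustration; the real mechanism rewrites bytecode and inline caches in place, per PEP 659, not Python objects.)

```python
# Toy sketch of type specialization with an inline cache.
class BinaryAdd:
    def __init__(self):
        self.cached_type = None  # the "inline cache"

    def __call__(self, a, b):
        # Fast path: guard that the cached specialization still applies.
        if self.cached_type is int and type(a) is int and type(b) is int:
            return a + b  # stands in for a direct machine add
        # Slow path: do the generic operation, then (re)specialize so
        # the next call with the same types can take the fast path.
        result = a + b
        if type(a) is type(b):
            self.cached_type = type(a)
        return result

add = BinaryAdd()
add(1, 2)                # slow path; specializes for int
assert add.cached_type is int
assert add(3, 4) == 7    # now takes the guarded fast path
```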
On the other hand, it is possible that the copy-and-patch JIT itself turns out to be useless even after all the work. In that case there is no other known viable way to enable a JIT without disruption, so a JIT shouldn't be added to CPython. I should have stressed this point more, but "incremental" improvements are really important---they were a primary reason that CPython didn't even try to implement JIT compilation for decades. CPython could give them up, but then there would be one less reason to use (C)Python, so CPython never did. (The GIL is the same story, by the way: the current nogil effort is not possible without other performance improvements that outweigh the potential overhead in the single-threaded setting.)
> As for your example with integer adding, I totally agree with all you said, and that's exactly what I meant by "there’s only so much one can do without touching the data model".
If the data model refers to the publicly visible portion of the interface, I don't think so. Even JS runtimes didn't require any changes to the public interface, and CPython itself already caches a lot of the data model for the sake of performance. I'm not aware of attempts at shape optimizations, but it might be possible to extend the current `__slots__` implementation to allow an adaptive memory layout.
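(A minimal sketch of the `__slots__` point: slotted classes already give attributes a fixed, offset-based layout instead of a per-instance dict, which is the same kind of layout invariant that shape/hidden-class optimizations in JS engines rely on. The class names here are just for illustration:)

```python
# With __slots__, attribute storage becomes a fixed set of descriptors
# at known offsets on the instance; there is no per-instance __dict__.
class Point:
    __slots__ = ("x", "y")
    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(1, 2)
assert not hasattr(p, "__dict__")  # no dict, fixed layout

# A plain class pays for a dict and allows dynamic attribute creation,
# which is what makes the layout unpredictable for a compiler:
class DynPoint:
    def __init__(self, x, y):
        self.x = x
        self.y = y

q = DynPoint(1, 2)
assert hasattr(q, "__dict__")
q.z = 3  # possible here; raises AttributeError on the slotted Point
```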
> Ah, so you meant that even all of them, including the specializing interpreter and the copy-and-patch JIT, may not give a reasonable speedup. But I think you have missed the fact that the specializing interpreter already landed in 3.11 and provided a 10--60% speedup
No, I'm talking about a comparison against the current default production build. Exactly what Brandt said in his talk at around 23:30, and what I observed when building his branch.
Then I'm not sure why that would refute the intermediate goal to "enable JIT codegen without sacrificing too much performance" stated in my initial comment, since the proposed copy-and-patch JIT compiler won't make a big impact by itself.