As I do with every PyPy release, I would like to point out that the PyPy official benchmarks for comparison against CPython[0] continue to misleadingly compare their latest and greatest with CPython 2.7.2 (released in 2011), as opposed to the modern CPython 2.7.13 or 3.5.3 versions for which they target API compatibility.
We should indeed compare PyPy3.5 vs CPython 3.5.3, but having a benchmark suite that works on both continues to be a problem.
Regarding 2.7.13 - you might find it surprising, but it's actually SLOWER than 2.7.2; there have been no speed improvements and quite a few speed decreases, so we decided to keep the faster one.
EDIT: part of the problem is that comparing PyPy 2 vs CPython 3 is apples vs oranges, but PyPy3 is not ready yet (the unicode improvements I'm working on right now are missing).
> Regarding 2.7.13 - you might find it surprising, but it's actually SLOWER than 2.7.2; there have been no speed improvements and quite a few speed decreases, so we decided to keep the faster one.
I don't find it that surprising, but do find it disappointing that you would run benchmarks against the current version, but not post them online for perusal, nor provide any sort of explanation for the use of the older version in head-to-head comparisons. For me, at least, it produces the impression that PyPy has something to hide, and I doubt I'm the only one.
This is the usual case of budget - if I had budget to have anyone improve the website, improve the buildbot, improve the benchmark comparison, trust me I would do it. Right now there are no volunteers and the benchmark side is sort of lingering on.
I remember a FOSDEM talk where someone from PyPy said they were working with a very low budget; I bet they have no time to spend on fixing a very superficial bad impression. You don't come to PyPy just to try it, you come to PyPy to boost the performance of some existing project.
It seems like the benchmarks should be reasonably reproducible, right? (In the sense that if they're not, then that's a bigger complaint about their validity.)
Can you or I just run the benchmarks against 2.7.13 to see how it goes?
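For a quick sanity check, something like the sketch below could be run once under each interpreter (say, CPython 2.7.2, CPython 2.7.13 and PyPy) and the printed numbers compared. To be clear, `workload`, `measure` and `describe_interpreter` are made-up names, and the loop is a toy stand-in, not one of the benchmarks behind speed.pypy.org; it only shows the shape of a same-script, different-interpreter comparison.

```python
import platform
import timeit


def workload(n=20000):
    """Toy CPU-bound loop (hypothetical stand-in, not a speed.pypy.org benchmark)."""
    total = 0
    for i in range(n):
        total += i * i % 7
    return total


def measure(func, repeat=5, number=10):
    """Return the best per-call time in seconds over `repeat` runs of `number` calls."""
    timings = timeit.repeat(func, repeat=repeat, number=number)
    # The minimum is the run least disturbed by other activity on the machine.
    return min(timings) / number


def describe_interpreter():
    """Name the implementation and the Python language version it reports."""
    return "%s %s" % (platform.python_implementation(), platform.python_version())


if __name__ == "__main__":
    print("%s: %.6f s per call" % (describe_interpreter(), measure(workload)))
```

A single short loop like this says little about real programs, and a JIT such as PyPy's needs warm-up time, so the actual benchmark suite remains the thing to run for any serious comparison.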
[0] http://speed.pypy.org