There's nothing to say that all of CPython, Python 3, and PyPy must be measured.
For the moment, all three are being measured, so it's interesting to see that programs written for CPython might perform badly with PyPy, and programs written for PyPy might perform badly with CPython.
I think that, in general, programs written in "Python" will perform better in PyPy than in CPython, but the current submissions are hyper-optimised for the implementation details of CPython.
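To make "optimised for implementation details" concrete, here is a rough sketch rather than anything taken from the actual submissions. The two function names are made up for illustration. One version is a plain loop; the other pushes the loop into C-implemented builtins, which is a typical CPython-oriented trick. Which one wins depends on the interpreter and version you run it on, so no timings are claimed here.

```python
import operator
import platform
import timeit


def sum_squares_plain(n):
    # Hypothetical example: a straightforward loop in ordinary Python.
    total = 0
    for i in range(n):
        total += i * i
    return total


def sum_squares_tuned(n):
    # Hypothetical example: the same computation with the loop pushed into
    # builtins (sum, map, operator.mul), a style often used to avoid
    # bytecode-level loop overhead.
    return sum(map(operator.mul, range(n), range(n)))


if __name__ == "__main__":
    n = 10_000
    # Report which implementation is running (e.g. "CPython" or "PyPy").
    print(platform.python_implementation(), platform.python_version())
    for fn in (sum_squares_plain, sum_squares_tuned):
        seconds = timeit.timeit(lambda: fn(n), number=20)
        print(f"{fn.__name__}: {seconds:.4f}s")
```

Running the same file under each interpreter gives a like-for-like comparison of the two styles on that machine only; it says nothing about the benchmark programs themselves.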