Not to mention the usual problems with Phoronix benchmarks: it doesn't say how m...

vinkelhake · on Aug 19, 2022

Every result graph there has the error indicated. If there is any significant error, then error bars are shown. You can see this on the very first result page for LeelaChessZero.

It also shows you the number of runs. It also shows you the compile options used. All this info is included in every graph.

The complete system setups are described. The test suite is also open source.

https://github.com/phoronix-test-suite/phoronix-test-suite

paulmd · on Aug 19, 2022

I remember a recent "gaming on linux" article from them where they were computing "summary" geomeans including benchmarks across different resolutions... from the same game. So you might have:

* SOTTR 1080p

* SOTTR 1440p

* SOTTR 4K

* F1 1080p

...

And this wasn't like they had a 1080p geomean and then a 1440p geomean and a 4K geomean... they just had one geomean with a bunch of different resolutions thrown into it, including duplicates of the same game at different resolutions. And sometimes different combinations of resolutions for different games (they might skip 4K for a particular game, etc).
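To make the difference concrete, here's a rough sketch of pooled vs. per-resolution geomeans. The FPS numbers are made up purely for illustration (they are not from any Phoronix article), and the names `results`, `geomean`, `pooled`, `per_resolution` and `weights` are mine, not anything from the Phoronix Test Suite.

```python
from collections import defaultdict
from math import prod

# Hypothetical FPS figures, invented for illustration only.
results = {
    ("SOTTR", "1080p"): 120.0,
    ("SOTTR", "1440p"): 90.0,
    ("SOTTR", "4K"): 50.0,
    ("F1", "1080p"): 200.0,
}


def geomean(values):
    """Geometric mean: the n-th root of the product of n positive values."""
    values = list(values)
    return prod(values) ** (1.0 / len(values))


# One pooled geomean over every (game, resolution) entry.
pooled = geomean(results.values())

# Separate geomeans, one per resolution.
fps_by_resolution = defaultdict(list)
for (game, resolution), fps in results.items():
    fps_by_resolution[resolution].append(fps)
per_resolution = {res: geomean(v) for res, v in fps_by_resolution.items()}

# In the pooled geomean every entry gets the same exponent (1/n), so a game
# tested at three resolutions carries three times the weight of a game
# tested at one.
entries_per_game = defaultdict(int)
for game, _resolution in results:
    entries_per_game[game] += 1
weights = {game: n / len(results) for game, n in entries_per_game.items()}

print("pooled geomean:", round(pooled, 2))
print("per-resolution geomeans:", per_resolution)
print("implicit per-game weight in pooled geomean:", weights)
```

With this made-up data SOTTR ends up with 75% of the weight in the pooled number and F1 with 25%, simply because SOTTR was run at more resolutions. Whether that weighting is acceptable is the actual disagreement; the sketch only shows that it happens.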

That's pleb-tier benchmarking, pick a random redditor and they know not to make that kind of mistake, it's obviously and facially incorrect.

It just goes to show the power of community goodwill... UserBenchmark's actual sub-scores are reasonably accurate, but because the owner is a massive fucking twat they're persona-non-grata in the internet community (I'm sure I'm going to be regaled with NO THEIR BENCHMARKS ARE TRASH AND HE CHANGES THINGS TO MAKE INTEL but nope, the subscores are accurate, topline "summary" score weights are what he fucks with). Michael Larabel is a very nice guy and frankly doesn't seem to understand the first thing about benchmarking, or score weighting, or mathematics, and constantly puts out trash-tier results with obvious defects, and he's revered in the community, basically a saint.

I know, nobody else is really benchmarking Linux and he's what we've got, if you don't like it then be the change, etc. But, his results are given incredibly disproportionate weight to the quality there, he's no anandtech. And sadly anandtech is no anandtech anymore.

sudosysgen · on Aug 20, 2022

It's not...

Games are benchmarked at different resolutions for the same game because that shifts the CPU/GPU burden. It's a great thing to do and many benchmarks do it too, and yes in the same average.

If you don't include a game at those resolutions for no good reason that's one thing, but varying the resolution in the same mean is a good idea.