Except for the "most optimized C++ code", you have the sources of all the benchmarks...

scott_s · on June 3, 2011

I agree that it probably won't make a difference. But good experiments remove as many of those probablys as possible.

acqq · on June 3, 2011

The parent commenter treats "improved branch prediction, cache prefetching, better pipelining, and different cache sizes" as "mumbo-jumbo that can mean something different." I'm in the business, so I can tell you that most of these improvements just give you some "overall speedup", so that you can happily buy today's processor running at 3 GHz and be glad it's faster than an almost decade-old P4 running at the same 3 GHz. Add to that that you now have a multi-core CPU, and that you have to "clear" the paths to the cores to keep them from slowing one another down, and also to compensate for the higher latency of more modern RAM technologies, which trade higher latency for the ability to feed more cores.

Then measure algorithms that run on one core anyway, on both the P4 and the latest Core iX. Your slow languages won't become faster than your fast ones just because the quoted changes were introduced to processors in between.

scott_s · on June 3, 2011

Please note that I did not disagree with your conclusion - I agree that it probably won't make a difference. If it makes you feel better, I'll say it's a very high value of probably. But I'm in the business of performing systems experiments. Removing as many variables as possible is just good experimental design. If you want to know what the performance will be like on modern machines, then it's best to run on modern machines.

onan_barbarian · on June 3, 2011

Dear lord, thank you for this bit of common sense.

It's actually quite hard to know what a given piece of code will do on a given microarchitecture, even if on average the new one runs everything X% faster. You may find you're the bit of code that bites the big one and runs X% slower on the new microarchitecture (e.g. you were depending on branch mispredicts being cheaper than they are). Or suddenly your code runs way faster than competing codes (e.g. you're the superstar running 2X% faster because a sudden increase in ILP exposes that you've got a main loop full of independent operations).
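
To make the ILP point concrete, here is a minimal sketch in C++ of the two kinds of main loop. The function names `sum_chained` and `sum_independent` are hypothetical, made up for this illustration and not taken from any of the benchmarks. The first loop is one long dependency chain; the second keeps four independent accumulators, which a wider out-of-order core could, in principle, execute in parallel. How much that helps (if at all) depends on the microarchitecture and on what the compiler does with the loop, so treat it as an illustration, not a measurement.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical example: a single accumulator. Each add needs the result of
// the previous add, so the loop forms one serial dependency chain.
std::uint64_t sum_chained(const std::vector<std::uint32_t>& v) {
    std::uint64_t acc = 0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        acc += v[i];
    }
    return acc;
}

// Hypothetical example: four accumulators. The four adds in one iteration
// do not depend on each other, so hardware with more ILP has independent
// work available to overlap.
std::uint64_t sum_independent(const std::vector<std::uint32_t>& v) {
    std::uint64_t a0 = 0, a1 = 0, a2 = 0, a3 = 0;
    const std::size_t n = v.size();
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        a0 += v[i];
        a1 += v[i + 1];
        a2 += v[i + 2];
        a3 += v[i + 3];
    }
    // Handle the leftover elements when n is not a multiple of 4.
    for (; i < n; ++i) {
        a0 += v[i];
    }
    return a0 + a1 + a2 + a3;
}
```

Both functions return the same sum, since unsigned integer addition can be reordered freely. Only the shape of the dependency graph differs, and that is exactly the kind of thing whose cost can change from one microarchitecture to the next. An optimizing compiler may also vectorize or unroll either loop itself, which is one more variable to pin down when timing them.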

acqq · on June 3, 2011

Thought experiment: you are Intel, and your new processor somehow makes writing highly optimized C code unnecessary, because the speedup of some higher-level languages (say, Java or Scala) is bigger than the speedup of the highly optimized C code (in which the runtimes of Java and Scala are implemented). Wouldn't you make the biggest announcement in computing history? "With this new CPU, your more sloppily written code is faster than highly optimized code. Every programming assumption that was valid up to now no longer holds!"