You can't show that "temporary arrays, choice of language, and parallelism" or "pointer indirection, cache efficiency, and SIMD vectorization" don't matter unless you compare an implementation that handles those things against one that doesn't.
BLAS libraries all land within range of each other because they all lay out memory in basically the same way, do indirection in basically the same way, handle the cache in basically the same way, and use SIMD in basically the same way.
As soon as you step out of those prebuilt blocks and build your own function over arrays of numbers, you're going to lose far more than a factor of 3 until you understand how to handle these issues.
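To make the gap concrete, here is a minimal sketch (assuming NumPy as the BLAS-backed baseline): a hand-rolled triple-loop matrix multiply computes the same result as the library call, but does its own indexing in interpreted code and gets none of the cache blocking or SIMD that the BLAS routine behind `a @ b` provides.

```python
import numpy as np

def naive_matmul(a, b):
    """Textbook triple-loop matrix multiply: correct, but no cache
    blocking, no SIMD, and per-element indexing overhead."""
    n, k = a.shape
    k2, m = b.shape
    assert k == k2
    out = np.zeros((n, m))
    for i in range(n):
        for j in range(m):
            s = 0.0
            for p in range(k):
                s += a[i, p] * b[p, j]
            out[i, j] = s
    return out

rng = np.random.default_rng(0)
a = rng.random((64, 64))
b = rng.random((64, 64))

# Same answer; the BLAS-backed a @ b is typically far faster.
assert np.allclose(naive_matmul(a, b), a @ b)
```

Timing the two (e.g. with `timeit`) on any realistic size shows the kind of multi-order-of-magnitude gap being discussed, well beyond a factor of 3.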
>You can't show that "temporary arrays, choice of language, and parallelism" or "pointer indirection, cache efficiency, and SIMD vectorization" don't matter unless you compare an implementation that handles those things against one that doesn't.
I think he said the exact opposite. He's basically saying that algorithmic time complexity matters (and therefore identifying your problem class matters, since it may buy you a more efficient algorithm), because you can only get so far with the generic algorithm, however good your implementation is.