the most common optimization level for release builds and the one used by Linux distributions is -O2, not -O3. this is justified by real-world measurements, btw; alas, I don't have the article link at hand on my phone. the key takeaway from that article was: measure, and ideally profile, before going beyond -O2.
and to see the size difference, I'd love to see -Os (optimize for size) included in the comparison against -O2/-O3, which unroll loops and inline static functions as they see fit, well beyond the inline keyword (which is a mere hint).
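for anyone curious, a quick way to eyeball this yourself (file and function names made up for illustration): compile something like the following at -Os/-O2/-O3 and compare the object sizes with size(1):

    /* size_demo.c */
    #include <stddef.h>

    /* static helper: -O2/-O3 may inline this even without the
       inline keyword, which is only a hint anyway */
    static int scale(int v) { return v * 3 + 1; }

    int sum_scaled(const int *a, size_t n)
    {
        int s = 0;
        /* a loop that -O3 may unroll and/or vectorize,
           growing the generated code */
        for (size_t i = 0; i < n; i++)
            s += scale(a[i]);
        return s;
    }

    /* e.g.:  cc -Os -c size_demo.c && size size_demo.o
              cc -O2 -c size_demo.c && size size_demo.o
              cc -O3 -c size_demo.c && size size_demo.o */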
another paradoxical effect of growing the generated code with aggressive optimizations is that you may outgrow the instruction caches: if you're unlucky, inner loops start fetching from slow DDR RAM and execution speed decreases.
I'd suggest reading the article with an extra grain of salt.
> [...] Linux distributions is -O2, not -O3. this is justified by real-world measurements [...]
No. It's because Linus fears -O3 for no reason. He even ordered the removal[1] of the -O3 Kconfig flag[0] (CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3) because "-O3 has a *loong* history of generating worse code than -O2"[1], whatever that means.
From [2]:
> Other upstream kernel developers also criticized that higher optimization level over the default -O2 level due to the risks, particularly with older compilers and memories from times when -O3 tended to be more buggy.
In other words: because of bugs in previous compiler versions that have since been fixed, we won't use the feature.
Linus and Co. are burying their heads in the sand regarding -O3. He says he needs evidence that -O3 is good, but doesn't actually provide evidence beyond hearsay that it's bad.
Just because Linus says -O3 is bad doesn't mean he's right.
There is a lot of cargo culting in the -O2 decision.
But it is also true that -O3 enables a lot of loop optimizations that are not particularly relevant for the kernel. The kernel also relies far less on aggressive inlining and interprocedural optimizations than, say, highly abstracted C++ code.
debugging ring-0 code obfuscated by -O3 is another level of fun. ymmv, however the kernel guys are finding plenty of obscure bugs as it is.
and they have been bitten by aggressive "smart" optimizations based on undefined behavior.
for example, testing a pointer for != NULL after dereferencing it makes no sense. if it was NULL, the dereference already segfaulted and the check is never reached, right?
    foo *x = f();
    x->y();               /* after this dereference the compiler may assume x != NULL */
    if (x == NULL) {
        /* unreachable in user space, so the optimizer is allowed to delete
           this whole branch; in the kernel, where address 0 can be mapped,
           that is exactly what you don't want */
        do_sth_about_it();
        ...
    }
I view the null-check stuff as more of an example of compiler authors, and even kernel devs, not bothering to properly express their intent: in this case the standard has wording covering the check, so if you want it to remain in the binary you should express that very explicitly (volatile or similar, although there are limits to what you can ergonomically express in C).
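a minimal sketch of that idea, reusing the hypothetical names from the snippet above; whether a given compiler actually keeps the check is worth verifying in the generated assembly (and iirc the kernel sidesteps this whole class of problem with -fno-delete-null-pointer-checks anyway):

    #include <stddef.h>

    struct foo { void (*y)(void); };

    extern struct foo *f(void);          /* hypothetical producer */
    extern void do_sth_about_it(void);

    void demo(void)
    {
        struct foo *x = f();
        x->y();                          /* compiler may now assume x != NULL */

        struct foo *volatile xv = x;     /* volatile read: its value may not be assumed */
        if (xv == NULL)                  /* so the comparison has to be emitted */
            do_sth_about_it();
    }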
Similarly, strict aliasing can change the behaviour of code, but if you're genuinely relying on that, your code is probably bad - either in the standard's view, or in my view, in that you can write the same code in a manner that won't cause any mischief (i.e. there are standard-friendly ways to do ugly pointer crap, even if they mean memcpying the bytes - which the optimizer will then eliminate anyway).
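a minimal sketch of the standard-friendly version next to the aliasing-violating one (assuming float and uint32_t have the same size on the target):

    #include <stdint.h>
    #include <string.h>

    /* aliasing violation: reinterpreting a float's bytes through a
       uint32_t* is undefined behaviour, and the optimizer may misbehave */
    uint32_t bits_ub(float f)
    {
        return *(uint32_t *)&f;
    }

    /* standard-friendly: memcpy the bytes instead; compilers routinely
       eliminate the copy, so there is no runtime cost */
    uint32_t bits_ok(float f)
    {
        uint32_t u;
        memcpy(&u, &f, sizeof u);
        return u;
    }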