
the most common optimization level for release builds, and the one used by Linux distributions, is -O2, not -O3. this is justified by real-world measurements, btw; alas, I don't have the article link at hand on my smartphone. the key lesson from that article was: measure, and ideally profile, before going beyond -O2.

and to see the size difference, I'd love to see -Os (optimize for size) compared against -O2/-O3, which unroll loops and inline static functions as the compiler sees fit, beyond the inline keyword (which is a mere hint).
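a minimal sketch of how one could run that comparison. the file and function names (`scale`, `sum_scaled`) are made up for illustration; the helper is small enough that the compiler may inline it without any `inline` keyword, and the loop is a candidate for unrolling or vectorization at higher levels.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: deliberately not marked inline. The compiler is
   free to inline it anyway when it judges that profitable. */
static uint32_t scale(uint32_t v)
{
    return v * 3u + 1u;
}

/* Hypothetical hot loop: a candidate for unrolling or vectorization
   at higher optimization levels. */
uint32_t sum_scaled(const uint32_t *data, size_t n)
{
    uint32_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += scale(data[i]);
    return total;
}
```

compiling this with `gcc -c` under `-Os`, `-O2` and `-O3` and running `size` on each object file shows the text-section size for each level. what the numbers look like depends on the compiler version and target, so measure on your own setup.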

another paradoxical effect of the larger code that aggressive optimization generates is that you may outgrow the caches: if you're unlucky, inner loops have to fetch from slow DRAM and execution speed decreases.

I'd suggest reading the article with an extra grain of salt.




> [...] Linux distributions is -O2, not -O3. this is justified by real world measurements [...]

No. It's because Linus fears -O3 for no reason. He even ordered the removal[1] of the -O3 Kconfig flag[0] (CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3) because "-O3 has a *loong* history of generating worse code than -O2"[1], whatever that means.

From [2]:

> Other upstream kernel developers also criticized that higher optimization level over the default -O2 level due to the risks, particularly with older compilers and memories from times when -O3 tended to be more buggy.

In other words, because of bugs in previous versions that have been fixed, we won't use the feature.

Linus and Co. are sticking their heads in the sand regarding -O3. He says he needs evidence that -O3 is good, but doesn't actually provide evidence beyond hearsay that it's bad.

Just because Linus says -O3 is bad doesn't mean he's right.

[0]: https://www.phoronix.com/news/O3-Optimize-Kernel-2022-Patche...

[1]: https://www.phoronix.com/news/Linus-Against-O3-Kernel

[2]: https://www.phoronix.com/news/Linux-6.0-Drops-O3-Kconfig


There is a lot of cargo culting in the -O2 decision.

But it is also true that -O3 enables a lot of loop optimizations that are not particularly relevant for the kernel. Also, the kernel is less reliant on more aggressive inlining and interprocedural optimizations than, say, highly abstracted C++ code.


For sure, -O3 isn't really necessary in the kernel, but saying "it's so bad that we shouldn't even have a Kconfig flag for it" is a bit extreme.


debugging ring0 code obfuscated by -O3 is another level of fun. ymmv, however the kernel guys are finding plenty of obscure bugs.

and they have been bitten by aggressive smart optimizations based off undefined behavior.

for example, testing a pointer for NULL after dereferencing it makes no sense. if it was NULL, it was a segfault and the check is never reached, right?

  struct foo *x = f();
  x->y();  /* dereference: the compiler may now assume x != NULL */
  if (x == NULL) {
    /* unreachable in user space! */
    do_sth_about_it();
    ...
  }

so the check can go. except when it can't:

https://news.ycombinator.com/item?id=33770589
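a hedged sketch of the safe ordering, for contrast with the snippet above: test the pointer before the first dereference, so the compiler has no earlier access from which to infer that it is non-null. `struct foo` and `read_value` are hypothetical names, not kernel code.

```c
#include <stddef.h>

/* Hypothetical structure, for illustration only. */
struct foo {
    int value;
};

/* Returns 0 on success, -1 if x is NULL. Because the check comes before
   any dereference of x, the compiler has no basis for deleting it. */
int read_value(const struct foo *x, int *out)
{
    if (x == NULL)
        return -1;
    *out = x->value;
    return 0;
}
```

GCC also has `-fno-delete-null-pointer-checks`, which tells the optimizer not to assume that a dereferenced pointer is non-null; the kernel build uses it.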

here is another recent discussion on -O2 vs -O3 here on HN.

https://news.ycombinator.com/item?id=28895896

seriously, by default, -O2 is for production code. use -O3 only after measuring it makes _your_ code faster and introduces no obscure bugs.


I view the null check stuff as more of an example of the compilers/even kernel devs not bothering to properly express their intent, i.e. in this case the standard has wording covering the check, so if it is meant to remain in the binary, that should be stated very explicitly (volatile or similar, although there are limits to what you can ergonomically express in C).

Similarly, strict aliasing can change the behaviour of code, but if you're genuinely relying on that, your code is probably bad: either in the standard's view, or in my view, in that you can write the same code in a manner that won't cause any mischief (i.e. there are standard-friendly ways to do ugly pointer crap, even if they mean memcpying pointers, which will then be eliminated by the optimizer).
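A minimal sketch of the memcpy approach, assuming a 32-bit IEEE 754 `float` (true on mainstream targets, but an assumption here). `float_bits` is a hypothetical name. Casting `&f` to `uint32_t *` and reading through it would violate strict aliasing; copying the bytes does not.

```c
#include <stdint.h>
#include <string.h>

/* Returns the raw bit pattern of f without violating strict aliasing. */
uint32_t float_bits(float f)
{
    _Static_assert(sizeof(uint32_t) == sizeof(float),
                   "this sketch assumes a 32-bit float");
    uint32_t bits;
    /* Copy the object representation instead of reading f through an
       incompatible pointer type. */
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

With optimization enabled, compilers typically turn the fixed-size `memcpy` into a plain register move, so the well-defined version usually costs nothing at runtime.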


> there are limits to what you can _ergonomically_ express in C...

indeed.

and in the context of this thread all I'm saying is that "-Oxxx is best, -Oyyy is outdated" is an oversimplification.


If it led to bugs in the past, then the ones who need to defend their opinion are the -O3 defenders.

Look, the guy usually comes around when shown facts. He's making sure it's a good decision.



