A regression when the LLVM backend was updated from 8 to 9. So arguably a regression...

jedisct1 · on March 29, 2021

Zig does a really good job at finding bugs in LLVM every time the LLVM version is updated.

And Zig developers also do a very good job at reporting and fixing them. Which is why contributing to Zig also means contributing to LLVM.

rob74 · on March 29, 2021

Language developers finding bugs in LLVM is ok, but if there are already several instances of regressions in LLVM found by developers of (maybe "exotic") languages, then the reaction should be less "hey, well done, you found a bug" and more "how can we avoid such regressions in the future?"...

MaxBarraclough · on March 29, 2021

I imagine Clang and GCC have growing test-suites to defend against regressions. I wonder if they share them.

wyldfire · on March 30, 2021

They both do, yes.

LLVM's test suite [1] includes (some of?) GCC's; the reverse may also be true.

LLVM also has extensive unit tests that can exercise codegen and backend functionality.

[1] https://github.com/llvm/llvm-test-suite

sesuximo · on March 30, 2021

Clang writes a subset of LLVM IR, and other types of IR are supported but very poorly optimized/tested/etc. The LLVM docs say you should generate what Clang generates to avoid regressions.

PoignardAzur · on March 29, 2021

Separate the backend into functional subcomponents and fuzz them as much as possible?

It's what cranelift does.

wyldfire · on March 30, 2021

Fuzzing is great for finding crashes and other catastrophic misbehavior. But suboptimal codegen like this would be difficult to reveal with fuzzing.

You could do it with a reference compiler (this has in fact been done before) but finding suboptimal codegen like this case would still be kinda tricky.

tom_mellior · on March 30, 2021

Based on other comments, this seems to be a regression in LLVM. LLVM does have a reference compiler for regressions: it's the previous version of LLVM. (Not saying that that makes this trivial.)

wyldfire · on March 30, 2021

But if this next release makes codegen improvements, those would all show up as differences. Separating the improvements from the regressions is difficult enough. Fuzzing doesn't really help here; it makes it much harder to determine what code should have been generated.

Generally, codegen issues are exposed by benchmarks or someone who is curious enough to examine and analyze the code generated from multiple compilers or multiple releases. The latter is much rarer. But as these happen, the bar keeps getting raised and we do grow the test suite.
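
To make the "compare two releases" idea concrete, here is a rough sketch in Python. It assumes you already have the assembly text emitted by an old and a new compiler for the same source, in an x86-style listing where `#` starts a comment. The names `instruction_counts` and `flag_growth` and the 1.25 threshold are made up for illustration. Instruction count is only a crude proxy for code quality, so this just narrows down which functions a human should look at; it doesn't decide whether a difference is an improvement or a regression.

```python
import re

# A label is an identifier followed by a colon at the start of the line.
LABEL = re.compile(r"^([A-Za-z_.$][\w.$]*):")

def instruction_counts(asm: str) -> dict:
    """Count instructions under each function label in an assembly listing."""
    counts = {}
    current = None
    for raw in asm.splitlines():
        line = raw.split("#", 1)[0].strip()  # assumption: '#' starts a comment
        if not line:
            continue
        m = LABEL.match(line)
        if m:
            name = m.group(1)
            # Local labels such as ".LBB0_1" stay inside the enclosing function.
            if not name.startswith("."):
                current = name
                counts[current] = 0
            continue
        if line.startswith("."):  # assembler directive, not an instruction
            continue
        if current is not None:
            counts[current] += 1
    return counts

def flag_growth(old_asm: str, new_asm: str, threshold: float = 1.25) -> dict:
    """Return functions whose instruction count grew by at least `threshold`x."""
    old = instruction_counts(old_asm)
    new = instruction_counts(new_asm)
    flagged = {}
    for name, n_old in old.items():
        n_new = new.get(name)
        if n_new is not None and n_old > 0 and n_new / n_old >= threshold:
            flagged[name] = (n_old, n_new)
    return flagged

# Hypothetical listings for the same function from two compiler releases.
OLD = """
sum:
    lea eax, [rdi + rsi]
    ret
"""
NEW = """
sum:
    mov eax, edi
    add eax, esi
    ret
"""

print(flag_growth(OLD, NEW))  # {'sum': (2, 3)}
```

Anything this flags still needs someone to read the code, since a longer sequence can be faster (unrolling, vectorization) and a shorter one can be slower.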

Google234 · on March 30, 2021

For those reading now: it's not LLVM's fault. It was a Rust change that disabled it.