Author here; happy to answer any questions.

CalChris · on Dec 4, 2021

clang -S -emit-llvm xyz.cpp generates an optnone attribute.

  attributes #0 = { noinline norecurse nounwind optnone ssp uwtable "darwin-stkchk-strong-link" "disable-tail-calls"="false" "frame-pointer"="all" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="true" "probe-stack"="___chkstk_darwin" "stack-protector-buffer-size"="8" "target-cpu"="penryn" "target-features"="+cx16,+cx8,+fxsr,+mmx,+sahf,+sse,+sse2,+sse3,+sse4.1,+ssse3,+x87" "tune-cpu"="generic" "unsafe-fp-math"="false" "use-soft-float"="false" }

This means you can't then run the IR through opt and watch it get optimized pass by pass with opt --print-after-all -O3 xyz.ll -S -o abc.ll. Clang doesn't take the --print-after-all flag.

What is the purpose of the optnone attribute? Why is Clang preventing later optimization?

jcranmer · on Dec 4, 2021

The purpose is to keep optimization flags around if you're, say, compiling with LTO or PGO--cases where the regular compilation process will keep bitcode files around.

The usual trick I do is clang -S -emit-llvm -O2 -Xclang --disable-llvm-passes.

> Clang doesn't take the --print-after-all flag.

You have to prefix it with -mllvm, e.g., clang -mllvm -print-after-all -O2 foo.cpp.
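A minimal sketch of a test input for trying this workflow. The file name `sum.c` and the function `sum_to` are made up for illustration, and the commands in the comment are adapted from this thread rather than checked against any particular LLVM release.

```c
/* sum.c: hypothetical input for watching LLVM optimize pass by pass.
 *
 * Commands adapted from this thread (file names are placeholders, and the
 * cc1 flag is shown in its single-dash spelling):
 *   clang -S -emit-llvm -O2 -Xclang -disable-llvm-passes sum.c -o sum.ll
 *   opt --print-after-all -O3 sum.ll -S -o sum.opt.ll
 * or, staying inside clang:
 *   clang -mllvm -print-after-all -O2 -c sum.c
 */

/* A simple counted loop gives the optimizer something visible to transform. */
int sum_to(int n) {
    int total = 0;
    for (int i = 1; i <= n; ++i)
        total += i;
    return total;
}
```

Because the first command asks for -O2 but disables the LLVM passes, the emitted functions should not carry optnone, so opt is free to transform them.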

woodruffw · on Dec 4, 2021

I’m not sure about the why, but it’s been that way for quite a while (since at least LLVM 5). You can explicitly override the optnone by passing `-disable-O0-optnone` to the underlying frontend rather than the driver (i.e., you will likely need to prefix it with `-Xclang`).

Edit: if I had to wager a guess, it’s because someone generated an IR module at some point with -O0 and then re-ran opt to perform a specific non-destructive analysis, at which point LLVM happily ran all of its default passes. So someone decided that initially building with -O0 is equivalent to saying “my code should never be optimized, even in subsequent invocations of the compiler.”
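For comparison, Clang also exposes optnone as a per-function source attribute, which makes it easy to see the IR attribute without compiling at -O0. A small sketch, assuming a Clang-compatible compiler; the macro `DEMO_OPTNONE` and both function names are hypothetical, and the guard lets other compilers ignore the attribute.

```c
/* Use the attribute only when the compiler reports support for it. */
#if defined(__has_attribute)
#  if __has_attribute(optnone)
#    define DEMO_OPTNONE __attribute__((optnone))
#  endif
#endif
#ifndef DEMO_OPTNONE
#  define DEMO_OPTNONE /* not supported: expands to nothing */
#endif

/* With Clang, this function should keep optnone (and noinline) in the IR
 * even when the file is compiled at -O2. */
DEMO_OPTNONE int square_unoptimized(int x) {
    return x * x;
}

/* Same body without the attribute, for a side-by-side look at the IR. */
int square_optimized(int x) {
    return x * x;
}
```

Running clang -S -emit-llvm -O2 on this file and comparing the two attribute groups should show the difference.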

lelouch11 · on Dec 4, 2021

Hi, I had a general question regarding intrinsics which you might be able to answer. `llvm.stacksave` and `llvm.stackrestore` are used for implementing VLAs in C (to my knowledge). Is it required to emit those for variable size allocas, or could you skip them? If skipping is possible, how is the case handled when the basic block does not post-dominate an exit block?

woodruffw · on Dec 4, 2021

AFAIK, LLVM doesn’t actually require those intrinsics for VLAs — the `alloca` instruction can take a non-immediate for the stack object size. I can’t remember off the top of my head whether I’ve seen LLVM consistently include the intrinsics as well.

As for the dominator case: it’s possible that it does need them then. But for a really trivial VLA (like at the level of the function’s frame) it shouldn’t.
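A sketch of the non-trivial case, where a VLA lives in a loop body instead of at the top of the function's frame. The function name is hypothetical. Compiling it with clang -S -emit-llvm is one way to check whether `llvm.stacksave` and `llvm.stackrestore` show up around the variable-size `alloca`; whether they do is something to verify against your Clang version, not a claim made here.

```c
#include <stddef.h>

/* For each n in 1..rows, fills an n-element VLA with squares and adds the
 * last element to a running total. The VLA's lifetime ends at the closing
 * brace of the loop body, so its stack space can be reclaimed every
 * iteration instead of accumulating until the function returns. */
long sum_last_squares(size_t rows) {
    long total = 0;
    for (size_t n = 1; n <= rows; ++n) {
        int scratch[n]; /* variable-size allocation, new size each iteration */
        for (size_t i = 0; i < n; ++i)
            scratch[i] = (int)(i * i);
        total += scratch[n - 1];
    }
    return total;
}
```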

jcranmer · on Dec 4, 2021

The only case where I recall seeing @llvm.{stacksave,stackrestore} used was when LLVM inlined a function with a non-entry-block alloca into another function, where the intrinsics got added immediately before/after what used to be the function call.