> However, to make the argument that an optimization is correct, the exact semantics of LLVM IR (what the behavior of all possible programs is and when they have UB) needs to be documented.
This is a great point as to why formal semantics of programming languages matters. Even if an optimization seems "obviously" correct, finicky things like introducing the possibility of UB can, as the post outlines, cascade into compiler passes that change the program more and more drastically. The post mentions that one need not go overly formal with the semantics, but I'll demonstrate what could happen when one does.
One possible formulation of semantics is denotational semantics, which describes how to map a program's source syntax into values, i.e. eval : Expr -> Val. So when we have an optimization opt : Expr -> Expr, the desired correctness property for that optimization is that
Definition opt_correct (opt : Expr -> Expr) := forall (e : Expr), eval e = eval (opt e).
When we want to rewrite, say, e + 0 ==> e for any expression e, the correctness of that rewrite can be stated and proved directly.
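To make this concrete, here is a minimal self-contained Coq sketch of what such a statement and proof can look like, assuming a toy expression language with only literals and addition (the Expr constructors and the name opt_plus_zero are mine, purely illustrative):

```coq
Require Import Arith.

(* Toy syntax: natural-number literals and addition. *)
Inductive Expr : Type :=
  | Lit : nat -> Expr
  | Plus : Expr -> Expr -> Expr.

(* Denotational semantics: map syntax to values. *)
Fixpoint eval (e : Expr) : nat :=
  match e with
  | Lit n => n
  | Plus a b => eval a + eval b
  end.

Definition opt_correct (opt : Expr -> Expr) :=
  forall e : Expr, eval e = eval (opt e).

(* The rewrite e + 0 ==> e, applied once at the root. *)
Definition opt_plus_zero (e : Expr) : Expr :=
  match e with
  | Plus a (Lit 0) => a
  | _ => e
  end.

Theorem opt_plus_zero_correct : opt_correct opt_plus_zero.
Proof.
  intros e. destruct e as [n | a b]; simpl.
  - reflexivity.
  - destruct b as [[| m] | b1 b2]; simpl.
    + (* e = Plus a (Lit 0): goal is eval a + 0 = eval a *)
      apply Nat.add_0_r.
    + reflexivity.
    + reflexivity.
Qed.
```

The only non-trivial case is the one the rewrite actually fires on, where the goal reduces to eval a + 0 = eval a, a standard arithmetic lemma.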
Not sure why you wanted to spend the time to prove "correct optimizations can be combined", but I take issue with that proof. It only works for optimizations that give code exactly the same behavior, which is severely limiting.
There is a serious mistake here: the definition of a correct optimization is of significantly more interest than the result that correct optimizations compose. We want to apply optimizations individually even more than we want to apply them together.
Putting things another way, this definition is clearly not given as part of a proof; it is given for its own sake, and the proof uses it.
One problem is that the as-if rule only says that the generated code has to match a possible execution of the abstract machine. Let's say an optimization changes the address where a variable gets allocated. That's an extremely valid optimization, even though the program can observe the change. But that would make it fail the "eval e = eval (opt e)" rule in siraben's proof. The same for picking a different order to execute the functions in "f() + g()".
The other is optimizing around undefined behavior. The as-if rule only applies to valid inputs. Optimizing a loop by assuming it won't overflow would get rejected by that proof. So would optimizing code so that it won't segfault.
And depending on how exactly that eval test works, it might effectively mark every variable as volatile too.
> But that would make it fail the "eval e = eval (opt e)" rule in siraben's proof. The same for picking a different order to execute the functions in "f() + g()".
This is really a question of how the semantics are formulated. The eval function I gave doesn't model an abstract machine, so there is no notion of "variable allocation" or "final state" to check; the semantics simply doesn't account for them.
To scale this to a more realistic model with nondeterminism, heaps and so on, the semantics needs to become relational. For instance, eval would now be a relation between two states of the machine, and a proof of correctness would be something like [0], which takes into account all possible states of the heap.
Equality would no longer be used to relate two "equivalent" programs; instead one would pick some other equivalence relation with the properties one cares about. For instance, two programs would be heap-equivalent if they have exactly the same effect on the heap, or UB-equivalent if they can exhibit UB at the "same" places (again, under another relation).
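To sketch what such a relational formulation might look like, here is a hypothetical toy imperative language in Coq (Heap, Stmt, and all names here are mine, purely illustrative):

```coq
(* A heap maps addresses to optionally-allocated values. *)
Definition Addr := nat.
Definition Heap := Addr -> option nat.

Inductive Stmt : Type :=
  | Skip : Stmt
  | Store : Addr -> nat -> Stmt
  | Seq : Stmt -> Stmt -> Stmt.

Definition update (h : Heap) (a : Addr) (v : nat) : Heap :=
  fun a' => if Nat.eqb a a' then Some v else h a'.

(* eval is now a relation between an initial and a final heap,
   rather than a function from syntax to a single value. *)
Inductive eval : Stmt -> Heap -> Heap -> Prop :=
  | EvalSkip  : forall h, eval Skip h h
  | EvalStore : forall h a v, eval (Store a v) h (update h a v)
  | EvalSeq   : forall s1 s2 h h' h'',
      eval s1 h h' -> eval s2 h' h'' -> eval (Seq s1 s2) h h''.

(* Two programs are heap-equivalent when they relate exactly the
   same pairs of heaps; correctness is stated up to this relation
   instead of up to equality of values. *)
Definition heap_equiv (s1 s2 : Stmt) : Prop :=
  forall h h', eval s1 h h' <-> eval s2 h h'.

Definition opt_correct_rel (opt : Stmt -> Stmt) : Prop :=
  forall s, heap_equiv s (opt s).
```

The point of the sketch is that the equivalence relation (heap_equiv here) is now a design choice: a coarser relation admits more optimizations, exactly as described above.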
One claim in the blog post is that correct optimization passes compose: given two correct passes opt1 and opt2, applying one after the other yields another correct pass. In this formulation that can be stated as opt_correct opt1 -> opt_correct opt2 -> opt_correct (fun e => opt2 (opt1 e)), and the proof follows directly by rewriting with the two correctness hypotheses.
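The statement and proof could look like the following Coq sketch (my reconstruction, not necessarily the original script), abstracted over any Expr, Val, and eval so it applies to any denotational formulation:

```coq
Section Compose.
  Variable Expr : Type.
  Variable Val : Type.
  Variable eval : Expr -> Val.

  (* Same definition of correctness as before. *)
  Definition opt_correct (opt : Expr -> Expr) :=
    forall e : Expr, eval e = eval (opt e).

  Theorem compose_correct (opt1 opt2 : Expr -> Expr) :
    opt_correct opt1 ->
    opt_correct opt2 ->
    opt_correct (fun e => opt2 (opt1 e)).
  Proof.
    intros H1 H2 e.
    (* eval e = eval (opt1 e) by H1, and
       eval (opt1 e) = eval (opt2 (opt1 e)) by H2. *)
    rewrite (H1 e).
    exact (H2 (opt1 e)).
  Qed.
End Compose.
```

Note the proof never inspects the expressions; it only chains the two equalities, which is why it works for any semantics given as a function into values.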