>No it isn't; it costs an extra instruction to initialize that register or memor...

kazinator · on March 4, 2019

> GCC is doing the most efficient thing possible with eax here.

It seems that the most efficient thing with %eax is not to mention it in an instruction.

There may be some quirk/feature of modern Intel processors that clearing a register will dis-entangle it from considerations of prior hazards. So that is to say, when we execute 'xorl %eax, %eax', the processor knows that any prior value in `%eax` is no longer required by subsequent code. A new lifetime has begun for that register. Therefore, there is no need to execute a pipeline stall to wait for that prior value, if it so happens that that prior value is not yet ready. Internally to the processor, the ISA-level register `%eax` can be assigned to a different internal register at that point, under the discipline of "register renaming".

In the absence of any such hardware considerations, the clearing of %eax is pure waste.
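
To make the renaming idea above concrete, here is a toy model in C. It is only a sketch under loud assumptions: `RenameModel`, `model_init`, `reader_can_issue` and `zero_eax` are made-up names, the pool size is arbitrary, and no real processor is implemented this way. It just shows why an instruction that overwrites `%eax` without reading it need not wait for the old value.

```c
#include <stdbool.h>

#define NUM_PHYS 8  /* size of the toy physical register pool (arbitrary) */

/* Toy model: one architectural register (%eax) mapped onto physical registers. */
typedef struct {
    bool ready[NUM_PHYS];  /* has this physical register's value been produced? */
    int  value[NUM_PHYS];
    int  eax_map;          /* physical register currently standing in for %eax */
    int  next_free;        /* next unused physical register (no recycling here) */
} RenameModel;

/* Start with %eax waiting on a slow producer, e.g. a load that missed cache. */
void model_init(RenameModel *m)
{
    for (int i = 0; i < NUM_PHYS; i++) {
        m->ready[i] = false;
        m->value[i] = 0;
    }
    m->eax_map = 0;
    m->next_free = 1;
}

/* An instruction that reads %eax can only issue once the mapped value is ready. */
bool reader_can_issue(const RenameModel *m)
{
    return m->ready[m->eax_map];
}

/* Model of `xorl %eax, %eax`: the old value is never read, so %eax is simply
   pointed at a fresh physical register that already holds zero.
   Sketch only: the caller must not exhaust the pool. */
void zero_eax(RenameModel *m)
{
    int p = m->next_free++;
    m->value[p] = 0;
    m->ready[p] = true;
    m->eax_map = p;
}
```

In this model, a reader of `%eax` is blocked right after `model_init`, and becomes issuable as soon as `zero_eax` has run, even though the original producer never finished.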

> it has to initialize eax before returning

Where is that requirement coming from? Not from ISO C, certainly. The behavior is undefined. It was the programmer's job to ensure that the return value is calculated by a well-defined expression, not that of the compiler. From that standpoint alone, it's perfectly conforming to emit code that just leaves the existing content in %eax and returns.

gpderetta · on March 4, 2019

> There may be some quirk/feature of modern Intel processors

Not so much a quirk as the basic behaviour of any OoO processor, due to register renaming. This is true on the vast majority of Intel (and AMD) CPUs (in fact probably all the currently sold ones, as even Atom has acquired a limited OoO engine and the Knights variants are no more).

It is so fundamental that, as described elsethread, zeroing a register is almost free (it still costs decode bandwidth and icache space).

Bottom line: in the vast majority of cases, zeroing a register is a win.

Edit: having said that, I doubt that GCC is trying to optimize the UB case; the optimizer probably just requires that the variable have a value at that point.

tux3 · on March 4, 2019

>It seems that the most efficient thing with %eax is not to mention it in an instruction.

I think what you're missing is that UB is a property of the execution of the program, and not just of the code that was compiled.

If in practice the if branch is always taken, then there is NO undefined behavior (surprisingly, perhaps)!

That means GCC has to be prepared to handle that case, and has to set eax to 0 when the if branch is taken.
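
The function being argued about is not shown in this part of the thread, so the following is a hypothetical stand-in (`maybe_init` and `cond` are made-up names) with the shape the discussion implies: a local that is assigned on only one path and then returned.

```c
/* Hypothetical stand-in for the function under discussion. */
int maybe_init(int cond)
{
    int ret;          /* indeterminate until assigned */
    if (cond)
        ret = 0;      /* the only path that gives ret a value */
    return ret;       /* well-defined only if the branch above was taken */
}
```

A call such as `maybe_init(1)` is well-defined and has to return 0, while `maybe_init(0)` reads an indeterminate value. Since the compiler owes nothing to the second case, one conforming strategy is to return 0 on every path, which would account for an unconditional zeroing of `%eax`.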

>Where is that requirement coming from? Not from ISO C, certainly. The behavior is undefined.

Not so! You don't know until the program is run.

-

Edit: (removed some speculation about assuming the branch is always taken, that was not relevant)

>There may be some quirk/feature of modern Intel processors that clearing a register will dis-entangle it from considerations of prior hazards. So that is to say, when we execute 'xorl %eax, %eax', the processor knows that any prior value in `%eax` is no longer required by subsequent code. A new lifetime has begun for that register.

That's absolutely right, by the way. I think this was introduced with Sandy Bridge, in 2011.

kazinator · on March 4, 2019

No, I mean the behavior is undefined in the case where the function returns the indeterminate value, which gets initialized anyway.

Look, GCC (what I have here: 7.3.0) is doing this even for the following trivial function:

    int undef(void)
    {
      int ret;
      return ret;
    }

With -O2 this turns out:

    xorl %eax, %eax
    ret

do we still know until run-time that this has UB?

tux3 · on March 4, 2019

>Look, GCC (what I have here: 7.3.0) is doing this even for the following trivial function:

You're absolutely right that in this case, it's wasteful. In fact, if you try Clang you will see that it only emits a ret.

Maybe GCC is doing you a courtesy (or maybe nobody taught it to completely omit an Undef return value, so it just picks zero!). Either way, it's unconditionally UB, so GCC can do whatever it likes.

>do we still know until run-time that this has UB?

Well, there is no branch here, is there? I'm clearly not going to disagree with you on this one =]

To be clear: UB is still a property of the execution of the program, but here it's not hard to statically determine that all executions of this program are Undefined Behavior.

kazinator · on March 4, 2019

Moreover, it still does it with "-m32 -mtune=i386". :)