
The bug is about two related floating-point concepts: contraction and excess precision.

The C and C++ standards mandate that floating-point operations follow the IEEE-754 spec. For example, the result of any elementary operation (+, -, *, /) must be the floating-point number "closest" to the exact result of the operation, for some definition of "closest" (i.e., according to a user-modifiable rounding mode). If we denote this rounding operation by fp64(), then the C code

    double x = a + b;
results in

    x := fp64(a + b)
Furthermore, the C code

    double y = a + b * c;
results in

    y := fp64(a + fp64(b * c))
These restrictive definitions have the huge advantage of allowing portability and determinism of floating-point operations: regardless of the platform, architecture, or compiler, the values of x and y are mandated down to the bit representation. Also, this most often does not come at any performance cost, since most architectures have IEEE-754-compliant instructions.

But then, there are exceptions. For example, the old x87 FPU instructions would allow one to do:

    y := fp64(a + fp80(b * c))
where fp80() uses the internal 80-bit x87 FPU registers. This is "excess precision" (80 instead of 64 bits), and it would generally be faster than the standard-compliant code. A more recent example: since Haswell, Intel CPUs have built-in fused multiply-add (FMA) instructions, which skip the inner rounding altogether:

    y := fp64(a + b * c)
This is a case of "contraction" in GCC parlance, and it is also generally slightly faster than the standard-compliant code.
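
To make the notation concrete, here is a minimal C sketch of the two extremes. The function names and the probe inputs are mine, not anything from GCC or the bug report. inner_rounded() uses a volatile store to force fp64(b * c) before the addition; inner_wide() keeps the product in long double, which only makes a difference where long double is wider than double (e.g. the 80-bit x87 format on x86 Linux).

```c
#include <float.h>

/* fp64(a + fp64(b * c)): storing the product to a volatile double forces
   it to be rounded to 64 bits, so it cannot be fused into the addition
   or kept in a wider register. */
double inner_rounded(double a, double b, double c) {
    volatile double product = b * c;
    return a + product;
}

/* Product and sum are computed in long double, with one final rounding
   to double. Assumption: this only differs from inner_rounded() when
   long double is wider than double (LDBL_MANT_DIG > DBL_MANT_DIG). */
double inner_wide(double a, double b, double c) {
    long double product = (long double)b * (long double)c;
    return (double)((long double)a + product);
}
```

With a = -1, b = 1 + 2^-27, c = 1 - 2^-27, the exact product is 1 - 2^-54, which rounds to 1.0 in fp64. So inner_rounded() returns 0, while inner_wide() returns -2^-54 when long double has at least a 64-bit significand. For these particular inputs the wide product is exact, so that is also the value a true FMA would give.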

Both "excess precision" and "contraction" seem like win-wins: more accuracy and more performance. However, since we cannot control exactly when the compiler applies them and when it does not, they come at the cost of portability / determinism / reproducibility. Even just a compiler version change could give you (slightly) different results.
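
A hedged way to check which choice a given build made: write the expression plainly and look at the result for one probe input. This is only a sketch; mul_add() and probe_inner_rounding() are hypothetical names, and a single input says nothing about what the compiler does elsewhere in a program.

```c
/* The expression as a programmer would normally write it. Whether b * c
   is rounded to double on its own is left to the compiler and its flags. */
double mul_add(double a, double b, double c) {
    return a + b * c;
}

/* Returns 0 if the product was rounded to fp64 before the addition,
   1 if it was not (contraction or excess precision),
   -1 for any other result (not expected in round-to-nearest mode). */
int probe_inner_rounding(void) {
    /* volatile keeps the compiler from folding the result at compile time */
    volatile double a = -1.0;
    volatile double b = 1.0 + 0x1p-27;
    volatile double c = 1.0 - 0x1p-27;

    /* Exact product is 1 - 2^-54; rounded to fp64 it becomes 1.0. */
    double y = mul_add(a, b, c);

    if (y == 0.0)
        return 0;
    if (y == -0x1p-54)
        return 1;
    return -1;
}
```

Going by the Compiler Explorer links in this comment, I would expect a GCC build targeting Haswell (my guess at the flags: -O2 -march=haswell) to give 1, and the same build with -std=c99 to give 0. That can change with the compiler and its version, which is rather the point.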

By default (in "GNU" mode), GCC enables both excess precision and contraction. In C, it now exposes command-line options to disable them, and specifying a standard (e.g. -std=c99) disables both. However, no such option existed in C++. From the man page:

    -fexcess-precision=standard is not implemented for languages other than C.
The commit linked in the bug report adds the options for C++.
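
For the excess-precision half, C99 has a standard macro, FLT_EVAL_METHOD in <float.h>, that reports how the implementation evaluates intermediate results. A small sketch of reading it follows; the helper names are mine, and the macro says nothing about contraction.

```c
#include <float.h>

/* Describe a value of C99's FLT_EVAL_METHOD. */
const char *eval_method_name(int method) {
    switch (method) {
    case 0:
        return "each operation evaluated in its own type";
    case 1:
        return "float and double operations evaluated as double";
    case 2:
        return "all operations evaluated as long double";
    default:
        /* -1 means indeterminable; other negative values are
           implementation-defined. */
        return "indeterminable or implementation-defined";
    }
}

/* Description for the current compiler, target and flags. */
const char *current_eval_method(void) {
    return eval_method_name(FLT_EVAL_METHOD);
}
```

A value of 2 would correspond to the fp80() case described earlier, and 0 to the strict fp64() one. Which value a given GCC target and flag combination reports is something to check rather than assume.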



Examples on Compiler Explorer:

By default, GCC generates FMA instructions on Haswell and later:

https://godbolt.org/z/GKb7G4nW9

but it does not with -std=c99 on the command line:

https://godbolt.org/z/KTnqcT6aW

Similarly, on 32-bit x86, floating-point arithmetic uses x87 by default, and some intermediate calculations are done in 80-bit arithmetic:

https://godbolt.org/z/4q31oEe14

With -std=c99 instead, we can see additional loads/stores, which round intermediate results to 64 bits:

https://godbolt.org/z/qdf4hceca



