
> Summing floats should by default be taken to have error bounds, and any answer in those bounds is valid. If you know something special about the floats you are inputting, the language should have some means to explicitly encode that. It shouldn’t be the most basic loop, that’s the default case, so it should give the best performance.

This is a lot of "should" for something that basically doesn't happen. The only information you give the compiler is the order of operations in the original expression.

It is a total nightmare if arithmetic is not stable between builds. Rebuilding the software and running it on the same input should not produce different results.
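To make the ordering point concrete, here is a minimal Python sketch (the values are picked purely for illustration, and it assumes IEEE 754 doubles, which CPython uses on essentially every platform). The same three numbers give different sums depending on how the additions are grouped, which is why a compiler that regroups them can change the result:

```python
# Three doubles whose sum depends on grouping (illustrative values).
a, b, c = 1e16, -1e16, 1.0

# Left to right: the big values cancel first, so the 1.0 survives.
left = (a + b) + c

# Regrouped: 1.0 is absorbed when added to -1e16, because neighbouring
# doubles near 1e16 are 2 apart, so the 1.0 is lost before cancellation.
right = a + (b + c)

print(left, right)
```

Both answers are within the usual error bound for the sum, but they are not the same number.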

(Long ago I encountered the special Intel version of this: because the FPU used 80-bit registers internally but 64-bit in memory, changing when a register fill/spill happened would change when your answer got rounded, and thereby change the results. You can set a global FPU flag at the start of the program to force rounding on every operation.)






It does happen in languages and libraries with a higher level of abstraction. MKL, for example, will do whatever it wants for accumulations (which in practice means it will take advantage of SIMD, because that’s a big reason why people use the library) unless you specifically request otherwise via their “Conditional Numerical Reproducibility” flag.
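As a rough sketch of why a SIMD accumulation can differ from a plain loop (this is not MKL's actual algorithm; `sum_sequential` and `sum_lanes` are hypothetical helpers written for this comment): a vectorized sum typically keeps one partial sum per lane and combines them at the end, which is a different order of additions from the scalar loop.

```python
def sum_sequential(xs):
    """Plain left-to-right accumulation, like the basic scalar loop."""
    total = 0.0
    for x in xs:
        total += x
    return total


def sum_lanes(xs, lanes=2):
    """Mimic a SIMD-style sum: one partial sum per lane, combined at the end.

    Hypothetical illustration only; real libraries pick lane counts and
    reduction orders based on the hardware.
    """
    acc = [0.0] * lanes
    for i, x in enumerate(xs):
        acc[i % lanes] += x  # element i goes to lane i mod lanes
    total = 0.0
    for partial in acc:
        total += partial
    return total


xs = [1e16, 1.0, -1e16, 1.0]  # illustrative input; the exact sum is 2.0
print(sum_sequential(xs), sum_lanes(xs, lanes=2))
```

On this input the scalar loop loses one of the 1.0 terms while the two-lane version happens to keep both. Neither is "the" right answer in general; a different input can favour the other order.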

I think that was the right way to do it. BLAS made the right decision by defining these things in terms of sums and dot products instead of step-by-step instructions.

It will always be possible to write programs that run differently on different hardware or with different optimization levels. If somebody is writing code for floating point computations and expects exact bit patterns, that is possible, of course, since all the rounding behavior is clearly specified. But usually this is an error.
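A small Python sketch of the difference between expecting exact bits and accepting an error bound (the default tolerance of `math.isclose` is just a stand-in here; a real program should pick a bound that suits the computation):

```python
import math

computed = 0.1 + 0.2
expected = 0.3

# Exact comparison asks for identical bit patterns and fails here,
# because the two values differ in the last bit of the significand.
exact_match = (computed == expected)

# Comparing within a tolerance accepts any answer inside the bound.
close_enough = math.isclose(computed, expected)

# float.hex() shows the underlying bits, if you really need to inspect them.
print(exact_match, close_enough, computed.hex(), expected.hex())
```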


> You can set a global FPU flag at the start of the program to force rounding on every operation

This doesn’t do quite the same thing. It still uses the wider exponent range of the 80-bit type.



