Hacker News new | past | comments | ask | show | jobs | submit login

Fused multiply add applies equally to scalar and vectorized code (and C actually allows compilers to fuse them; there's -ffp-contract=off / the FP_CONTRACT pragma to turn that off); the compiler/autovectorizer can trivially just leave multiply & add as separate if so requested (slower than having them fused? perhaps. But no impact at all on scalar vs vector given that both have the same fma applicability).

For <math.h> errno, there's -fno-math-errno; indeed included in -ffast-math, but you don't need the entirety of that mess for this.

Loops with a float accumulator is I believe the only case where -ffast-math is actually required for autovectorizability (and even then iirc there are some sub-flags such that you can get the associativity-assuming optimizations while still allowing NaN/inf).




Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: