Can you safely parse a double when you need a float? (lemire.me)
28 points by ibobev on Dec 1, 2021 | 4 comments



> With floating-point numbers, I get this effect with the string “0.004221370676532388” (for example). When I round it directly to a float, I get 0.0042213709093630313873291015625. If I first round it to a double and then to a float, I get 0.0042213709093630313873291015625. You probably cannot tell unless you are a machine, but parsing directly to a float is 2e-7 % more accurate.

I scratched my head at this, as those are literally the same number.

Looking into it a bit, the correctly rounded float should be 0.0042213709, while the incorrect, twice-rounded value is 0.0042213704.

To see what is going on, here is the number in question in binary:

  0.00000001000101001010011011011001000000000000000000000000000001...
  Rounded to double is
  0.000000010001010010100110110110010000000000000000000000000000
  And when that is rounded to float (round to even mode) you get
  0.0000000100010100101001101101100
  While the correct number is
  0.0000000100010100101001101101101
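
If you want to reproduce this, here is a minimal C sketch (assuming strtof/strtod are correctly rounded, as they are in glibc; the variable names are mine):

  #include <stdio.h>
  #include <stdlib.h>

  int main(void) {
      const char *s = "0.004221370676532388";
      float direct = strtof(s, NULL);        /* one rounding: decimal -> float */
      float twice  = (float)strtod(s, NULL); /* two roundings: decimal -> double -> float */
      printf("%.8g\n", direct);              /* 0.0042213709 (correct) */
      printf("%.8g\n", twice);               /* 0.0042213704 (double-rounded) */
      return 0;
  }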


Double rounding also occurred with the Intel x87 FPU (before compilers switched to SSE by default), which kept values in 80-bit floating-point registers internally and rounded them back to 64 bits on stores or spills. Because of that, you'll find double rounding discussed in FPU books and papers.

But the error from double rounding should be tiny, and occurs extremely rarely. The error from a single rounding (as you would get from parsing the float32 directly) is at most 0.5 ULP, and the error with 64-bit to 32-bit double rounding is at most 0.5 + 2^-30 ULP (an extra 2^-30 ULP) [1].

Comparing the two resulting 32-bit floats gives you the wrong impression about the actual additional error from double rounding (the 2^-30 ULP, compared to the 0.5 ULP from going from a real number to floating point). For double rounding to have any effect at all, the real input number has to be extremely close to the midpoint of those two 32-bit floats, and the double rounding really just nudges the result ~0.5 ULP down instead of ~0.5 ULP up (or vice versa).
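
You can check the midpoint effect directly: for the example above, the decimal rounds to a double that lands exactly on the midpoint of two adjacent floats, and round-to-even then picks the even (wrong) neighbour. A sketch, again assuming a correctly rounded strtod/strtof (link with -lm; nextafterf is C99):

  #include <math.h>
  #include <stdio.h>
  #include <stdlib.h>

  int main(void) {
      const char *s = "0.004221370676532388";
      double d  = strtod(s, NULL);       /* first rounding, to double */
      float  hi = strtof(s, NULL);       /* correctly rounded float (odd last bit) */
      float  lo = nextafterf(hi, 0.0f);  /* adjacent float below (even last bit) */
      /* exact in double: both floats have 24-bit significands */
      double mid = 0.5 * ((double)lo + (double)hi);
      printf("double == float midpoint? %s\n", d == mid ? "yes" : "no");
      return 0;                          /* prints "yes" */
  }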

And the chances of a double rounding that affects the output are tiny; on the x87 it was approximately 1 in 4096 [2], and for float64 to float32 it's closer to one in a billion (the 29 bits of the double beyond float precision must form exactly the midpoint pattern of a 1 followed by 28 zeros, a 2^-29 chance).

([1] "When double rounding is odd", Boldo and Melquiond, 2004; [2] "Handbook of Floating-Point Arithmetic", Muller et al., 2010)


You were able to control the significand (mantissa) precision on the x87. I know some people think that feature was for performance, but the primary reason to control it is to avoid this exact problem.
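
For reference, the precision-control field is bits 8-9 of the x87 control word (00 = 24-bit significand, 10 = 53-bit, 11 = 64-bit extended). A sketch, assuming GCC/Clang inline assembly on x86; the helper name is mine:

  #include <stdint.h>

  /* Set the x87 precision-control field so results are rounded to a
     24-bit (float) significand in-register, instead of being kept at
     the default 64-bit extended significand and rounded again later. */
  static void x87_set_single_precision(void) {
      uint16_t cw;
      __asm__ volatile ("fnstcw %0" : "=m"(cw)); /* read control word */
      cw &= ~0x0300;                             /* PC bits = 00b: 24-bit */
      __asm__ volatile ("fldcw %0" : : "m"(cw)); /* write it back */
  }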


I suppose this comes up for SQLite, which doesn't appear to have a native 4-byte float type: the FLOAT and DOUBLE column type names both map to the same 8-byte IEEE 754 REAL storage.
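
A quick check with the SQLite C API (typeof() reports the storage class; both columns come back as "real" and hold the identical double):

  #include <stdio.h>
  #include <sqlite3.h>

  int main(void) {
      sqlite3 *db;
      sqlite3_stmt *st;
      sqlite3_open(":memory:", &db);
      sqlite3_exec(db, "CREATE TABLE t(a FLOAT, b DOUBLE);"
                       "INSERT INTO t VALUES"
                       " (0.004221370676532388, 0.004221370676532388);",
                   NULL, NULL, NULL);
      sqlite3_prepare_v2(db, "SELECT typeof(a), typeof(b), a = b FROM t",
                         -1, &st, NULL);
      if (sqlite3_step(st) == SQLITE_ROW)
          printf("%s %s equal=%d\n",             /* real real equal=1 */
                 (const char *)sqlite3_column_text(st, 0),
                 (const char *)sqlite3_column_text(st, 1),
                 sqlite3_column_int(st, 2));
      sqlite3_finalize(st);
      sqlite3_close(db);
      return 0;
  }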



