Hacker News

    > 0.1 + 0.1 + 0.1 == 0.3
    
    False

I always tell my students that if they (might) have a float, and are using the `==` operator, they're doing something wrong.
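
A minimal Python sketch of the comparison above, assuming IEEE 754 doubles (what CPython uses on mainstream platforms). `math.isclose` with its default relative tolerance stands in for "some tolerance-based check"; the right tolerance depends on the application.

```python
import math

total = 0.1 + 0.1 + 0.1

exact_equal = (total == 0.3)             # bit-exact comparison
close_enough = math.isclose(total, 0.3)  # default rel_tol is 1e-09

print(exact_equal)   # False
print(close_enough)  # True
print(repr(total))   # 0.30000000000000004
```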


That has more to do with decimal <-> binary conversion than arithmetic/comparison. Using hex literals makes it clearer

     0x1.999999999999ap-4 ("0.1")
    +0x1.999999999999ap-4 ("0.1")
    ---------------------
    =0x3.3333333333334p-4 ("0.2")
    +0x1.999999999999ap-4 ("0.1")
    ---------------------
    =0x4.cccccccccccd0p-4 ("0.30000000000000004")
    !=0x4.cccccccccccccp-4 ("0.3")
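
For anyone who wants to reproduce the hex view, here is a small sketch using Python's `float.hex()`. Note that Python normalizes the leading digit to 1, so the exponents differ from the `p-4` alignment used in the comment; the stored values are the same.

```python
tenth = 0.1
total = tenth + tenth + tenth

# float.hex() shows the exact stored bits, with no decimal rounding involved.
print(tenth.hex())   # 0x1.999999999999ap-4
print(total.hex())   # 0x1.3333333333334p-2
print((0.3).hex())   # 0x1.3333333333333p-2

# The two results differ in the last hex digit of the significand.
print(total.hex() == (0.3).hex())  # False
```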


Absolutely nobody will think this is 'clearer'. This is a leaky abstraction, and personally I think the OP is right: == in combination with floating point constants should be limited to '0' and that's it.


We all know that 1/3 + 1/3 + 1/3 = 1, but 0.33 + 0.33 + 0.33 = 0.99. We're sufficiently used to decimal to know that 1/3 doesn't have a finite decimal representation. Decimal 1/10 doesn't have a finite binary representation, for the exact same reason that 1/3 doesn't have one in decimal — 3 is co-prime with 10, and 5 is co-prime with 2.
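
The co-primality argument can be turned into a small check. This is a sketch, and `terminates_in_base` is a name made up here: a reduced fraction has a finite expansion in a base exactly when every prime factor of its denominator also divides the base. It says nothing about precision limits of a particular float format.

```python
from fractions import Fraction
from math import gcd

def terminates_in_base(q: Fraction, base: int) -> bool:
    """True if q has a finite positional expansion in the given base."""
    d = q.denominator  # Fraction keeps itself in lowest terms
    g = gcd(d, base)
    while g > 1:       # strip every factor the denominator shares with the base
        d //= g
        g = gcd(d, base)
    return d == 1      # anything left over is co-prime with the base

print(terminates_in_base(Fraction(1, 3), 10))   # False
print(terminates_in_base(Fraction(1, 10), 10))  # True
print(terminates_in_base(Fraction(1, 10), 2))   # False
print(terminates_in_base(Fraction(1, 8), 2))    # True
print(Fraction(1, 3) * 3 == 1)                  # True
```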

The only leaky abstraction here is our bias towards decimal. (Fun fact: "base 10" is meaningless, because every base calls itself base 10)


> Fun fact: "base 10" is meaningless, because every base calls itself base 10

Maybe we should name the bases by the largest digit they have, so that we are using base 9 most of the time.


Repeating the exercise with something that is exactly representable in floating point like 1/8 instead of 1/10 highlights the difference.
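
A quick sketch of that exercise in Python, assuming IEEE 754 doubles. Every operand and every intermediate sum in the 1/8 case is a short sum of powers of two, so nothing gets rounded.

```python
eighth = 1 / 8
tenth = 1 / 10

print(eighth + eighth + eighth == 0.375)  # True
print(tenth + tenth + tenth == 0.3)       # False
print(0.125 + 0.375 == 0.5)               # True
```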


> they (might) have a float, and are using the `==` operator, they're doing something wrong.

Storage, retrieval, transmission, and serialization/deserialization systems should be able to transmit and round-trip floats without losing any bits at all.


Floats break the basic expectation of == for round-trip verification, not due to programmer error, but because NaN is non-reflexive by spec. A bit-perfect round-trip can reproduce the exact bit pattern and still fail an equality check. The problem is intrinsic to the type, not the operator.
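
A sketch of the round-trip problem, using `struct` as a stand-in for whatever storage or wire format is involved. The helper names are hypothetical, and the bit-pattern result for NaN is what typical x86-64 and ARM builds of CPython give for a quiet NaN.

```python
import struct

def float_bits(x: float) -> int:
    """Return the 64-bit IEEE 754 pattern of x as an integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]

def round_trip(x: float) -> float:
    """Serialize x to 8 bytes and read it back."""
    return struct.unpack("<d", struct.pack("<d", x))[0]

def same_bits(a: float, b: float) -> bool:
    """Bit-exact comparison, which is what a round-trip test really wants."""
    return float_bits(a) == float_bits(b)

nan = float("nan")
restored = round_trip(nan)

print(restored == nan)           # False: NaN never compares equal, even to itself
print(same_bits(restored, nan))  # True: the bits survived the round trip
print(0.0 == -0.0)               # True: == also hides the sign of zero
print(same_bits(0.0, -0.0))      # False
```

Comparing bit patterns instead of values is one way to verify a lossless round trip without tripping over NaN or signed zero.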


Well, there are many legitimate cases for using the equality operator. Insisting someone is doing something wrong is downright wrong and you shouldn't be teaching floating-point numbers. A few use cases are: floating-point values differing from default or initial values and carrying meaning, e.g. 0 or 1 translates to omitting entire operations. Then there is also the case of measuring the tiniest possible variation, when relative tolerances are not what you want. Not exhaustive. If you use == with fp, it only means you should've thought about it thoroughly.


I also like how a / b can result in infinity even if both a and b are strictly non-zero[1]. So be careful rewriting floating-point expressions.

[1]: https://www.cs.uaf.edu/2011/fall/cs301/lecture/11_09_weird_f... (division result matrix)


Anything that overflows the max float turns into infinity. You can multiply very large numbers, or divide large numbers into small ones.
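
A short Python illustration, assuming IEEE 754 doubles. Python's float multiplication and division return infinity on overflow rather than raising (only division by zero raises).

```python
import math
import sys

a = 1e200
b = 1e-200

quotient = a / b                    # true result 1e400 exceeds the largest double
product = sys.float_info.max * 2.0  # same effect through multiplication

print(a != 0.0 and b != 0.0)  # True: both operands are strictly non-zero
print(math.isinf(quotient))   # True
print(math.isinf(product))    # True
```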


Sure, though division might be a tad more surprising, since most people don't do that on an everyday basis. The specific case we had was when a colleague had rewritten

  (a / b) * (c / d) * (e / f)

to

  (a * c * e) / (b * d * f)

as a performance optimization. Each division in the original yielded roughly one because of how the variables were computed, but the rewritten form was sometimes unstable because the products could produce denormalized numbers.
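
To make the hazard concrete, here is a sketch in Python with made-up input values (the real ones are not given in the comment). The function names are hypothetical. The large inputs show the extreme case where the products overflow; the tiny inputs push the denominator product into the subnormal range, where it has far fewer significant bits.

```python
import sys

def product_of_ratios(a, b, c, d, e, f):
    """Original form: every intermediate ratio stays near its true value."""
    return (a / b) * (c / d) * (e / f)

def ratio_of_products(a, b, c, d, e, f):
    """Rewritten form: the products can leave the normal range."""
    return (a * c * e) / (b * d * f)

big = (1e200,) * 6
print(product_of_ratios(*big))  # 1.0
print(ratio_of_products(*big))  # nan (inf / inf)

# Illustrative tiny inputs: each ratio is about 3, so the true answer is about 27.
tiny = (3e-105, 1e-105, 3e-105, 1e-105, 3e-105, 1e-105)
denominator = tiny[1] * tiny[3] * tiny[5]
print(0.0 < denominator < sys.float_info.min)  # True: the product is subnormal
print(product_of_ratios(*tiny))
print(ratio_of_products(*tiny))
```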


There are plenty of cases where ‘==’ is correct. If you understand how floating point numbers work at the same depth you understand integers, then you may know the result of each side and know there’s zero error.

Anything that does an “approximately close” comparison is much slower and prone to even more subtle bugs (often trading less immediate bugs for much harder to find and fix bugs).

For example, I routinely make unit tests with inputs designed so answers are perfectly representable, so tests do bit exact compares, to ensure algorithms work as designed.
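
A sketch of what such a test could look like in Python. `lerp` is a hypothetical function under test; the inputs are chosen so that every intermediate value is a small sum of powers of two, which makes exact `==` a valid assertion.

```python
def lerp(a: float, b: float, t: float) -> float:
    """Linear interpolation between a and b."""
    return a + (b - a) * t

def test_lerp_exact() -> None:
    # All inputs, intermediates, and results are exactly representable in binary.
    assert lerp(1.0, 3.0, 0.25) == 1.5
    assert lerp(-0.5, 0.5, 0.5) == 0.0
    assert lerp(0.125, 0.375, 0.5) == 0.25

test_lerp_exact()
```

If the algorithm is later changed in a way that alters even the last bit, a test like this fails immediately, which a tolerance-based check might not.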

I’d rather teach students there’s subtlety here with some tradeoffs.


I have a relaxed rule for myself: if I’m using the == operator on floats, I must write a comment explaining why. I use == maybe once a year.


.125 + .375 == .5

You should be using == for floats when they're actually equal. 0.1 just isn't an actual number.


> 0.1 just isn't an actual number.

A finitist computer scientist only accepts those numbers as real that can be expressed exactly in finite base-two floating point?


Yes. A computer scientist should know how numbers are represented and not expect non-representable numbers in that format to be representable.

0.1 is just as non-representable in floating point as pi is, or as 100^100 is in a 32-bit integer.

Terminating dyadic rationals (up to limits based on float size) are the representable values.
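
One way to check representability mechanically, sketched in Python: compare the exact rational value of the decimal text with the exact rational value of the double it parses to. `is_exactly_representable` is a hypothetical helper, and it assumes the input is a plain decimal literal.

```python
import math
from fractions import Fraction

def is_exactly_representable(text: str) -> bool:
    """True if the decimal literal in `text` is stored without rounding as a double."""
    value = float(text)
    if not math.isfinite(value):
        return False  # overflowed to infinity, so certainly not exact
    return Fraction(text) == Fraction(value)

for literal in ("0.1", "0.125", "1.25", "0.3", "0.375"):
    print(literal, is_exactly_representable(literal))

# Every finite double is an integer over a power of two:
numerator, denominator = (0.1).as_integer_ratio()
print(denominator == 2**55)  # True
```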


I’m not sure if you got my joke but I referred to the mathematical philosophy, finitism.


That's essentially what you already do for integer arithmetic.


0.1 is of course a real number, but let A \in R the set of actual numbers... (/s)


The funny thing is, according to infinitists real numbers are not real. But I do like the concept of the set of actual numbers.


I don't know whether I'm an infinitist, but I personally think "real numbers" is the most ingenious marketing term created by mathematicians...


\infty is an actual number not in R


Should be \subset.


Are you saying that my students should memorize which numbers are actual floats and which are not?

    > 1.25 * 0.1
    
    0.1250000000000000069388939039


> Are you saying that my students should memorize which numbers are actual floats and which are not?

Yes.


Your students should be able to figure out if a computation is exact or not, because they should understand binary representation of numbers.


(Another small note.... 1.25 * 0.1 is not representable because 0.1 is not representable, so that doesn't divide by 10)

1.25 = 2^0 + 2^-2, so is representable.

0.125 = 2^-3, so is representable

1.25 / 10.0 = 0.125 so is representable. 10.0 = 2^3 + 2^1.

1.25 * 0.1 is not representable, because 0.1 is not representable, and those low order bits show up in the multiplication
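
A sketch that separates the exact product from the rounded one, assuming IEEE 754 doubles. The 28-digit figure quoted earlier matches what Python's `decimal` module prints at its default precision; that it was produced this way is an assumption. The exact product of the two stored doubles is 1/8 + 2^-57, which is not representable, but that excess is a quarter of the spacing between doubles just above 0.125, so the hardware multiplication rounds it back to 0.125.

```python
from decimal import Decimal
from fractions import Fraction

exact_product = Fraction(1.25) * Fraction(0.1)  # exact arithmetic on the stored doubles
rounded_product = 1.25 * 0.1                    # ordinary double multiplication

print(exact_product == Fraction(1, 8))                       # False
print(exact_product - Fraction(1, 8) == Fraction(1, 2**57))  # True
print(rounded_product == 0.125)                              # True
print(1.25 / 10.0 == 0.125)                                  # True
print(Decimal(1.25) * Decimal(0.1))  # 0.1250000000000000069388939039
```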


If they were taught what was representable and why they’d learn it quickly. And those that forget details later know to chase it down again if they need it. Making it voodoo hides that it’s learnable, deterministic, and useful to understand.


Tell them that they can only store integer powers of 2 and their sums exactly. 2^0 == 1. 2^-2 == .25. Then say it's the same with base 10. 10^-1 == 0.1. 1/9 isn't a power of 10, so you can't have an exact representation.


They shouldn’t “memorize” this per se, but it should take them only a few seconds to work out in their head.


I would argue that

    double m_D{}; [...]

    if (m_D == 0) somethingNeedsInstantiation();

can avoid having to carry around, set and check some extra m_HasValueBeenSet booleans.

Of course, it might not be something you want to overload beginner programmers with.


Yeah I'd argue that the beginner friendly version of the rule is probably "Never use exact == or != for floating point variables" and the slightly more advanced one is "Don't use it unless the value you are comparing to is the constant 0.0".


Before isnan() the Fortran test for NaN was (x .ne. x), assuming an IEEE 754 implementation.


I wish that (still) worked reliably, but it can unfortunately get one into trouble with some compilers and some optimization modes that assume that NaNs are undefined behavior.
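
The same self-comparison trick, sketched in Python for illustration. CPython does not apply fast-math style optimizations, so it behaves as IEEE 754 specifies here; the compiler caveat above applies to languages like C, C++, and Fortran. The function name is made up.

```python
import math

def is_nan_by_self_compare(x: float) -> bool:
    """NaN is the only float value that is not equal to itself."""
    return x != x

for value in (float("nan"), float("inf"), 0.0, 1.5):
    # The self-comparison and the library call should always agree.
    print(value, is_nan_by_self_compare(value), math.isnan(value))
```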


I have a linter in my code that shouts at me if I use exact equality for floats.

But I regret not making an exception for the constant zero, because it's one of the cases where you probably should accept it. I.e. if (f != 0.0) {...}


Zero shouldn't be an exception there. If f had been set from something like f = a - b, then you're in the same situation where f might be almost but not exactly zero.

The linter wouldn't know where f came from, so it should flag all floating point equality cases, and have some way that you can annotate it for "yeah this one is okay."


The thing is that if (f == 0.0) means "is f exactly zero so it's not initialized" 99 times for every one time it means "is f zero-ish because of a cancellation/degeneracy/whatever".

I just found that I have now annotated it for "yeah this one is ok" about 100 times, and caught zero cases where I meant to do a comparison to zero-or-very-nearly-so but accidentally wrote == 0.0.

So my conclusion is: I would have had less noise in my code with that exception in the linter, and the linter would have been equally useful.
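
For illustration only, a toy version of such a lint rule for Python source, built on the standard `ast` module. It is not the linter described in this thread: it only sees comparisons where one side is a float literal, the `allow_zero` switch models the proposed exception, and the `float-eq: ok` marker is a hypothetical annotation.

```python
import ast

SUPPRESS_MARKER = "float-eq: ok"  # hypothetical annotation, not from any real linter

def find_float_equality(source: str, allow_zero: bool = False) -> list:
    """Return sorted line numbers where == or != is applied to a float literal."""
    lines = source.splitlines()
    flagged = set()
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Compare):
            continue
        operands = [node.left, *node.comparators]
        for i, op in enumerate(node.ops):
            if not isinstance(op, (ast.Eq, ast.NotEq)):
                continue
            literals = [
                o.value
                for o in (operands[i], operands[i + 1])
                if isinstance(o, ast.Constant) and isinstance(o.value, float)
            ]
            if not literals:
                continue  # no float literal; variable types are unknown to this sketch
            if allow_zero and all(v == 0.0 for v in literals):
                continue  # the proposed exception for the constant zero
            if SUPPRESS_MARKER in lines[node.lineno - 1]:
                continue  # explicitly annotated as intentional
            flagged.add(node.lineno)
    return sorted(flagged)

sample = (
    "if f == 0.0:\n"
    "    pass\n"
    "if g != 0.1:\n"
    "    pass\n"
    "if h == 0.5:  # float-eq: ok\n"
    "    pass\n"
    "if k < 0.25:\n"
    "    pass\n"
)
print(find_float_equality(sample))                   # [1, 3]
print(find_float_equality(sample, allow_zero=True))  # [3]
```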


The idea is not to do it with values derived from arithmetic, but e.g. from measurements where a real zero is very unlikely and indicates something different.


Right, but that's not something a linter could know.



