Hacker News

> NaN/infinity values can propagate and cause chaos

NaN is the most misunderstood feature of IEEE floating point. Most people react to a NaN like they'd react to the dentist telling them they need a root canal. But NaN is actually a very valuable and useful tool!

NaN is just a value that represents an invalid floating point result. The result of any operation on a NaN is a NaN. This means that NaNs propagate from the source of the original NaN to the final printed result.

"This sounds terrible" you might think.

But let's study it a bit. Suppose you are searching an array for a value, and the value is not in the array. What do you return for an index into the array? People often use -1 as the "not found" value. But then what happens when the -1 value is not noticed? It winds up corrupting further attempts to use it. The problem is that integers do not have a NaN value to use for this.

What's the result of sqrt(-1.0)? It's not a number, so it's a NaN. If a NaN appears in your results, you know you've got a mistake in your algorithm or initial values. Yes, I know, it can be clumsy to trace it back to its source, but I submit it is better than having a bad result go unrecognized.
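
A quick sketch of the propagation (Python here for brevity; note that Python's math.sqrt raises on negative input, so the NaN is constructed directly):

```python
import math

nan = float("nan")  # stand-in for the result of an invalid operation like sqrt(-1.0)

# Every arithmetic operation involving a NaN yields a NaN,
# so the bad value survives all the way to the final result.
result = (nan + 1.0) * 2.0 - 3.0
print(math.isnan(result))  # True
```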

NaN has value beyond that. Suppose you have an array of sensors. One of those sensors goes bad (like they always do). What value do you use for the bad sensor? NaN. Then, when the data is crunched, if the result is NaN, you know that your result comes from bad data. Compare with setting the bad input to 0.0. You never know how that affects your results.
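
A sketch of the sensor case (the readings are made-up numbers), comparing NaN against the silent 0.0 substitution:

```python
import math

readings = [20.1, 19.8, float("nan"), 20.3]  # third sensor went bad

# The NaN poisons the aggregate, so the bad data cannot go unnoticed:
poisoned = sum(readings) / len(readings)
print(math.isnan(poisoned))  # True

# Zeroing the bad sensor instead gives a quietly wrong answer:
zeroed = [20.1, 19.8, 0.0, 20.3]
print(sum(zeroed) / len(zeroed))  # 15.05, and nothing flags it
```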

This is why D (in one of its more controversial choices) sets uninitialized floating point values to NaN rather than the more conventional choice of 0.0.

NaN is your friend!




> But let's study it a bit. Suppose you are searching an array for a value, and the value is not in the array. What do you return for an index into the array? People often use -1 as the "not found" value. But then what happens when the -1 value is not noticed? It winds up corrupting further attempts to use it. The problem is that integers do not have a NaN value to use for this.

You return (value, found), or (value,error) or Result<T>.

>NaN has value beyond that. Suppose you have an array of sensors. One of those sensors goes bad (like they always do). What value to you use for the bad sensor? NaN. Then, when the data is crunched, if the result is NaN, you know that your result comes from bad data. Compare with setting the bad input to 0.0. You never know how that affects your results.

You return an error and handle the error. You want to know whether a sensor is wonky or returning bad data.

Also, you technically should use a signalling NaN for "this is an error" and a quiet NaN for "this is just an impossible math result", which makes it even more error prone. Just return a fucking error if it is a function.

Sure, it's useful for expressions, but the handling should be there and then, and if a function can have an error it should return it explicitly as an error; otherwise you have different error handling for different types of functions.

> This is why D (in one of its more controversial choices) sets uninitialized floating point values to NaN rather than the more conventional choice of 0.0.

I'd like to see how much of the code actually uses that as a feature, rather than just setting it to 0.0 (or initializing it right away).


> You return (value, found), or (value,error) or Result<T>.

And this is great for environments that can support it, but as the levels get lower and lower, such safety nets become prohibitively expensive.

Take data formats, for example. Say we have a small device that records ieee754 binary float32 readings. A simple format might be something like this:

    record     = reading* terminator;
    reading    = float(32, ~) | invalid;
    invalid    = float(32, snan);
    terminator = uint(32, 0xffffffff);
We use a signaling NaN to record an error in the sensor reading, and we use the encoding 0xffffffff (which is a quiet NaN) to mark the end of the record.
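
For the curious, the claim about 0xffffffff can be checked in a few lines (a Python sketch; the quiet-vs-signaling convention assumed here is the usual "mantissa MSB set means quiet" one recommended by IEEE 754-2008):

```python
import math
import struct

raw = 0xFFFFFFFF
# Decode the 4 bytes as a little-endian IEEE binary32 float.
value = struct.unpack("<f", raw.to_bytes(4, "little"))[0]
print(math.isnan(value))  # True: all-ones is a NaN encoding

mantissa_msb = (raw >> 22) & 1  # bit 22 is the top mantissa bit in binary32
print(mantissa_msb)  # 1: set, so this NaN is quiet on most platforms
```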

If we wanted the validity signaling to be out-of-band, we'd need to encode it as such; perhaps as a "validity" bit preceding each record:

    record     = reading* terminator;
    reading    = valid_bit & float(32, ~);
    valid_bit  = uint(1, ~);
    terminator = uint(1, 1) & uint(32, 0xffffffff);
Now the format is more complicated, and we also have alignment problems due to each record entry being 33 bits. We could use a byte instead and lose a little to bloat:

    record     = reading* terminator;
    reading    = valid_bit & float(32, ~);
    valid_bit  = uint(8, ~);
    terminator = uint(8, 1) & uint(32, 0xffffffff);
But we're still unaligned (40 bits per record), which will slow down ingestion. We could fix that by using a 32-bit validity "bit":

    record     = reading* terminator;
    reading    = valid_bit & float(32, ~);
    valid_bit  = uint(32, ~);
    terminator = uint(32, 1) & uint(32, 0xffffffff);
But now we've doubled the size of the data format.

Or perhaps we keep it as a separate bit array, padded to a 32-bit boundary to deal with alignment issues:

    record      = bind(count,count_field) & pad(32, validity{count.value}, padding*) & reading{count.value};
    count_field = uint(32,bind(value,~));
    reading     = float(32, ~);
    validity    = uint(1, ~);
    padding     = uint(1, 0);
But now we've lost the ability to do ad-hoc appends (we have to precede each record with a length), and the format is becoming a lot more complicated.


> And this is great for environments that can support it, but as the levels get lower and lower, such safety nets become prohibitively expensive.

I don't think there is a low enough level here anymore. $0.30 micros, maybe? Why are you doing floats on them? $2-3 micros can already have 64 kB of flash. You're just inventing imaginary cases to support the "let's just do it worse and uglier" position.


Float-with-NaN is essentially (float-without-NaN, boolean error), in an efficient, hardware-optimized structure.
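
That equivalence can be made explicit (a sketch; `to_option` and `from_option` are made-up names):

```python
import math

def to_option(x: float):
    # Unpack the hardware-encoded option into (payload, is_valid).
    return (x, not math.isnan(x))

def from_option(value: float, ok: bool) -> float:
    # Pack (value, ok) back into a single machine word.
    return value if ok else float("nan")

print(to_option(3.5))           # (3.5, True)
print(to_option(float("nan")))  # (nan, False)
```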


> What do you return for an index into the array?

None in an Optional/Maybe, Error in a Result, Null in a nullable type, or an exception.


I concur. I would use an `Option`, `Result`, or enum for all of those scenarios, depending on the details.


No way. It'd be ridiculously slow to constantly check for NaN. Let it propagate, and then use a Result or Option at a higher level.

Adding a branch like that to low level number crunching would be bonkers.


Indexing an array isn't low-level number crunching, and whatever you produce from that is going to be the result of a branch unless you just don't bounds-check arrays and return some random bit of memory (or crash because you indexed out of program memory).

Honestly, ideally you have dependent types and indexing out of the array is a type error caught at compile time, but where it's not, "NaN" is not the most logical result, even for floats, because very often you will want to distinguish "the index given was out of the array" from "the index given was in the array and the value stored was a NaN"; special values IN the domain of the values stored in the array are fundamentally problematic for that reason.


I think the concept of NaNs is sound, but relying on them is fraught with peril, made so by the unobvious test for NaN-ness in many languages (i.e., "if (x != x)") and by the lure of "fast math" optimizations, which do things like assume NaNs aren't possible and then dead-code-eliminate everything guarded by an "x != x" test.

Really though, I'm a fan, I just think that we need better means for checking them in legacy languages and we need to entirely do away with "fast math" optimizations.
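
For reference, the unobvious test in question, as a sketch (the function name is made up; real code should prefer a standard isnan facility where one exists):

```python
def looks_like_nan(x: float) -> bool:
    # NaN is the only IEEE float value that compares unequal to itself.
    return x != x

print(looks_like_nan(float("nan")))  # True
print(looks_like_nan(1.0))           # False
```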


I call them "buggy math" optimizations. The dmd D compiler does not have a switch to enable buggy math.


> made so by the unobvious test for NaN-ness in many languages (ie, "if (x != x)")

Which languages do not have a function to test for NaN?

> and the lure of people who want to turn on "fast math" optimizations which do things like assume NaNs aren't possible and then dead-code-eliminate everything that's guarded by an "x != x" test.

This is not unique to NaNs. There are plenty of potential floating point problems if you enable those flags.


> Which languages do not have a function to test for NaN?

Both C and C++.

> This is not unique to NaNs. There are plenty of potential floating point problems if you enable those flags.

That's why I said in the second part that we need to do away with them.


C has had isnan() for over two decades. It’s technically a macro, but that doesn’t matter for this use case.


> Both C and C++.

Seems C++ only added it in C++11. Surprising.

> That's why I said in the second part that we need to do away with them.

Do away entirely with fast math calculations? That would be horrible. They exist for a very good reason: Some applications are too slow without them. Enabling subnormal numbers can really slow things down.

I'd wager that for the majority of programs written, the fast math is as good as the accurate math.


> I'd wager that for the majority of programs written, the fast math is as good as the accurate math.

I'd take that wager. I spent 13 years working on video games and video game technology, a domain where floating point performance is critical, and by and large we never used fast-math because of the problems it created for us.


This is surprising to me! Can you explain what problems you encountered? My (limited) understanding is that the main effect of fast-math is to turn off support for subnormals. Since subnormals are only used to represent extremely small values, I wouldn't expect them to have much effect in the real world.


fast-math can result in many things you may not expect, e.g. treating FP operations as being associative and distributive, which they aren’t.


Sure but when does that matter in video game math?


Happy to take that wager, because the majority of programs written are not video-game related :-)


I’m more curious about how it caused problems for you if you ever used it


> Seems C++ only added it in C++11

I added it to Digital Mars C and C++ in the early 1990s. Also full NaN support in all the math.h functions. AFAIK it was the first C compiler to do it.

It was based on a specification worked out by NCEG (Numerical C Extensions Group), a long forgotten group that tried to modernize C numerics support.

> the fast math is as good as the accurate math

I was more interested in correct results than fast wrong answers.


Yes, but most C++ applications prefer the faster wrong answers, because the wrong answers are not wrong enough to cause any bugs.


> Some applications are too slow without them.

I fully expect that the intersection between programs where floating point instructions are a speed bottleneck and programs where the numerical instability that fast-math can cause is not a problem is the empty set.


IIRC we used -ffast-math for numerical simulations of radar cross sections. Fast math was a decent perf win but didn't have any negative effects on the results.

Most programs don't care about the difference between ((a+b)+c) vs (a+(b+c)). Why bother with NaNs if you know you can't get them? Etc.


Something akin to fast-math is the default for some shading languages (although some have switched to fast-math-but-preserve-NaN-and-INF).


isnan() has been around since C99.


isnan is a macro, not a function


isnan is a macro in C, function in C++. But either way, it’s an abstraction that provides a standard test for NaNs in those languages.


I believe the reason he is pointing out this distinction is that if it is a macro, it can be optimized out, causing bugs.


Sure with -fast-math or similar compiler options without a way to also say “preserve NaNs”, that could happen with macros, but it can also happen with inline functions where the body is in a header file.


Not in a sound implementation. E.g. in macOS/iOS’s libm, we replace the isnan() macro with a function call if you compile with fast-math precisely to prevent this.


NaNs are incredibly helpful in numpy. Often I want to take a mean over a collection of lists, but the size of each list isn't constant. np.nanmean works great.
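
For anyone without numpy at hand, the idea behind np.nanmean is tiny (a pure-Python sketch of the same behavior):

```python
import math

def nanmean(xs):
    # Mean over the non-NaN entries; NaN can pad ragged rows
    # without skewing the result.
    vals = [x for x in xs if not math.isnan(x)]
    return sum(vals) / len(vals) if vals else float("nan")

rows = [[1.0, 2.0, float("nan")], [3.0, 4.0, 5.0]]
print([nanmean(r) for r in rows])  # [1.5, 4.0]
```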


> I think the concept of NaNs are sound, but I think relying on them is fraught with peril

Well, allegedly in D at least https://www.reddit.com/r/rust/comments/a1w75c/the_bug_i_did_...


> What's the result of 1.0/0.0? It's not a number, so it's a NaN

It's not often that I get to correct Mr D himself, but 1.0/0.0 is...


You're right. I'll fix it.


What is 1.0/0.0 in D? A DivideByZero exception of some sort?


Infinity.


I often find myself wondering wtf? when I see discussions like this.

root -1 is i, and we get into complex numbers. If I ever end up needing to deal with the square root of a number that might be -1, then I'll do it the old-fashioned way: test for that and sort it out. What is the problem here? root -1 is well defined!

Equally (lol), 1.0/0.0 and 1/0 can be tricky to handle, but not beyond the wit of a person. In those cases it is syntax and a ... few other things. Is 1.0 === 1.00, etc.? Define your rules, work in the space that you have defined, and all will be fine.

It is complete nonsense to require "real"-world maths to be correct in your program. Your program is your own little world. If it needs to deal with math-like concepts, then how that works is down to you. In your programming language you get some methods and functions, and they often have familiar names, but they do not work like the "real thing" unless you make them.

numpy and the like exist for a very good reason: A general purpose programming language will cock up very soon.

Do basic and very simple arithmetic in your chosen language if it is well formed, switch to sophisticated addons as soon as is practicable.


> root -1 is well defined!

Yes, if you're using complex number types. Nope if you're using reals:

    double x = sqrt(-1.0);


> This means that NaNs propagate from the source of the original NaN to the final printed result.

An exception would be better. Then you are pointed immediately at the first problem, instead of having to trace the observed problem back to its source.


Yeah, or using a type system that lets you avoid making NaN a float to both force the case to be handled and prevent NaN from leaking into the math.


> An exception would be better.

It depends on the context. Sometimes NaNs are expected and to be ignored. Sometimes they signal a problem.


Definitely. Unfortunately, language implementations that guaranteed exceptions were not in wide use at the time. Also, to have a chance at being implemented on more than one CPU, it had to work in C and assembly.


You can get an exception! Just enable trapping of the invalid-operation exception, compile all of your code so that it's FP-exception aware, and you get the exception.


> This is why D (in one of its more controversial choices) sets uninitialized floating point values to NaN rather than the more conventional choice of 0.0.

This is definitely something I like about D, but I'd much prefer a compiler error. double x; double y = x+1.5; is less than optimal.


I don't find this convincing.

> What do you return for an index into the array?

An option/maybe type would solve this much better.

> Yes, I know, it can be clumsy to trace it back to its source

An exception would be much better, alerting you to the exact spot where the problem occurred.


> An option/maybe type would solve this much better.

NaNs are already an option type, although implemented in hardware. The checking comes for free.

> An exception would be much better

You can configure the FPU to cause an Invalid Operation Exception, but I personally don't find that attractive.


The missing bit is language tooling. The regular floating-point APIs exposed by most languages don't force handling of NaNs.

The benefit of the option type is not necessarily just the extra value, but also the fact that the API that forces you to handle the None value. It’s the difference between null and Option.

Even if the API were better, I think there's value in expressing it as Option<FloatGuaranteedToNotBeNaN>, which compiles down to using NaNs for the extra value, keeping it similar to other Option specialisations so you don't have to remember that this special primitive type has option built in.


Yeah. You should be very explicit about it. Certainly not treat it like, “ooh, here are some free bits that I can use to tag things in ad hoc ways (like -1 for missing index)”.

https://internals.rust-lang.org/t/pre-rfc-nonnan-type/8418


> NaNs are already an option type, although implemented in hardware

The compromise with this is that it makes it impossible to represent a non-optional float, which leads to the same issues as null pointers in c++/java/etc.

The impacts of NaN are almost certainly not as bad (in aggregate) as `null`, but it'd still be nice if more languages had ways to guarantee that certain numbers aren't NaN (e.g. with a richer set of number types).


> The impacts of NaN are almost certainly not as bad (in aggregate) as `null`, but it'd still be nice if more languages had ways to guarantee that certain numbers aren't NaN (e.g. with a richer set of number types).

The problem with that is that to guarantee arithmetic does not result in a NaN, you need to guarantee that 0 and infinity are not valid values, and those values can still arise from underflow/overflow of regular computation. Basically, there's no subset of floating-point numbers that forms a closed set under +, -, *, or / that doesn't include NaN. So you can define FiniteF32 (e.g.), but you can't really do anything with it without the result becoming a full-on float.
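
The escape from any "finite-only" subset is easy to trigger (Python floats are IEEE binary64):

```python
import math

a = 1.7e308   # finite, close to the binary64 maximum
b = a * 10.0  # overflow: two finite inputs, infinite result
print(b)      # inf

# And once infinity appears, NaN is one subtraction away:
print(math.isnan(b - b))  # True: two non-NaN values produce a NaN
```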


As far as I'm aware, there's no equivalent to a stack trace with NaN, so finding the origin of a NaN can be extremely tedious.


I've never found it to be particularly difficult.

The extremely difficult problems to find are uninitialized data and threading bugs, mainly because they appear and disappear.


Good points!


Exceptions are actually part of floats, they're called "signalling nans".

So technically Python is correct when it decided that 0.0/0.0 should raise an exception instead of just quietly returning NaN. Raising an exception is a standards-conforming option.

https://stackoverflow.com/questions/18118408/what-is-the-dif...


In practice, I've found signalling NaNs to be completely unworkable and gave up on them. The trouble is they eagerly convert to quiet NaNs, too eagerly.


I am firmly of the belief that sNaNs were a mistake in IEEE 754, and all they really serve to do is create hard trivia questions for compiler writers.


Technically, I guess it should return an sNaN (so the app can check for it if it wants to handle it differently) and raise an exception if the sNaN is used in a (non-comparison) operation.


> > What do you return for an index into the array?

> An option/maybe type would solve this much better.

Only if optional<float> is the same size as float.


Only if you're working with pure enough functions, I think? Otherwise I might be polluting state with NaN along the way, and therefore need to handle errors the whole way. So I may prefer to catch NaN up front and throw an error.

Loud and clear on the other point. I work with ROS a lot and the message definitions do not support NaN or null. So we have these magic numbers all over the place for things like “the robot does not yet know where it is.” (therefore: on the moon.)


> The result of any operation on a NaN is a NaN.

That's not true! maxNum(nan, x) = x.


The standard for comparisons with NaN is to always return false [0] (the one exception being !=, which always returns true).

So that entirely depends on how max() is implemented. A naive implementation of max() might just as easily return NaN for that. Or if max(NaN, x) is x, then it may give NaN for max(x, NaN).

Note that comparisons always returning false also means that sorting an array of floating point values that contains NaNs can easily break a comparison-based sorting algorithm! (I've seen std::sort() in C++, for example, crash because NaNs break the strict weak ordering [1] requirement.)
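
Both effects are easy to reproduce (a Python sketch; `sorted` relies on `<` the same way std::sort does, though CPython fails by mis-sorting rather than crashing):

```python
nan = float("nan")

# Every ordered comparison involving NaN is false, so a naive max()
# answers differently depending on argument order:
def naive_max(a, b):
    return a if a > b else b

print(naive_max(nan, 1.0))  # 1.0  (nan > 1.0 is false, so b wins)
print(naive_max(1.0, nan))  # nan  (1.0 > nan is also false)

# The same always-false comparisons quietly break sorting:
xs = sorted([3.0, nan, 1.0, 2.0])
finite = [x for x in xs if x == x]  # drop the NaN (x == x is false only for NaN)
print(finite == sorted(finite))     # False: the finite values came out unordered
```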

[0] https://en.wikipedia.org/wiki/NaN#Comparison_with_NaN

[1] https://en.cppreference.com/w/cpp/named_req/Compare


maxNum is (or was; as another poster mentioned, it is now deprecated) a specific function defined in the IEEE specification. This isn't a language-level comparison function but a specific operation on floating-point values that has to behave the way I described.

maximumNumber (and a few similar ones) replace it and behave similarly.



Looks like maximumNumber and maximumMagnitudeNumber have the same relevant property (NaN is not viral).



