I did something quite similar when I was finding worst-case inputs for complex multiplication rounding errors. Here I had to search over tuples of four floating-point values; but I could ignore factors of two (since they don't affect relative rounding errors) and I had some simple bounds on the inputs such that I had only a hundred billion tuples left to consider -- a small enough number that it took a few hours for my laptop to perform the search.
Replicating this for double-precision inputs would have been impossible, of course; but fortunately, knowing which expression produced the worst-case single-precision rounding error made it easy to anticipate what the worst cases would be in general -- and, as usual, knowing the answer made constructing a proof dramatically easier.
> Round to nearest even is the default IEEE rounding mode. This means that 5.5 rounds to 6, and 6.5 also rounds to 6.
Wat. I had never even heard about this, but the more I read and think about it, the more this rule makes sense to me.
How did I miss this - is this commonly taught in school and I just did not pay attention? It seriously worries me that I don't even know how to round properly - I always thought of myself as a math-and-sciency guy...
I learned about this in HS chem class and it's always stuck, even if people don't use it in informal settings. Also called "unbiased rounding" or "bankers' rounding" [1]
It's rarely taught in school. Most people know that you round .5 up, but that introduces a slight upward bias, whereas rounding ties to the even number produces an unbiased result. It's also the way you round in the financial sector.
I don't believe the round-half-to-even method will give you a perfectly even distribution. Benford's law can be generalized to the second, third, ... digit, and although the distribution is not as skewed as for the first digit [1], you'd still be rounding down a bit more often than rounding up.
Benford's law applied to rounding floats is splitting hairs, really - but it goes to show how even "simple" things like rounding can get really difficult once you worry about them for long enough.
Benford's law only applies to certain sets of numbers.
Not all numbers display Benford's law. Therefore Benford's law should not be considered when talking about rounding unless there's a reason to think that it applies to your particular set of numbers.
It would be interesting to know what sets of numbers do not follow Benford's law.
Check Knuth's discussion of it in "The Art of Computer Programming", Section 4.2.4, where he talks about the distribution of floating-point numbers. Benford's research took numbers from 20,299 different sources. Additionally, the phenomenon is bolstered by noticing, in the pre-calculator days, that the front pages of logarithm books tended to be more worn than the back pages.
The whole discussion covers about four pages, and contains what I think of as adequate mathematics demonstrating the point.
The interesting thing about Benford's law is how widely applicable it is. Most sets of numbers you come across in your life are likely to follow it, or at least have leading 1s be more common than leading 9s. As such we should definitely be considering it when trying to decide which rounding rule to adopt in the general case.
I'll try to make the parent's point a different way. This is about rounding. That's about the digits with the least significance. A good rounding scheme wants to see a uniform distribution there in the two least significant digits. Benford's law breaks down after four digits pretty much completely. That's the over-thinking it aspect.
The financial sector in many (all?) EU countries is required to round all .5 values up. If you use banker's rounding, you might get into trouble with the tax office. So always check the local regulations for the rounding mode.
Percentages are the best example of why it's so useful. For value A at 41.5% and value B at 58.5%, their sum is exactly 100%. Rounding both halves up would make them 42% and 59% respectively, 101% in total. Round half to even gives 42% and 58%, which still sum to 100% as intended.
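For anyone who wants to see it in action, here's a minimal C++ sketch (mine, not from the article) using std::nearbyint, which honours the current rounding mode - round-to-nearest, ties-to-even by default:

    #include <cmath>
    #include <cstdio>

    int main() {
        // std::nearbyint honours the current rounding mode, which is
        // round-to-nearest, ties-to-even by default.
        std::printf("%g %g\n", std::nearbyint(5.5), std::nearbyint(6.5)); // 6 6

        // The percentages example: 41.5 -> 42 and 58.5 -> 58, summing to 100.
        double a = std::nearbyint(41.5), b = std::nearbyint(58.5);
        std::printf("%g + %g = %g\n", a, b, a + b);                       // 42 + 58 = 100
        return 0;
    }

Note that 5.5, 6.5, 41.5 and 58.5 are all exactly representable in binary, so the ties here are genuine ties.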
I did some work for my third year dissertation on performing exhaustive testing on floats. Worked well enough, but I was never convinced that it was an especially effective way of testing.
If your test oracle is code itself, then it's not unlikely you'll make the same bug twice. I can't find the paper I'm after, but NASA did some work here.
If your test oracle isn't code, you have the problem of working out the correct outputs for the entire function space (at which point maybe you'd be better off with some kind of look-up table anyway)
The next problem is your input combinations grow much faster than you are able to test. Most functions don't just work on one float (which is very feasible) but they might work on three or four streams of input, and keep internal state.
The moment you have these combinations you run into massive problems of test execution time, even if you can parallelise really well.
Butler and Finelli did some work http://www.cs.stonybrook.edu/~tashbook/fall2009/cse308/butle... also at NASA investigating if doing a sample of the input space was worthwhile, but their conclusion was that it isn't helpful, and that any statistical verification is infeasible.
In my report I ended up concluding that it isn't a particularly useful approach. It seems that following good engineering practice has better returns on your investment.
The problem with this is that there are only a select few cases where a machine can (or should) determine the expected value for a test assertion.
- My function does something complicated: chances are you will be replicating your function in your test code. Hand-code assertions.
- My function can be tested using a function that is provided by the runtime/framework: great! Remove your function and use the one in the framework/runtime instead.
- My function can be tested using a function that is provided by the runtime/framework, but is written for another platform or has some performance trick: test as described in the article.
- My function should never return NaNs or infinities: test as described in the article.
> And, you can test every float bit-pattern (all four billion!) in about ninety seconds. [...] A trillion tests can complete in a reasonable amount of time, and it should catch most problems.
Your developers will never run said tests. Unit tests are supposed to be fast. However, these exhaustive tests could be run on the CI server.
The code in the article is about optimisation, in which case there should always be a slow reference implementation. Never try to write highly optimised code on the first go: make it work, make it correct, then make it fast.
Also, these aren't unit tests; they're property tests.
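For reference, the exhaustive comparison being discussed looks roughly like this - my own minimal sketch, not the article's code, with fast_ceil as a hypothetical stand-in for whatever optimised routine you'd test against the slow-but-obvious reference:

    #include <cmath>
    #include <cstdint>
    #include <cstdio>
    #include <cstring>

    // Hypothetical optimised routine under test (a stand-in here).
    float fast_ceil(float x) { return std::ceil(x); }

    // Slow but obviously correct reference implementation.
    float reference_ceil(float x) { return std::ceil(x); }

    int main() {
        uint64_t failures = 0;
        // Walk every one of the 2^32 float bit patterns.
        for (uint64_t i = 0; i <= 0xFFFFFFFFull; ++i) {
            uint32_t bits = static_cast<uint32_t>(i);
            float x;
            std::memcpy(&x, &bits, sizeof x);

            float got = fast_ceil(x);
            float want = reference_ceil(x);

            // Compare bit patterns so -0.0f vs 0.0f is caught; treat any NaN as equal.
            uint32_t gb, wb;
            std::memcpy(&gb, &got, sizeof gb);
            std::memcpy(&wb, &want, sizeof wb);
            if (gb != wb && !(std::isnan(got) && std::isnan(want)))
                ++failures;
        }
        std::printf("failures: %llu\n", static_cast<unsigned long long>(failures));
        return 0;
    }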
> The code in the article is about optimization [...]
Very true but that really should have been stated up-front by the author. Next thing you know "some guy at Valve said all unit tests with floats should test all 4 billion cases." You would be surprised at how much people take away from a skim over something - e.g. a poor understanding of Apps Hungarian created the monster that is Systems Hungarian.
> in which case there should always be a slow reference implementation.
Good point - however, keep in mind that bug fixes would need to be made in two places. Not only does that mean the maintainer needs to grok two pieces of code, they may also have to determine whether the original code is providing bad truths. Hopefully there wouldn't be many places where this type of testing is required, as the low-hanging fruit is often enough. Keep in mind, though, that certain industries do optimize to ridiculous degrees - and in some of those industries optimization can happen first, depending on the developer's [well-seasoned] intuition, and can be a frequent habit. The author's industry is a brilliant example of where the typical order of optimization (make it work, make it correct, make it fast) takes a back seat - life is incredibly different at 60/120 Hz.
> Good point - however, keep in mind that bug fixes would need to be made in two places.
Only if the bug actually occurs in both implementations. This does happen sometimes, so it's indeed good to keep in mind - but my experience experimenting with these kinds of comparison tests hasn't borne this out as the more common case. YMMV, of course.
> Not only does it mean the maintainer needs to grok two pieces of code, but they may have to determine if the original code is providing bad truths.
I find automated comparison testing of multiple implementations to be extremely helpful in grokking, documenting, and writing more explicit tests of corner cases by way of discovering the ones I've forgotten, frequently making this easier as a whole, even with the doubled function count.
> Hopefully there wouldn't be many places where this type of testing would be required as the low hanging fruit is often enough: however keep in mind certain industries do optimize to ridiculous degrees - and in some of those industries optimization can happen first depending on the developers [well-seasoned] intuition, and can be a frequent habit.
I don't trust anyone who relies on intuition in lieu of constantly profiling and measuring in this sea of constantly changing hardware design - I've yet to see those who've honed worthwhile intuitions drop the habit :). Even in the lower hanging fruit baskets, it's frequently less a matter of optimizing the obvious and more a matter of finding where someone did something silly with O(n^scary)... needles in the haystack and where you'd least expect it.
That said, there are times when making it correct just doesn't matter. Sacrificing accuracy for speed can be an extremely powerful technique, as long as perfect accuracy isn't required (e.g. some graphics applications).
In some cases, yes. In others, you can only manage that on a micro-level.
An example that comes to mind: at some point I was working on an optimized paintbrush tool that was to give perceptually similar results to drawing the brush image repeatedly along a path. I replaced the individual draws with convolution, but there was no way to make this exactly correspond with the "over" compositing operator that it was replacing. So I used a post-processing step to adjust it.
In this case, the divergence from the reference implementation was so great that you couldn't really compare them. But the results looked good, and the performance gains were tremendous. The only way to compare it would have been to create a new reference implementation that imitated the new algorithm, which would have told me very little. Instead, I did lots and lots of real world testing to ensure there were no missed edge cases.
One objection to "test everything" is to play telecom signal-to-noise-ratio type games with error rates. Attempting to verify something to 1 part in 2^64 implies your memory access error rates and disk error rates are already far under 1 in 2^64, which is quite expensive to achieve. There are of course solutions to those problems, but they may be more expensive and complicated than you want.
With non-ECC memory and a simplistic brute-force algorithm it doesn't make much sense to talk about doing 2^64 tests, because the memory errors will swamp any genuine failures you detect - and depending on whose numbers you pick for the bit-error rate (1e-10? 1e-15?), you can get whatever predetermined result you want (LOL).
At the 1-in-2^32 range it's commodity desktop hardware territory. Pulling off 1-in-2^64 hardware error rates is way harder.
Are there any estimates on how much on-demand capacity Amazon typically has? If I tried to spin up instances as fast as the API will let me, how long until I run out of quickly provisioned instances? Can I get thousands on short notice? Tens of thousands? Millions?
You would need to ask them very nicely ahead of time for permission to spin up that many instances. The default limits are rather a lot lower than that. 20 instances per global region. Going to need a fair few more than that.
Actually, 2^53 different mantissas, plus a few different exponents, depending on the function under test. Also, there's a huge number of NaNs, all of which are equivalent. For ceil/floor, it wouldn't make sense to test exponents larger than 54.
There are a huge number of double-precision NaNs in the absolute sense -- there are 2^53-2 of them. But they are a small portion of all of the doubles (roughly 1/2048), so omitting them from the tests does not help significantly with saving time.
And, skipping testing of large numbers and NaNs can lead to missing bugs. One of the fixed ceil functions handled everything except NaNs correctly. Implementations that failed on exponents beyond 63 or 64 are easy to imagine.
But, testing doubles requires compromises. Testing of special values plus random testing plus exhaustive testing of the smallest ten trillion or so numbers should be sufficient -- better than the current status quo at least.
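A rough sketch of what that compromise could look like in code (my own illustration; fast_floor/reference_floor are hypothetical stand-ins, and the iteration counts are placeholders to scale to taste):

    #include <cmath>
    #include <cstdint>
    #include <cstdio>
    #include <cstring>
    #include <limits>
    #include <random>

    // Hypothetical routine under test and its slow reference (stand-ins here).
    double fast_floor(double x) { return std::floor(x); }
    double reference_floor(double x) { return std::floor(x); }

    uint64_t failures = 0;

    void check(double x) {
        double got = fast_floor(x), want = reference_floor(x);
        if (std::isnan(got) && std::isnan(want)) return;   // any NaN counts as a match
        uint64_t gb, wb;
        std::memcpy(&gb, &got, sizeof gb);
        std::memcpy(&wb, &want, sizeof wb);
        if (gb != wb) ++failures;
    }

    int main() {
        // 1. Special values.
        const double inf = std::numeric_limits<double>::infinity();
        for (double x : {0.0, -0.0, inf, -inf,
                         std::numeric_limits<double>::quiet_NaN(),
                         std::numeric_limits<double>::denorm_min(),
                         std::numeric_limits<double>::max()})
            check(x);

        // 2. Random bit patterns (hits NaNs, denormals, large exponents).
        std::mt19937_64 rng(12345);
        for (long long i = 0; i < 1000000000LL; ++i) {     // scale to taste
            uint64_t bits = rng();
            double x;
            std::memcpy(&x, &bits, sizeof x);
            check(x);
        }

        // 3. Exhaustive over the smallest bit patterns, both signs
        //    (the parent suggests on the order of ten trillion).
        for (uint64_t bits = 0; bits < 10000000000ull; ++bits) {
            double x;
            std::memcpy(&x, &bits, sizeof x);
            check(x);
            check(-x);
        }

        std::printf("failures: %llu\n", static_cast<unsigned long long>(failures));
        return 0;
    }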
A float is a significand and an exponent. When adding two floats, the result takes (at least) the larger of the two exponents, because larger values matter more than smaller ones. So the smaller float essentially gets rewritten as a small significand with the large exponent, losing its low bits; in the extreme case that significand becomes zero. So when adding a very large number and a very small number, the result will be the same as adding zero to the very large number.
If you add numbers largest to smallest, then thanks to this, at a certain point you are simply doing $SUM = $SUM + 0 over and over.
But if you start with the smallest numbers, they have a chance to add up to a value with a magnitude high enough to survive being added to the largest numbers.
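A tiny demonstration of the effect (my own example, not from the article): summing ten million 1.0f values onto a large float in each order gives visibly different answers.

    #include <cstdio>

    int main() {
        const int n = 10000000;     // ten million small values
        const float big = 1e8f;     // one large value; ulp(1e8f) is 8

        // Largest first: each 1.0f is below half an ulp of the running sum,
        // so every single addition rounds straight back to 1e8f.
        float desc = big;
        for (int i = 0; i < n; ++i) desc += 1.0f;

        // Smallest first: the small values accumulate exactly (1e7 < 2^24)
        // and only then meet the large value.
        float asc = 0.0f;
        for (int i = 0; i < n; ++i) asc += 1.0f;
        asc += big;

        std::printf("largest first:  %.1f\n", desc);  // 100000000.0
        std::printf("smallest first: %.1f\n", asc);   // 110000000.0
        return 0;
    }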
Your comment makes me feel sad. It just reflects a whole mindset that makes me feel like an oppressed minority somehow.
It sucks that most people don't really care about performance any more. It sucks when people casually dismiss people who do care about performance with arguments like "premature optimization" and so on.
It kind of sucks that the web is taking over because it gives an excuse to use slow languages. It almost makes me feel sick to think that Javascript stores all numbers as doubles.
So you know what? It makes me feel happy to use the most efficient type that will work. I like my single precision floats.
I can die happy knowing that I didn't waste everyone's collective network, disks and CPUs with needless processing. Can't it be good to save just for the sake of saving?
So you want a practical reason do you?
I shaved over 2 gigabytes off the download size of our product simply by using more efficient data types. Half-size (16-bit) floats are good enough for some data.
And let me tell you about a data type that we used to use called a char. 8 bits would you believe! Turns out those x86 CPUs that are sitting there a few VMs deep still support those things.
> It sucks that most people don't really care about performance any more. It sucks when people casually dismiss people who do care about performance with arguments like "premature optimization" and so on.
Why? Really. To my mind it's good that we no longer have to care about these implementation details, and can concentrate on writing code that expresses what we mean. Like writing on paper rather than scratching letters into tree bark.
What are these other things? This idea that other stuff is more interesting comes up on xkcd often enough.
I have a hard time finding programs that don't push the hardware interesting - media programs and so on interest me more than anything else. There's the top 1% of programs that people actually use and then there's a huge wasteland of stuff nobody uses. That top 1% usually has a lot of attention paid to its performance.
I can think of websites that don't really demand computer power that are interesting but only as businesses - not as programs themselves. And they have performance requirements in the background.
The realm of interesting things that are not squeezing as much performance as possible out of your code is enormous.
There's UX to be improved, there's robots to be built, and there's an almost limitless trove of science that could be applied to everyday life, or to business, or to art. There's improving human input methods. There are whole new genres of games to be invented. There's research into algorithms, where the objective is measured in big-O, not milliseconds, because we won't have hardware to make them practical for another six years.
Sometimes, you need micro-optimized code to do these things well. Often you don't.
Don't get me wrong: I love digging into a hot spot and optimizing the hell out of it. But I'm going to have to agree with lmm that it's a good thing that we have the option of whether or not to think about it.
And if a super smart compiler were released tomorrow that could take any program you threw at it and give you the absolute optimal implementation, the part of me that would be sad for the loss of that intellectual pursuit would be much smaller than the part of me that would be crazy excited. Of course, there would still be lossy optimization to explore, but even if it could magically solve that, I'd still never run out of interesting problems to play with.
Oh don't get me wrong. I'm not saying it's a bad thing that there's programs you can write where you don't need to care about performance. I am saying that they aren't really that interesting to me.
All of the things you suggest seem to me to have important performance requirements, maybe not as hard as micro-optimized, but still they are there.
In the few examples I can think of where the performance of a popular product's code is not up to scratch, you hear about it from users complaining - and a better-performing product would have a competitive advantage.
I don't think that people who go after delivering the better performing software should be derided as being dullards pursuing something less interesting.
>All of the things you suggest seem to me to have important performance requirements, maybe not as hard as micro-optimized, but still they are there.
Sure, but in most of them, those performance requirements can be met simply by following a reasonably idiomatic style in any of a number of popular high level languages.
But I don't see anyone deriding people who find performance tuning interesting as dullards. You find that more interesting than other problems, while many people find the other problems more interesting.
We currently have plenty of use for both kinds of interest, and will for the foreseeable future. Compilers that completely magick away the need for optimization are a long way off, and there are plenty of applications where performance is paramount. But there are plenty of other applications where the performance you get from using high level tools is good enough, and choices like "float" or "double" are pretty trivial.
And high level tools are allowing more and more for you to make different choices when it really is important. In JS, for example, you have things like asm.js, and if you need the efficiency of single precision floats, you have Float32Array. Soon, we'll even have abstractions for vector instructions in the browser. And of course, if you really need extra horsepower for something on the server side, you can always pipe in and out of a carefully tuned native program.
But even if your end goal is efficiency, it actually makes sense for a language to use a less efficient default so that you don't have to think about things like precision requirements. The time you save on the 95% of that code that takes up 10% of your resources is time you can spend optimizing the 5% of the code that takes up the other 90%.
I think we agree if a reasonably idiomatic style gives the necessary performance then it should be pursued, not dismissed.
I agree with what you write, what I disagree with is this above:
> It sucks that most people don't really care about performance any more. It sucks when people casually dismiss people who do care about performance with arguments like "premature optimization" and so on.
"Why? Really. To my mind it's good" etc etc
I also disagree with the "most of us" and "more interesting" angle, also exhibited here:
https://xkcd.com/409/
I think a dismissive attitude to performance is a poor attitude, I'm not saying that high level tools should not be used.
I don't know if most people find it more interesting or not. I do think there's value in both interests.
I also don't think that arguments against premature optimization are a dismissive attitude toward performance. Avoiding premature optimization is about picking your battles. In an ideal world, you could optimize everything.
But optimized code takes longer to write, and longer to maintain, and in this subideal world, we have limited time to spend on code. The point of avoiding premature optimization isn't to excuse low performance code. It's to give you time to optimize the code that makes the biggest difference.
My first job was a system for building interactive SMS services, balancing power and expressiveness with user-friendliness, much like designing a programming language. But since the mobile phone system itself adds so much latency, performance was rarely an issue.
My third job was an insurance trading platform. Most of the interesting stuff was around the representation of contracts, making it possible to manipulate them by computer, and compare different possibilities, but still having them look like the paper contracts users were used to.
My second job was at a music streaming service, and was mostly about doing simple things in ways that performed well, or scaled up to millions of users. I found this a lot less interesting or creative than my first or third job. Of course, that could just be personal taste.
OK so I think I'm not alone in saying most people would find your second job more interesting than the first or third. Certainly Joe public would. I guess I'm more interested in the end result than the journey to get there.
> It almost makes me feel sick to think that Javascript stores all numbers as doubles.
That's not actually true. JS engines are highly optimized. For example, anything that fits in a 32-bit int (and is an integer, of course) is stored as such, at least in V8.
It's not even about performance. It's about actively thinking about what data you'd be processing. Why use doubles when singles are okay? I'd sure not be using 64 bit ints just because "it's bigger" when a normal 32 bit would be just fine.
but you didn't consider the other case, which IMHO is more common - where you thought 1 char ought to be enough for anybody... until it isn't, and it's impossible to change it into a 32-bit integer...
Only when requirements have changed significantly have I had to change the datatype: either the input resolution is increased significantly, or the amount of data is several magnitudes larger than originally specified, or higher precision is required. I tend to err on the side of selecting something larger if I'm uncertain.
Consider a database: if the expected load is 10 inserts per second and the application lifetime is expected to be 10 years, an unsigned 32-bit index is enough (10 × 3.15×10^7 seconds/year × 10 years ≈ 3.15 billion rows, against 2^32 ≈ 4.29 billion). However, that leaves room for only about 36% growth, so 64-bit it is, unless performance is important. In that case, watch the index and project when it will run out of space.
The point is, you consider the cases, and make an educated choice, this means reasoning about it both ways.
Because of rounding issues: something that looks like a non-problem for a single-precision number will be a non-problem squared for a double-precision one. Thus, if I'm wrong (or become wrong after a change in scope), the program will still work.
Sorry for making you feel sad. Good points all around, here, and in the other comments about GPUs. Regarding CPU, though, my understanding is that on a 64 bit machine, 32 bit arithmetic is no faster than 64 bit arithmetic. Is that incorrect?
Take SSE2 instructions as an example (every x64 processor supports these; the old x87 FPU has been obsolete for over 10 years). You can either calculate 4 floats per instruction or 2 doubles per instruction. Similar considerations also apply to ARM.
Also, even if you might want to use doubles as the intermediate representation, you still might wish to use floats for storage. E.g. read a float, convert it to double, calculate stuff, write back a float.
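Something like this (my own sketch of the float-storage/double-accumulator pattern, not anyone's production code):

    #include <cstdio>
    #include <vector>

    // Store the samples as floats (half the memory and bandwidth of doubles),
    // but accumulate intermediates in double to limit rounding error.
    float mean(const std::vector<float>& samples) {
        double sum = 0.0;                     // widened accumulator
        for (float s : samples)
            sum += s;                         // float -> double conversion is exact
        return static_cast<float>(sum / samples.size());   // narrow only at the end
    }

    int main() {
        std::vector<float> v(10000000, 0.1f);
        std::printf("mean = %f\n", mean(v));  // stays close to 0.1
        return 0;
    }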
Some work I was doing over the summer indicates that the optimal datatype depends on the algorithm you're using.
qsort() can get 15% better performance with uint64_ts than uint32_ts (sorting identical arrays of 8 bit numbers represented differently).
On the other hand, a naive bubble sort implementation was managing 5% better performance with 16 bit datatypes over any of the others.
If you start measuring energy usage as well, it becomes even stranger - using a different datatype can make it run 5% faster, while using 10%+ less energy (or in the qsort() case, take less time, but use more energy).
qsort strikes me as a bit of a crappy benchmark because of the type erasure. Your compiler likely isn't inlining or applying vectorisation. Vectorisation would likely benefit 32bit floats on a 64bit platform more than doubles.
I'd tested a number of sort algorithms (bubble, insertion, quick, merge, counting), so also testing the sort function from stdlib seemed like a logical continuation.
It was done rather quickly, so there wasn't time to properly investigate, merely look at the numbers and go "that's strange".
I'm not sure. This was a fairly quick experiment on an Ivy Bridge processor where the actual results were of secondary concern at the time (it was proof-of-concepting a separate tool).
If you look on arXiv.org, there are publications looking more at alignment on Cortex and XCore architectures. There's probably work relating to x86, though the longer instruction pipeline makes it more complicated to reason about than the simpler architectures.
In engineering applications, it's perfectly sensible to use float for world modelling, which gives you precision of about 1 in 10 million. Then you can use double for intermediate results and lose less precision in a long calculation, producing a more accurate result (rounded to float) in the end.
Also, if the dynamic range of your data allows it [and for physical applications it usually does], you can first rescale it to [0,1] in order to further increase precision. [It's the interval where floats are most dense.]
Having worked for a few years now in computational geometry, I'm more and more convinced that "float by default" is the way to go. Doubles should be used for intermediate computations.
Amateur. Scale to [0,FLT_MIN*2]. There they are truly most dense, and the density is uniform across the entire range. It's almost like using 24-bit integers.
Depends on the application. If I'm processing audio in a PBX, 64-bit floating point precision isn't going to do me much good when my source is a noisy 8-bit µ-law stream. All I'd gain is twice the demand for memory space and bandwidth—not a cost I'm willing to pay when I gain zero developer productivity from it.
Even CD-quality audio only has 16 bits per sample. My digital camera only records 12 bits per pixel, and that's before the image is converted to a more reasonable 8-bit-per-channel format.
Single precision is still overkill in a number of applications.
When doing arithmetic you can lose as much as 1 bit of precision per floating-point operation. So it pays off to use more precision for calculations even when the storage format doesn't persist it.
Besides, I remember reading somewhere that 32-bit float instructions aren't actually any faster than 64-bit ones (except for the cache effects, of course...). They might even be slower because the FPU has to extend them to 64-bit before performing the operation.
The time when the actual floating point instructions were a bottleneck are long gone.
Nowadays when you do computations with single float/double values in your registers they are equally fast.
The biggest difference comes from memory bandwidth and the ability to vectorize. Your CPU can calculate either 4 floats or 2 doubles in one instruction (assuming a pre-AVX x64 processor). With AVX it's 8 floats or 4 doubles.
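In intrinsics terms, that difference looks like this - an illustrative fragment of mine, assuming an AVX-capable CPU and compiling with -mavx:

    #include <immintrin.h>   // AVX intrinsics
    #include <cstdio>

    int main() {
        // One 256-bit AVX register holds 8 floats or 4 doubles, so a single
        // vector add processes twice as many floats as doubles.
        __m256  f = _mm256_add_ps(_mm256_set1_ps(1.0f), _mm256_set1_ps(2.0f)); // 8 lanes
        __m256d d = _mm256_add_pd(_mm256_set1_pd(1.0),  _mm256_set1_pd(2.0));  // 4 lanes

        float  fout[8]; _mm256_storeu_ps(fout, f);
        double dout[4]; _mm256_storeu_pd(dout, d);
        std::printf("float lane: %.1f, double lane: %.1f\n", fout[0], dout[0]); // 3.0 3.0
        return 0;
    }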
To be clear: are you talking FPU instructions, or more modern SSE(N) instruction sets? Modern compilers on modern CPUs pretty aggressively vectorize my number crunching.
If it's something you're worried about, I'd suggest measuring the difference in your actual application, both in terms of performance and accuracy. If I can come up with a decent test case I can legally publish, I'll post results, but your mileage may still vary.
True about the memory and bus bandwidth, but don't forget the battery draw and/or device cost, especially when doing DSP / SDR / software modems / stuff like that. And latency issues.
So I could use a chip that costs more, results in shorter battery life, more thermal problems, and is slower so smaller BW processed and less audio filtering possible. But, at least I'm using doubles instead of floats, which doesn't even matter when it results in no measurable signal metric improvement while every other measurable performance and economic metric tanks downward.
You can do a lot of DSP and plain ole signal processing with fixed point of course, and sometimes doing weird things requires doubles.
Anyone who wants to do the fastest possible floating-point arithmetic on a graphics card. Also, if memory, disk access, or IO throughput is the bottleneck in number crunching, single floats might give a huge advantage.
We do all our image processing using floats (as memory and disk storage - not as computation accumulators obviously). It helps a lot if you keep a dozen seventy-megapixel images in memory, because it means you can run twice as many jobs on your cluster.
Yeah, I was going to say. The wisdom that I'm being taught in school is "Don't use single precision floats, at least not when you're on any kind of a modern system."
If you don't care about precision, use single floats. That's most of the time in games. Some people even used ints as fixed-point fractions.
If you care about precision - doubles still aren't precise. For currency use BigDecimals; for some things ratios can give precise results; for irrational things your only option is doubles, but be aware that you aren't precise, even for simple-looking operations. There's no way to compute 10% of something precisely with doubles.
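A quick illustration of the kind of thing meant here (my own example): decimal fractions like 0.1 have no exact binary representation, so even trivial sums drift, while integer cents stay exact.

    #include <cstdio>

    int main() {
        // Ten payments of "0.10" accumulated in a double.
        double d = 0.0;
        for (int i = 0; i < 10; ++i) d += 0.1;
        std::printf("double: %.17g (== 1.0? %s)\n", d, d == 1.0 ? "yes" : "no");
        // prints something like 0.99999999999999989 (== 1.0? no)

        // The same amounts as integer cents are exact.
        long cents = 0;
        for (int i = 0; i < 10; ++i) cents += 10;
        std::printf("cents: %ld (== 100? %s)\n", cents, cents == 100 ? "yes" : "no");
        return 0;
    }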
Most of the time that doesn't get you very far. Very quickly you'll need some kind of truncated polynomial expansion or a Newton iteration to compute a value.
Last time I hit a wall very quickly: computing the length of a non-trivial curve is only doable numerically (that one was funny, because they expressed it as a quite complicated closed form of functions, and those functions had no closed form themselves).
You can, but:
- it's still advisable to wrap the ints in some class, to be able to distinguish ints that are already scaled from ints that aren't. It's also better because then you can overload operators (see the sketch after this list). I've written a graphics demo using fixed-point numbers (in place of floats) without wrapping them, and it was very painful and error-prone. So you still work on objects and not on primitive values.
- there's a problem with how much precision you need. Different countries specify how money calculations are rounded differently (even with the same currency - see the Euro). When you use raw ints you have the assumption about precision repeated all over your codebase, which makes it harder to change when tax law changes or you want to support other countries. With BigDecimals most of the code is universal.
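As a concrete (and deliberately minimal) sketch of the wrapper idea from the first point - my own illustration, not a recommendation of any particular library:

    #include <cstdint>

    // Minimal sketch of the "wrap the int in a class" idea: amounts are held as
    // integer minor units (cents here), can't be confused with unscaled ints,
    // and the scaling assumption lives in exactly one place.
    class Money {
    public:
        static Money from_cents(int64_t c) { return Money{c}; }
        int64_t cents() const { return cents_; }

        Money operator+(Money other) const { return Money{cents_ + other.cents_}; }
        Money operator-(Money other) const { return Money{cents_ - other.cents_}; }
        bool  operator==(Money other) const { return cents_ == other.cents_; }

    private:
        explicit Money(int64_t c) : cents_(c) {}
        int64_t cents_;
    };

    int main() {
        Money a = Money::from_cents(199);   // 1.99
        Money b = Money::from_cents(1);     // 0.01
        return (a + b == Money::from_cents(200)) ? 0 : 1;   // exactly 2.00
    }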
For many financial applications, you know precisely in advance how many decimal places you need to be able to accommodate. E.g. for UK VAT calculations, you are never required to work it out to more than four decimal places (and can then round to 3).
I sort of agree with you, in that it is easy to get bitten, and that if people have a dynamic integer type available on their platform that's probably safest. If the choice is between using integers for fixed point vs. using doubles, though, I'd go for fixed point any day.
Like all good wisdom, there are plenty of exceptions. I use single precision all the time. The most obvious advantage to me is that I can load twice as large data sets into memory, which has some definitive performance advantages.
That's OK wisdom unless you are aiming for high performance. I'd put it at about the same level as "Don't use C if something higher level will work." If you are hoping to gain from vectorization, or if you have enough of them to care about memory usage, single precision is twice as good, unless the extra precision of a double actually gains you something.
Once again, I fell into the trap of saying "you should often do x" while leaving off the "... Except when you shouldn't". The exact comment from the CS professor I have in mind was "Using doubles when doing realtime human facing calculations (such as a calculator) is the best way to go. However, floats have their uses, such as on systems where there isn't enough memory, or where a float fulfills the requirements."
but that's a bit much to type from a phone so I went for "use doubles". And now I'll never be so lazy with my comments again!
Oh wow. Just about anyone in game development :) - lol... lol, since I've had the exact opposite argument with my own folks - though it wasn't for in-game use, but rather for game tools. I think I was wrong in the end, though: storing all model geometry vertices at double precision would be too much of a loss; also, stick to 16-bit vertex indices (instead of 32-bit).
In many applications, single is good enough. If you have a mass of values that needs to be stored somewhere, and the accuracy of single precision is fine, then single precision it should be.
If there is a long chain of calculations that use those numbers, you might do the intermediate calculations in double, or even long double.
Remember, the determinant value is not the rounded value of the exact computation with those things unless it is zero; it's still "sloppy" if it's far from zero. You get only 1 bit of guaranteed accuracy in a 64-bit FP number if the result is not zero.
I think there is a little bit of "sloppiness" in the fact that we carry very inexact results around in FP. Even "good enough" FP computation is PhD-level.