Here's the rest of the quote:

> A larger example of this sort arose in an atmospheric model (a component of large climate model). While such computations are by their fundamental nature “chaotic,” so that computations will eventually depart from any benchmark standard case, nonetheless it is essential to distinguish avoidable numerical error from fundamental chaos.

> Researchers working with this atmospheric model were perplexed by the difficulty of reproducing benchmark results. Even when their code was ported from one system to another, or when the number of processors used was changed, the computed data diverged from a benchmark run after just a few days of simulated time. As a result, they could never be sure that in the process of porting their code or changing the number of processors that they did not introduce a bug into their code.

> After an in-depth analysis of this code, He and Ding found that merely by employing double-double arithmetic in two critical global summations, almost all of this numerical variability was eliminated. This permitted benchmark results to be accurately reproduced for a significantly longer time, with virtually no change in total run time [3].
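
That "double-double arithmetic" means carrying the global sum as a pair of ordinary doubles instead of one, which roughly doubles the effective mantissa without any 128-bit hardware. Here's a minimal sketch of the idea in Python (my own illustration, not He and Ding's code), built on Knuth's error-free two-sum:

    # Double-double accumulator: the running sum is kept as an unevaluated
    # pair (hi, lo) of doubles, behaving like a float with ~twice the mantissa.

    def two_sum(a, b):
        # Knuth's error-free transformation: returns (s, e) such that
        # s == fl(a + b) and a + b == s + e exactly.
        s = a + b
        v = s - a
        e = (a - (s - v)) + (b - v)
        return s, e

    def dd_sum(values):
        hi, lo = 0.0, 0.0
        for x in values:
            s, e = two_sum(hi, x)    # add x to the high word, capture the rounding error
            lo += e                  # fold that error into the low word
            hi, lo = two_sum(s, lo)  # renormalize the pair
        return hi + lo

    # Large, mostly cancelling terms whose true sum is small:
    data = [1.0, 1e100, 1.0, -1e100]
    print(sum(data))     # 0.0 -- the naive sum wipes out the small terms
    print(dd_sum(data))  # 2.0 -- the low word preserves them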

The keyword is "numerical variability." Here's the referenced article: https://link.springer.com/article/10.1023/A:1008153532043

A choice quote from the article:

> In climate model simulations, for example, the initial conditions and boundary forcings can seldomly be measured more accurately than a few percent. Thus in most situations, we only require 2 decimal digits accuracy in final results. But this does not imply that 2 decimal digits accuracy arithmetic (or 6-7 bits mantissa plus exponents) can be employed during the internal intermediate calculations. In fact, double precision arithmetic is usually required.

The problem isn't lack of precision. The problem is numerical instability when summing a bunch of numbers with large absolute values that are roughly evenly split between positive and negative, so that almost everything cancels and the true sum is approximately 1. IEEE-754 floats, as useful as they are, are just bad at this, and adding more bits isn't a solution, it's a punt. They used Kahan summation or Bailey's double-double summation and the problem went away. No 128-bit hardware floats required. (Kahan summation is very well known; Bailey's double-double approach is new to me.)
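
For comparison, Kahan (compensated) summation keeps a single extra variable that catches the low-order bits rounded away each time a small term is added to a large running total. A quick sketch, assuming plain Python floats (IEEE-754 doubles):

    import math

    def kahan_sum(values):
        total = 0.0
        c = 0.0                   # running compensation for lost low-order bits
        for x in values:
            y = x - c             # apply the correction from the previous step
            t = total + y         # low-order bits of y may be rounded away here
            c = (t - total) - y   # algebraically zero; in floats, it captures what was lost from y
            total = t
        return total

    data = [0.1] * 10_000_000
    print(sum(data))        # naive sum drifts measurably from the true value
    print(kahan_sum(data))  # compensated sum stays within an ulp or two of it
    print(math.fsum(data))  # exactly rounded reference

One caveat: plain Kahan summation can drop its compensation when an incoming term is much larger than the running total, so for heavily cancelling sums like the one described above, Neumaier's variant or a double-double accumulator is the more robust choice.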

Here's my point: if double-precision floating point doesn't satisfy your needs, you should dig into the problem and understand why. Understand first, write code second. 999 times out of 1000, the solution isn't "we need 128-bit floats", and for the remaining 0.1%, we're waaaay better off telling those people "Sorry, do it in software and take the performance hit."



