> at least for sqrt(), internally it's likely implemented as a heuristic guess S...

akoboldfrying · on Feb 1, 2025

>Square roots are implemented in hardware

But how does that hardware implementation work internally?

The point I'm trying to make is that it is probably an (in-hardware) loop that uses Newton's Method.

ETA: The point being that, although in the source code it looks like all looks have been eliminated, they really haven't been if you dig deeper.

Const-me · on Feb 1, 2025

> how does that hardware implementation work internally?

I don’t know, but based on performance difference between FP32 and FP64 square root instructions, the implementation probably produces 4-5 bits of mantissa per cycle.

defrost · on Feb 1, 2025

There are other methods used in hardware, eg (for example)

https://en.wikipedia.org/wiki/Methods_of_computing_square_ro...

Something like Heron's method is a special case of Newton's method.

akoboldfrying · on Feb 1, 2025

Interesting that your linked algorithm manages to avoid costly divisions, but it uses an even longer loop than Newton's Method -- one iteration for every 2 bits. NM converges quadratically, doubling the number of correct bits each time, so a 32-bit number won't ever need more than 5 iterations, assuming >= 1 bit of accuracy in the initial estimate, which is easy to obtain just by shifting right by half the offset of the highest 1-bit.

defrost · on Feb 1, 2025

There are trade-offs (constant time, perhaps?) and many differing applications ...

For example: Pipelined RISC DSP chips have fat (many parallel streams) "one FFT result per clock cycle" pipelines that are rock solid (no cache hits or jitter).

The setup takes a few cyces but once primed it's

aquired data -> ( pipeline ) -> processed data

every clock cycle (with a pipeline delay, of course).

In that domain hardware implementations are chosen to work well with vector calculations and with consistent capped timings.

( To be clear, I haven't looked into that specific linked algo, I'm just pointing out it's not a N.R. only world )

adgjlsfhk1 · on Feb 1, 2025

sqrt is the one exception to this. the newton series is really good and the polynomials aren't great (and the newton based approach prevents you from having to do range reduction)