There is something funny about saying that this code most "literally" implements the idea of the algorithm, when the main "idea" of the algorithm is to be quick. :)
Also, be wary even of examples of super-efficient tricks. The inverse square root you are referring to is actually slower than what many CPUs can do with a single instruction nowadays.
Also, I think you are missing the main reason this code was written: essentially as a puzzle, to see if it could be done.
In the case of a reciprocal square root estimate, the single instruction probably is faster, particularly compared to the code given. A fast hardware reciprocal square root estimate usually takes fewer than 5 cycles and is off by less than 1/4096, so the accuracy is better too.
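To put a rough number on the accuracy point, here is a minimal sketch of the well-known bit-trick inverse square root, written in Python only so the error is easy to measure. It assumes the code under discussion is the usual variant with the 0x5F3759DF constant and one Newton-Raphson step; the names `fast_inv_sqrt` and `max_relative_error` are made up for this illustration, and it says nothing about speed.

```python
import struct

def fast_inv_sqrt(x: float) -> float:
    """Bit-trick estimate of 1/sqrt(x) for positive, normal float32 inputs."""
    # Reinterpret the float32 bit pattern as an unsigned 32-bit integer.
    i = struct.unpack("<I", struct.pack("<f", x))[0]
    # Magic constant minus half the bit pattern gives a rough first guess.
    i = 0x5F3759DF - (i >> 1)
    y = struct.unpack("<f", struct.pack("<I", i))[0]
    # One Newton-Raphson step refines the guess.
    return y * (1.5 - 0.5 * x * y * y)

def max_relative_error(samples):
    """Largest |estimate * sqrt(x) - 1| over the given inputs."""
    return max(abs(fast_inv_sqrt(x) * x ** 0.5 - 1.0) for x in samples)

# The error pattern repeats every factor of 4, so [1, 4) is representative.
samples = [1.0 + k / 1024 for k in range(3 * 1024)]
worst = max_relative_error(samples)
print(f"worst relative error: {worst:.6f}  (1/4096 = {1 / 4096:.6f})")
```

Under those assumptions the worst-case relative error of the trick lands well above 1/4096, which is consistent with the claim that the hardware estimate is the more accurate of the two.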