The days when the actual floating-point instructions were the bottleneck are long gone.
Nowadays, computations on single float or double values held in registers are equally fast.
The biggest difference comes from memory bandwidth and the ability to vectorize. A double takes twice as many bytes as a float, so the same array costs twice the memory traffic. Your CPU can calculate either 4 floats or 2 doubles in one instruction (assuming a pre-AVX x64 processor with 128-bit SSE registers). With AVX's 256-bit registers it's 8 floats or 4 doubles.
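
The lane counts above are just register width divided by element width, and the bandwidth cost is element size times element count. A minimal sketch of that arithmetic in Python follows. The helper names `lanes` and `array_bytes` are made up for illustration, and the element sizes assume IEEE 754 single and double precision (4 and 8 bytes):

```python
import struct

# Element sizes in bytes for IEEE 754 single and double precision.
# The "<" prefix selects struct's standard sizes, independent of platform.
FLOAT_BYTES = struct.calcsize("<f")
DOUBLE_BYTES = struct.calcsize("<d")

# SIMD register widths in bits: SSE registers are 128-bit, AVX registers are 256-bit.
SSE_BITS = 128
AVX_BITS = 256


def lanes(register_bits, element_bytes):
    """Number of elements that fit in one SIMD register (hypothetical helper)."""
    return register_bits // (element_bytes * 8)


def array_bytes(count, element_bytes):
    """Bytes that must be moved to stream `count` elements once (hypothetical helper)."""
    return count * element_bytes


if __name__ == "__main__":
    for name, bits in (("SSE", SSE_BITS), ("AVX", AVX_BITS)):
        print(name, "floats per register:", lanes(bits, FLOAT_BYTES),
              "doubles per register:", lanes(bits, DOUBLE_BYTES))
    n = 1_000_000
    print("bytes for", n, "floats: ", array_bytes(n, FLOAT_BYTES))
    print("bytes for", n, "doubles:", array_bytes(n, DOUBLE_BYTES))
```

This only models capacity, not actual throughput. Whether a given loop really gets vectorized depends on the compiler and the code.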