It seems the Rust version was ported from the C version, and it uses those intrinsics too. Eventually this code will become a bit higher level while producing the same output; those libraries are still somewhat experimental, though.
> It seems the Rust version was ported from the C version, and it uses those intrinsics too.
No. Although the Rust program was initially presented to me as a "port of fastest C SIMD variant", the programmer made additional optimizations not found in the C program:
- Moved the outer loop into "bodies_advance(..)" (SSE pipelining(?))
- Bundled intermediate variables/arrays into struct NBodySim (caching)
- Sized the arrays within struct NBodySim to fit the number of bodies (caching)
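To make the three points concrete, here is a rough scalar sketch of what such a layout might look like. It is not the benchmark program itself: it uses no SIMD intrinsics, and apart from `NBodySim` every name in it (`Body`, `deltas`, `magnitudes`, `advance`, `momentum`) is a hypothetical stand-in chosen for illustration. It assumes five bodies, so the scratch arrays are sized to the 10 body pairs that follow from that count.

```rust
// Assumed body count; the scratch arrays are sized from it.
const N_BODIES: usize = 5;
const N_PAIRS: usize = N_BODIES * (N_BODIES - 1) / 2;

#[derive(Clone, Copy)]
pub struct Body {
    pub position: [f64; 3],
    pub velocity: [f64; 3],
    pub mass: f64,
}

// Bodies and intermediate buffers live together in one struct,
// so the whole working set sits in one contiguous allocation.
pub struct NBodySim {
    bodies: [Body; N_BODIES],
    deltas: [[f64; 3]; N_PAIRS], // pairwise position differences
    magnitudes: [f64; N_PAIRS],  // dt / distance^3 per pair
}

impl NBodySim {
    pub fn new(bodies: [Body; N_BODIES]) -> Self {
        NBodySim {
            bodies,
            deltas: [[0.0; 3]; N_PAIRS],
            magnitudes: [0.0; N_PAIRS],
        }
    }

    pub fn bodies(&self) -> &[Body; N_BODIES] {
        &self.bodies
    }

    // The step loop is inside the function, not in the caller.
    pub fn advance(&mut self, steps: usize, dt: f64) {
        for _ in 0..steps {
            // 1. Position difference for every pair (i, j) with i < j.
            let mut k = 0;
            for i in 0..N_BODIES {
                for j in (i + 1)..N_BODIES {
                    for a in 0..3 {
                        self.deltas[k][a] =
                            self.bodies[i].position[a] - self.bodies[j].position[a];
                    }
                    k += 1;
                }
            }
            // 2. Scale factor dt / d^3 for every pair.
            for k in 0..N_PAIRS {
                let d2: f64 = self.deltas[k].iter().map(|x| x * x).sum();
                self.magnitudes[k] = dt / (d2 * d2.sqrt());
            }
            // 3. Apply each pair's contribution to both velocities.
            let mut k = 0;
            for i in 0..N_BODIES {
                for j in (i + 1)..N_BODIES {
                    let (mi, mj) = (self.bodies[i].mass, self.bodies[j].mass);
                    for a in 0..3 {
                        let d = self.deltas[k][a] * self.magnitudes[k];
                        self.bodies[i].velocity[a] -= d * mj;
                        self.bodies[j].velocity[a] += d * mi;
                    }
                    k += 1;
                }
            }
            // 4. Move every body along its updated velocity.
            for b in self.bodies.iter_mut() {
                for a in 0..3 {
                    b.position[a] += dt * b.velocity[a];
                }
            }
        }
    }

    // Total momentum, handy as a sanity check.
    pub fn momentum(&self) -> [f64; 3] {
        let mut p = [0.0; 3];
        for b in &self.bodies {
            for a in 0..3 {
                p[a] += b.mass * b.velocity[a];
            }
        }
        p
    }
}
```

In this sketch, calling `advance(10, dt)` once performs the same arithmetic in the same order as calling `advance(1, dt)` ten times, so the output does not change; whether the inner-loop version actually pipelines or caches better is something only a measurement on the real program can show.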
Overall, good points!