
As far as I can tell, R uses BLAS for matrix operations, and Python probably does the same, so in terms of efficiency I wouldn't expect a big difference between the two.



Both R and numpy use BLAS, and if both are linked to the same library, say OpenBLAS or Intel MKL, then performance is in fact almost identical for expensive operations like matrix multiplication. (R also ships with its own internal BLAS implementation, which is reliable but not very fast, and I believe is still single threaded, so the first thing you should do if you are using R and care about performance is to swap it out.)
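If you want to verify which BLAS numpy is actually linked against, a quick sketch (the exact output format varies across numpy versions):

    import numpy as np

    # Prints the build/link configuration, including which BLAS/LAPACK
    # libraries numpy was compiled or linked against (e.g. OpenBLAS, MKL).
    np.show_config()

On the R side, sessionInfo() reports the BLAS and LAPACK shared libraries in use, which is an easy way to confirm whether you're still on the reference BLAS that ships with R.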

For more sophisticated linear algebra algorithms, such as SVD, both will typically use LAPACK, and again, both will exhibit essentially identical performance.
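As a rough illustration rather than a serious benchmark: numpy's SVD goes through LAPACK (the gesdd driver), and R's svd() ends up in the same routine, so timings on the same machine with the same LAPACK build tend to match closely.

    import time
    import numpy as np

    # Time one dense SVD; np.linalg.svd dispatches to LAPACK's gesdd.
    a = np.random.default_rng(0).standard_normal((2000, 2000))
    t0 = time.perf_counter()
    u, s, vt = np.linalg.svd(a, full_matrices=False)
    print(f"SVD of a 2000x2000 matrix: {time.perf_counter() - t0:.2f}s")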

There is one important difference though: R can only use 64-bit floats, while numpy also supports 32-bit and even (through software emulation) 16-bit floats. This can halve memory usage, which in turn halves cache misses, which results in a significant speedup in cases where 64 bits of precision are not needed.
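A quick sketch of the memory side of this (the actual speedup from single precision depends on the BLAS build and on how memory-bound the workload is):

    import numpy as np

    n = 4000
    a64 = np.random.default_rng(0).standard_normal((n, n))  # float64 by default
    a32 = a64.astype(np.float32)                             # half the memory
    a16 = a64.astype(np.float16)                             # a quarter, but math is emulated

    print(a64.nbytes // 2**20, "MiB float64")
    print(a32.nbytes // 2**20, "MiB float32")
    print(a16.nbytes // 2**20, "MiB float16")

    # Matrix multiplication in single precision uses BLAS sgemm, which on
    # most builds runs roughly twice as fast as the float64 dgemm path.
    _ = a32 @ a32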


This is really interesting! I had always just claimed that Python was faster (see the benchmark I linked above [1]) based on personal experience. I wonder if R's internal BLAS implementation has something to do with it...

[1] https://julialang.org/benchmarks/



