Actually, scientific code has huge variation in speed.
The same high-level algorithm might be implemented inside an old, highly optimised FORTRAN program and then get transliterated into a use-once MATLAB script which runs 100 times slower because the vectorisation voodoo wasn't quite right.
In the latter case, people might not bother fixing it because the hour the fix would take is longer than the run-time itself. Or maybe the run-time cost is weeks, but the people running it are scientists, and understanding computers isn't their job.
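To make the vectorisation gap concrete, here's a minimal Python/NumPy sketch of the same phenomenon (the original example is MATLAB, and the exact factor is machine-dependent, so treat the 100x as illustrative):

```python
import numpy as np

x = np.random.rand(1_000_000)

def loop_sum_of_squares(x):
    # Transliterated element-by-element loop: every iteration goes
    # through the interpreter, which is where the big penalty lives.
    total = 0.0
    for v in x:
        total += v * v
    return total

def vec_sum_of_squares(x):
    # Vectorised form: the loop happens inside NumPy's compiled code.
    return float(np.dot(x, x))

# Same answer, wildly different speed; compare the two with timeit.
assert np.isclose(loop_sum_of_squares(x), vec_sum_of_squares(x))
```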
I made the caveat that it's sufficiently optimized. I'm not talking about explicit loops over native Python lists or anything silly like that. If two people who know what they're doing optimize the same scientific algorithm in a language without extra penalties (C, C++, Fortran, Julia, Go, Rust, etc.), the runtimes end up quite similar, within a few x of each other. Of course someone can make it worse (there exist many bad coders in science...), but I'm not talking about that.

I'm just saying that spending two years playing with assembly is not likely to give you some amazing speedup, while specializing algorithms to your problem is a clear and tested way to get something much more efficient.
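As one concrete illustration of what "specializing to your problem" can buy (my example, not part of the original comment): solving a tridiagonal linear system with a general dense solver costs O(n^3), while the standard Thomas algorithm, which exploits the tridiagonal structure, costs O(n). A sketch in Python/NumPy:

```python
import numpy as np

def thomas_solve(a, b, c, d):
    """Solve a tridiagonal system in O(n).

    a: sub-diagonal (n-1), b: diagonal (n), c: super-diagonal (n-1),
    d: right-hand side (n). A general dense solve is O(n^3).
    """
    n = len(b)
    cp = np.empty(n - 1)  # modified super-diagonal
    dp = np.empty(n)      # modified right-hand side
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n):  # forward sweep
        m = b[i] - a[i - 1] * cp[i - 1]
        if i < n - 1:
            cp[i] = c[i] / m
        dp[i] = (d[i] - a[i - 1] * dp[i - 1]) / m
    x = np.empty(n)
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):  # back substitution
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

# Check against the generic dense solver on a small 1-D Laplacian.
n = 5
a = np.full(n - 1, -1.0)
b = np.full(n, 2.0)
c = np.full(n - 1, -1.0)
d = np.arange(1.0, n + 1)
A = np.diag(b) + np.diag(a, -1) + np.diag(c, 1)
assert np.allclose(thomas_solve(a, b, c, d), np.linalg.solve(A, d))
```

No amount of assembly-level tuning of the dense solver closes an O(n^3)-to-O(n) gap; the win comes entirely from knowing the structure of the problem.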