There is a huge difference between “algorithmically efficient” (aka Big O) and “real-life efficient” (aka actual cycle counts). In real life, constant factors are a huge deal. Real developers don’t just work with CS theory, they work with the actual underlying hardware. Big O has no concept of cache, SMP, hyper-threading, pipeline flushes, branch prediction, or anything else that actually matters when writing high-performance libraries and applications in real life.
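
To make the cache point concrete, here is a minimal sketch (my own illustration; the function names `sum_row_major` and `sum_col_major` are made up for this example). Both functions do exactly the same O(n^2) work on an n x n matrix stored in row-major order. The only difference is the order in which memory is visited.

```c
#include <assert.h>
#include <stddef.h>

/* Sum an n x n matrix stored row-major, visiting elements in the
 * same order they sit in memory (stride of one int). */
long long sum_row_major(const int *a, size_t n) {
    assert(a != NULL);
    long long total = 0;
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            total += a[i * n + j];
    return total;
}

/* Same sum and same number of additions, but consecutive accesses
 * are n ints apart, so each one may land on a different cache line. */
long long sum_col_major(const int *a, size_t n) {
    assert(a != NULL);
    long long total = 0;
    for (size_t j = 0; j < n; j++)
        for (size_t i = 0; i < n; i++)
            total += a[i * n + j];
    return total;
}
```

Big O rates these two as identical. If you time them (for example with `clock()`) on a matrix much larger than your cache, the strided version will usually come out slower on typical hardware, though how much depends on the CPU, the matrix size, and whether your compiler decides to interchange the loops for you.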