
Another consideration is that the unoptimized version of the algorithm may be easier to explain and study. So he might also be optimizing for clarity.


Mostly unrelated: When I write heavily optimized code I prefer to write the stupidest, simplest thing that could possibly work first, even if I know it's too slow for the intended purpose. I'll leave the unoptimized version in the code base.

- It serves as a form of documentation for what the optimized stuff is supposed to do. I find this most beneficial when the primitives being used in the optimized code don't map well to the overarching flow of ideas (like how _mm256_maddubs_epi16, which multiplies unsigned by signed 8-bit values and adds adjacent pairs with saturation, ends up being just a vectorized 8-bit unsigned multiply if some preconditions on the inputs hold). The unoptimized code will follow the broader brush strokes of the fast implementation.

- More importantly, you can drop it into your test suite as an oracle to check that the optimized code actually behaves like it's supposed to on a wide variety of test cases. Being able to test any (small enough) input against a known-good answer does a lot for robustness.
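
To make the oracle idea concrete, here is a minimal sketch in Python. Popcount is only a stand-in example I picked, not something from the comment above, and every function name here is hypothetical. The naive version is the "stupidest thing that could work"; the fast version is a standard 32-bit SWAR bit trick that is hard to read on its own.

```python
import random


def popcount_reference(x: int) -> int:
    """Deliberately naive oracle: look at one bit at a time."""
    count = 0
    while x:
        count += x & 1
        x >>= 1
    return count


def popcount_fast(x: int) -> int:
    """SWAR popcount, valid for inputs in the range 0 .. 2**32 - 1."""
    x = x - ((x >> 1) & 0x55555555)                  # 2-bit partial sums
    x = (x & 0x33333333) + ((x >> 2) & 0x33333333)   # 4-bit partial sums
    x = (x + (x >> 4)) & 0x0F0F0F0F                  # 8-bit partial sums
    # Python ints do not wrap, so mask to 32 bits before taking the top byte.
    return ((x * 0x01010101) & 0xFFFFFFFF) >> 24


def find_mismatches(fast, reference, inputs):
    """Return every input on which the fast version disagrees with the oracle."""
    return [x for x in inputs if fast(x) != reference(x)]


def run_oracle_checks(seed: int = 0) -> list:
    """Exhaustive over small inputs, plus edge cases and seeded random samples."""
    rng = random.Random(seed)
    exhaustive = range(1 << 16)
    edge_cases = [0, 1, 0x80000000, 0xFFFFFFFF]
    sampled = [rng.getrandbits(32) for _ in range(10_000)]
    mismatches = []
    for inputs in (exhaustive, edge_cases, sampled):
        mismatches += find_mismatches(popcount_fast, popcount_reference, inputs)
    return mismatches
```

In a test suite you would assert that run_oracle_checks() returns an empty list. Returning the mismatching inputs, rather than a bare pass/fail, gives you a failing case to debug with right away.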


Absolutely. Clarity of the concept is important. That's sometimes at odds with the best performance.



