Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

> performance of general solutions without using SIMD, is good enough too, since all of which will eventually dump right down to the uops anyway, with deep pipeline, branch predictor, superscalar and speculative execution doing their magics altogether

A quick comment on this one point (personal opinion): from a hyperscalar perspective, scalar code is most certainly not enough. The energy cost from scheduling a MUL instruction is something like 10x of the actual operation it performs. It is important to amortize that cost over many elements (i.e. SIMD).





Consider applying for YC's Winter 2026 batch! Applications are open till Nov 10

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: