Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

The general rule to follow in power consumption on CPUs is to do your work quickly and then get to sleep. Propagating clock is going to eat the bulk of your power. The mild difference between multiply and add in actual usage is inside the noise (orders of magnitude smaller). The bigger penalty in this case is the inter-iteration dependency, which, vectorized or not, runs the risk of holding up the whole show due to pipelining.

As a performance rule on modern processors: avoid using the result of a calculation as long as you reasonably can (in tight loops... You don't want to be out of cache.).

Have fun threading the needle!



Consider applying for YC's Winter 2026 batch! Applications are open till Nov 10

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: