The hard part is benchmarking the software correctly and evaluating the impact of a change. Most of the examples on the Internet are useless micro-optimizations. Some time ago I evaluated a program that used several speed tricks, but the real reason it was slow was that it reread a file on every iteration of a loop.
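
To make that concrete, here is a minimal sketch in Python of the reread-in-a-loop pattern and one way to measure the effect of fixing it. Everything in it is hypothetical: the function names, the generated sample file, and the sizes are assumptions for illustration, not taken from the program mentioned above. It checks that both versions return the same result before timing them, and takes the best of several runs to reduce noise.

```python
import os
import tempfile
import time


def count_matches_reread(path, keys):
    """Slow variant: opens and reads the whole file once per key."""
    total = 0
    for key in keys:
        with open(path, encoding="utf-8") as f:
            lines = f.read().splitlines()
        total += lines.count(key)
    return total


def count_matches_hoisted(path, keys):
    """Same result, but the file is read once, before the loop."""
    with open(path, encoding="utf-8") as f:
        lines = f.read().splitlines()
    total = 0
    for key in keys:
        total += lines.count(key)
    return total


def best_of(func, *args, repeats=5):
    """Return the fastest wall-clock time (seconds) over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        best = min(best, time.perf_counter() - start)
    return best


def make_sample_file(n_lines=2000):
    """Write a throwaway file of lines item0..item99, repeated (hypothetical data)."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        for i in range(n_lines):
            f.write(f"item{i % 100}\n")
    return path


if __name__ == "__main__":
    path = make_sample_file()
    keys = [f"item{i}" for i in range(50)]
    try:
        # Verify the change preserves behavior before comparing speed.
        assert count_matches_reread(path, keys) == count_matches_hoisted(path, keys)
        print(f"reread:  {best_of(count_matches_reread, path, keys):.6f} s")
        print(f"hoisted: {best_of(count_matches_hoisted, path, keys):.6f} s")
    finally:
        os.remove(path)
```

The actual numbers depend on the machine, the file size, and the OS file cache, so treat them as a relative comparison only. The point is to measure the whole operation before and after one change, rather than guessing which trick matters.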