I heard someone use those instructions once as examples of something compilers could do better than humans writing assembly -- Apple's MPW C compilers for PowerPC were capable of peephole optimizations that would produce them where a human might not think of them. (At least, that was the argument.)
That depends on whether you mean a human who knows the instructions exist, or one who hasn't yet worked out how to use shifts to do integer mul/div by 2.
The proper argument was always that optimizing compilers generate better assembly than 90% of the people using them could generate, and in a fraction of the time.
However these things often get turned into stronger (or different) arguments as they pass from mouth to ear repeatedly.
Sometimes they change completely, as in "the plural of anecdote is data".
I wanted to write a memcpy() routine for a microcontroller. I wrote a naive version that copied from src to dst one byte at a time. You can find more efficient algorithms, which typically copy 32-bit words at a time.
The interesting thing is, when I turned on compiler optimisations and examined the assembly output (even though my knowledge of assembly is poor), I found it had made the same optimisations you would find in a more complex C implementation. The compiler evidently thought "I see what you're doing here" and substituted a better version.
So the moral of the story is: your compiler is likely to be able to figure out a lot.
Even ignoring the usual optimizations like SIMD and loop unrolling to find parallelism in a memcpy, compilers also have loop-idiom recognition: they can spot certain loop patterns and replace the whole loop with a call to the library memcpy if they deem it profitable (e.g. if you tell it N is likely to be large, it'll go for the library call).
There are other optimizations like this too: call C's printf without any extra arguments and the compiler will replace it with a call to puts, which skips the formatting code. You can see this in Compiler Explorer.
Quite often that doesn't end up very efficient, because without "restrict" the result has to be identical to a byte-by-byte copy for all possible overlaps of the two inputs.
Lots of memcpy() implementations are still more efficient than a dumb byte-by-byte copy. They'll copy the (unaligned) head and the tail in bytes, but the bulk of the data using whatever data type and method is fastest.