That's really interesting to hear. I tried Duff's Device myself, once upon a time, and found it made no measurable difference, so I pulled it back out. This was a long time ago, so I had assumed that a similar optimization was built into most C compilers these days. Does it vary by toolchain? I believe I was using clang at the time.
(I was still quite wet behind the ears at the time, so it's also more than possible that I was just doing something wrong.)
All that said, agreed, wonderful little mechanism. It's one of those things that every C programmer should take the time to really understand.
Yes, modern compilers do loop unrolling nowadays, so Duff's device doesn't really have a use. But it's certainly a wonderful bit of history. I found Duff's device when I was researching coroutines in C.
No, this is just a wonderful bit of obsolete hacker history. Unless you mess with retro architectures for fun (like old PDP machines, emulated or real), in which case it's somewhat relevant.
The first time I saw that thing I just thought "That just can't be valid C syntax". It's quite an amazing hack that seems ugly at first but seems more and more elegant the more you think about it.
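For anyone who hasn't seen it, here is a minimal sketch of the construct in question. It is a memory-to-memory variant (Duff's original wrote to a single memory-mapped register, so it did not increment the destination), and the name duff_copy is made up for illustration. The surprising part is that the case labels sit inside the do-while body, which C allows.

```c
#include <stddef.h>

/* Copy `count` shorts from `from` to `to`, unrolled eight times.
   duff_copy is a hypothetical name; this is the copy variant,
   not Duff's original register-output loop. */
void duff_copy(short *to, const short *from, size_t count)
{
    if (count == 0)
        return;                      /* the do-while would otherwise run once */

    size_t n = (count + 7) / 8;      /* number of passes through the loop */

    switch (count % 8) {             /* jump into the middle of the loop body */
    case 0: do { *to++ = *from++;
    case 7:      *to++ = *from++;
    case 6:      *to++ = *from++;
    case 5:      *to++ = *from++;
    case 4:      *to++ = *from++;
    case 3:      *to++ = *from++;
    case 2:      *to++ = *from++;
    case 1:      *to++ = *from++;
            } while (--n > 0);
    }
}
```

The switch handles the leftover count % 8 elements on the first pass by entering the loop partway through; every later pass copies a full eight.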
These days compilers usually do loop unrolling on their own, so the device has mostly lost its purpose, but it can apparently also be used to implement something similar to coroutines in C, which is nice.
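The coroutine trick mentioned above is, as far as I know, the one described in Simon Tatham's essay "Coroutines in C": a switch on a saved state jumps back into the middle of a loop, the same way Duff's device does. A rough sketch, with next_value as a made-up name and static variables standing in for real coroutine state (so it is not reentrant):

```c
/* Hypothetical generator: successive calls return 0, 1, ..., limit - 1,
   and then -1 on every call after that. */
int next_value(int limit)
{
    static int state = 0;   /* where to resume; 0 means "start from the top" */
    static int i;           /* static so the loop counter survives the return */

    switch (state) {
    case 0:
        for (i = 0; i < limit; i++) {
            state = 1;
            return i;       /* "yield" i to the caller */
    case 1:;                /* the next call resumes here, inside the loop */
        }
    }

    state = 2;              /* no matching case: the switch body is skipped */
    return -1;
}
```

Each call after the first lands on case 1, runs the loop increment, and either yields the next value or falls out of the loop.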
Disgusting indeed: A way to write irreducible loops that doesn't use a goto. Good idea to measure the resulting code, given how poorly most optimizers deal with such loops.
Even sadder because the same effect can be achieved cleanly (and "optimizably") by using a "switch" followed by a "while".
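One way to read the "switch" followed by a "while" suggestion, as a sketch only: deal with the count % 8 leftovers in an ordinary fall-through switch, then run a plain loop over whole blocks of eight. The name unrolled_copy is made up, and whether an optimizer actually does better with this shape is something to measure rather than assume.

```c
#include <stddef.h>

/* Copy `count` shorts without jumping into the middle of a loop.
   unrolled_copy is a hypothetical name for illustration. */
void unrolled_copy(short *to, const short *from, size_t count)
{
    /* Copy the count % 8 leftover elements first, via fall-through. */
    switch (count % 8) {
    case 7: *to++ = *from++; /* fall through */
    case 6: *to++ = *from++; /* fall through */
    case 5: *to++ = *from++; /* fall through */
    case 4: *to++ = *from++; /* fall through */
    case 3: *to++ = *from++; /* fall through */
    case 2: *to++ = *from++; /* fall through */
    case 1: *to++ = *from++; /* fall through */
    case 0: break;
    }

    /* Then copy the remaining multiple of eight in whole blocks. */
    size_t n = count / 8;
    while (n-- > 0) {
        *to++ = *from++;
        *to++ = *from++;
        *to++ = *from++;
        *to++ = *from++;
        *to++ = *from++;
        *to++ = *from++;
        *to++ = *from++;
        *to++ = *from++;
    }
}
```

Here the loop has a single entry point, so it stays reducible, and a count of zero needs no special case.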
It's interesting how this seems to be not quite an algorithm - I suppose "device" is an appropriate name. The "See also" section on Wikipedia leads to some good stuff.
See: https://en.wikipedia.org/wiki/Duff%27s_device