Flattening the ROR and RRX special cases seems pretty straightforward to me as well. That's why I thought that the punch line was at the end of the post though:
ror r0, #0
gets assembled to
mov r0, r0.
I see how both are nops and assume they affect state congruently, but why is one nop encoding preferred over another? Does the architecture do something desirable when encountering the MOV incarnation vs. any other?
The point is that the binary encoding is the same for ROR and MOV, and only the disassembly of the binary is special-cased: if it has a shift, it's disassembled into the ROR text with the shift; otherwise it is displayed as a MOV. RRX is a special case of ROR, just as ROR is of MOV.
The ARM Instruction Set PDF I'm looking at only lists MOV as a real instruction -- one with a distinct opcode -- out of the above three.
It's MOV and LSL that have the same binary encoding; everything is identical except for the imm5 field, which is hardcoded to zeros for MOV. MOV and LSL share the same op2 field of '00'. ROR and RRX share the op2 field of '11'. ROR and MOV have a similar binary encoding, but the op2 field is distinctly different in this case.
As a note, I'm basing this off of the v7-A and v7-R manual.
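To make the field layout above concrete, here is a rough Python sketch of how a disassembler might pick a mnemonic from the imm5 (bits 11:7) and op2 (bits 6:5) fields of a classic 32-bit ARM MOV-opcode word. The function name `decode_mov_family` and the table `SHIFT_NAMES` are made up for illustration, and the sketch only handles the register form with an immediate shift amount; treat it as an approximation rather than a complete decoder.

```python
# Shift mnemonics indexed by the 2-bit op2 field (bits 6:5).
SHIFT_NAMES = ("lsl", "lsr", "asr", "ror")


def decode_mov_family(word):
    """Return a mnemonic for an ARM MOV-opcode word (register, immediate shift).

    Hypothetical helper for illustration only.
    """
    opcode = (word >> 21) & 0xF           # bits 24:21, 0b1101 for MOV
    top = (word >> 25) & 0x7              # bits 27:25, 0b000 for register form
    bit4 = (word >> 4) & 0x1              # 0 means the shift amount is imm5
    if opcode != 0b1101 or top != 0 or bit4 != 0:
        raise ValueError("not a MOV-family register/immediate-shift word")

    imm5 = (word >> 7) & 0x1F             # shift amount
    op2 = (word >> 5) & 0x3               # shift type
    rd = (word >> 12) & 0xF
    rm = word & 0xF

    if imm5 == 0 and op2 == 0b00:
        return f"mov r{rd}, r{rm}"        # LSL #0 is displayed as plain MOV
    if imm5 == 0 and op2 == 0b11:
        return f"rrx r{rd}, r{rm}"        # ROR #0 slot is reused for RRX
    amount = imm5 if imm5 != 0 else 32    # LSR/ASR with imm5 == 0 mean #32
    return f"{SHIFT_NAMES[op2]} r{rd}, r{rm}, #{amount}"


if __name__ == "__main__":
    for w in (0xE1A00000, 0xE1A00080, 0xE1A000E0, 0xE1A00060):
        print(f"{w:#010x}  {decode_mov_family(w)}")
```

Under this layout there is no bit pattern left over for a literal ROR by zero, because that slot is taken by RRX, which would explain why an assembler given `ror r0, #0` falls back to the MOV encoding.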
Well, there's the problem: the v7-A and v7-R manual.
That manual uses unified assembly syntax, meaning that old ARM and Thumb are described together. Because of irregularities in Thumb, the old ARM instructions end up being described in a way that is needlessly verbose. You lose the insight into how the opcodes are actually decoded.
Look at an older manual. ARMv5 will do nicely. There, you can see that the MOV instruction is mostly described by 2 bits (one condition code is stolen). In an even older ARM, such as ARMv4 I think, it really is just 2 bits.
Unless I'm missing something, the encodings are different at bits 6 and 7. Looking at xlogicx's post, MOV has them set to 0 while ROR has them at 1.
On a higher level, this post got me to realize that there could be ops (in this case nops) that have equivalent effects but are encoded as different instructions. Even though registers might state-change equivalently, I can see that the internal processor state might mutate differently. Maybe one nop encoding is faster, and maybe the caches get hit differently, etc. As an assembler, what reasons might there be to prefer one encoding over another?
When designing an architecture, I can imagine putting a "fast nop" in the instruction encoder that essentially just short circuits around it. Is this something that's done in practice?