Most instructions map 1:1. x86 instructions can encode memory operands, potentially doubling the practical width, but a) not all CPUs can decode them at full width and b) in practice compilers generate mem+op instructions only for a (not insignificant) minority of instructions.

So the apples-to-apples (pun intended) practical width difference is closer than it appears, though x86 is still not as wide.
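
To put rough numbers on that, here is a back-of-the-envelope sketch. The 4-wide x86 decode, the 8-wide M1 decode and the 25% share of mem+op instructions are illustrative assumptions of mine, not measurements, and `effective_width` is a made-up helper.

```python
def effective_width(decode_width: int, mem_op_fraction: float) -> float:
    """Decode width expressed in load/store-architecture equivalent instructions.

    Assumes each mem+op instruction does the work of two instructions
    (a load plus an ALU op) and that the decoders sustain full width.
    """
    return decode_width * (1.0 + mem_op_fraction)


x86 = effective_width(4, 0.25)  # assumed: 4-wide decode, 25% mem+op instructions
m1 = effective_width(8, 0.0)    # assumed: 8-wide decode, no mem+op forms
print(x86, m1)
```

Under those assumptions the x86 core decodes the equivalent of 5 instructions per cycle against 8, so the gap narrows but does not close.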

x86 machines usually target 5-6 wide rename, so that would become a bottleneck (not all instructions require rename, of course). I expect that M1 has an 8-wide rename.

Edit: another limitation is that most x86 decoders can only decode 16 bytes at a time, and many instructions can be very long, further limiting actual decode throughput.
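
A quick sketch of how the byte window caps throughput. The 16-byte window comes from the point above; the 4 decoders and the average instruction lengths are illustrative assumptions, and `decode_rate` is a hypothetical helper that ignores alignment and instructions straddling the window.

```python
def decode_rate(decoders: int, window_bytes: int, avg_insn_len: float) -> float:
    """Upper bound on instructions decoded per cycle.

    The decoders can never retire more instructions than fit in the
    bytes delivered each cycle, so the bound is the smaller of the two.
    """
    return min(float(decoders), window_bytes / avg_insn_len)


# Assumed 4 decoders fed by a 16-byte window.
for avg_len in (3.0, 5.0, 8.0):
    print(avg_len, decode_rate(4, 16, avg_len))
```

With short instructions the decoders are the limit; once the average length goes past 4 bytes, the 16-byte window becomes the limit instead.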

On the other hand, the expectation is that most hot code will skip decode completely and be fed from the uop cache. This also saves power.