Most of the time you can increase performance either by increasing clock frequen...

staticassertion · on May 7, 2022

Presumably the cost here is that your instructions are considerably larger, which means fitting fewer of them into cache?

danachow · on May 7, 2022

The code density of ARM64 is not that much worse than x64 - especially for anything generated by a modern compiler. You may get some small scale gains for hand tuned code with careful instruction and register selection (ie where Rex prefix can be more easily avoided) - but average binary density doesn’t overcome the aforementioned differences in efficiency.