1) They have excluded i386. It is well known that 64-bit archs waste a lot of memory, and a 32-bit arch could probably save something.
2) They should have disabled position-independent code generation, because on i386 PIC takes more memory (it costs extra instructions and ties up a register for the GOT pointer).
3) Instead of optimizing the whole program for speed, it is better to optimize only the "hot" parts for speed and the rest for size. Or optimize everything for size. (A sketch of that split follows this list.)
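To make point 3 concrete, here's a minimal sketch of the hot/cold split using GCC's per-function optimize attribute. The function names and workload are hypothetical, and the attribute is GCC-specific:

```c
/* Hot/cold split sketch (GCC-specific attributes; names are hypothetical).
 * Build the whole file for size, then opt the hot path back up:
 *   gcc -Os -o demo demo.c
 */
#include <stdio.h>

/* Hot inner loop: ask GCC to optimize just this function for speed. */
__attribute__((optimize("O3"), hot))
static long sum_squares(const int *v, long n)
{
    long acc = 0;
    for (long i = 0; i < n; i++)
        acc += (long)v[i] * v[i];
    return acc;
}

/* Setup/teardown code inherits the file-level -Os. */
int main(void)
{
    int v[1000];
    for (int i = 0; i < 1000; i++)
        v[i] = i;
    printf("%ld\n", sum_squares(v, 1000));
    return 0;
}
```

Profile-guided optimization (-fprofile-generate / -fprofile-use) is the less manual way to get a similar effect, letting the compiler find the hot parts itself.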
From a few old articles, my impression was that x86-64 code is, in a fair number of cases, notably denser and faster than the i386 equivalent. The main reason: i386 code has only 8 architectural registers to work with, vs. 16 for x86-64. So i386 CPUs can waste a lot of instructions and time shuffling data between registers and memory, because they've run out of registers.
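If you want to see that register-pressure effect yourself, here's a rough, hypothetical demo (exact spill counts depend on compiler and version):

```c
/* Rough register-pressure demo: eight accumulators live across the loop,
 * plus the pointer, bound, and induction variable.
 * Compare the generated assembly:
 *   gcc -m32 -O2 -S pressure.c   # i386: expect spills to the stack
 *   gcc -m64 -O2 -S pressure.c   # x86-64: values mostly stay in registers
 */
long accumulate(const long *v, long n)
{
    long a = 0, b = 1, c = 2, d = 3, e = 4, f = 5, g = 6, h = 7;
    for (long i = 0; i < n; i++) {
        a += v[i]; b ^= a; c += b; d ^= c;
        e += d; f ^= e; g += f; h ^= g;
    }
    return a + b + c + d + e + f + g + h;
}
```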
1) The article is checking what the situation is in 2022. In my opinion it's quite reasonable not to let 32-bit compete at all (though there is a 32-bit RISC in there).
2) Do you mean on 64-bit x86 as well? Yeah, that would probably be nice to see. But then again, I would say that today's code is position-independent, and when choosing an ISA it's not useful to compare a mode you would not run anyway. (There's a small sketch of the i386 PIC overhead after this list.)
3) That's not the argument this article aims (successfully) to debunk, though.
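Regarding 2): a tiny sketch (my own illustration, not from the article) of why PIC is expensive on i386 but nearly free on x86-64:

```c
/* Why PIC costs more on i386 (my illustration, not from the article).
 * i386 has no PC-relative data addressing, so PIC code must first load
 * the GOT address into a register at run time:
 *   gcc -m32 -fpic   -O2 -S pic.c  # emits a __x86.get_pc_thunk.* call and
 *                                  # reserves a register for the GOT base
 *   gcc -m32 -fno-pic -O2 -S pic.c # plain absolute addressing, shorter
 * On x86-64, RIP-relative addressing makes the same access nearly free,
 * which is one reason PIE is the default there.
 */
int counter;

int bump(void)
{
    return ++counter;
}
```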
1) i386 is not representative of bleeding-edge CPU architectures (as mentioned in the article). It would only be included for purely academic reasons, which was out of scope.
2) ...
3) In the real world, most programs use a single optimization/tuning configuration for the entire program.
The article aimed to analyze real world programs running on contemporary modern architectures.
Would it matter if it had better code density? "Modern" doesn't necessarily mean better in every aspect. For example, there were articles claiming that the same application compiled for 64 bits uses more memory than the 32-bit version.
It wouldn't matter. I still would not buy an i386, compile my programs in 32-bit mode instead of 64-bit, or use the i386 ISA as a model when designing a new ISA.
Same thing with the 6502, Z80, VAX, etc.
What matters is performance, and i386 code does not give as good performance as x86_64 code or modern RISC code (it doesn't have as many GPRs, etc., so it can't).