Que? We don't compile to a virtual machine, we compile to plain 6502 assembly. We do use 32 bytes of the zero page as a de facto register file, but that's not at all uncommon in hand-written assembly either, it's just usually less explicit.
The prevalence of self-modifying code also heavily depends on what community of 6502 developers you're part of. It's relatively niche on platforms where most code is in ROM, e.g. most game consoles. For example, the NES only has 2KiB of RAM, but can support upwards of 512KiB of ROM, so it's downright wasteful to place code in RAM.
As for 2 and 3, LLVM-MOS lifts local variables to global memory or the zero page wherever possible, and is able to use function aliasing information to allow the globalised versions of local variables in different functions to share the same memory.
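To sketch what that lifting amounts to (this is just the idea in C, with names and a shared buffer I made up, not what LLVM-MOS actually emits): if two functions can never be live at the same time, their lifted locals can be given overlapping static storage.

    #include <stdint.h>
    #include <stdio.h>

    /* One static scratch area standing in for the overlapping "lifted" locals.
       In practice the compiler/linker decides this placement; this is just the idea. */
    static uint8_t scratch[2];

    /* f and g never call each other (directly or indirectly), so their former
       stack locals can safely reuse the same fixed bytes. */
    static uint8_t f(uint8_t x) {
        uint8_t *tmp = &scratch[0];   /* f's "local", now at a fixed address */
        *tmp = (uint8_t)(x + 1);
        return *tmp;
    }

    static uint8_t g(uint8_t y) {
        uint8_t *acc = &scratch[0];   /* g's "local" reuses the same byte */
        *acc = (uint8_t)(y * 2);
        return *acc;
    }

    int main(void) {
        printf("%d %d\n", f(10), g(10));  /* 11 20 */
        return 0;
    }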
I'm not trying to take anything away from that work, but I view what is going on there as a kind of VM. It may not have an interpreter loop, but it seems closer to a VM/AOT type of compilation than the code gen you expect from a "normal" compiler, largely because so much needs to be emulated.
And my comment is less about the compiler and more about how people write modern C code, usually under the assumption that various C abstractions are basically zero cost (because they tend to be), and translating that into efficient 6502 code is difficult. My impression is mostly affected by 3+ decade old memories of Apple ][ programming, which was frequently a mix of higher-level interpreted code (usually some kind of efficient bytecode via Applesoft, p-code, etc.) and assembly routines. A big part of the battle was fitting everything into 64-128K of RAM while keeping the speed up, so being able to do 16-bit operations and the like with a 1-byte opcode picked out by an interpreter was part of the advantage vs writing assembly that called routines (or frequently parts of Applesoft itself) directly. The former could get a 2x+ code size savings.
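As a rough illustration of where that savings comes from (the opcodes and dispatch loop here are made up, not any particular p-code or Applesoft-style dialect): each 16-bit operation in the "program" costs a single byte, with the interpreter paying the decode cost, whereas open-coded 6502 assembly spends several instructions per operation.

    #include <stdint.h>
    #include <stdio.h>

    /* Hypothetical one-byte opcodes; any real bytecode set would differ. */
    enum { OP_ADD16, OP_SUB16, OP_HALT };

    /* Tiny dispatch loop: one byte of "program" per 16-bit operation. */
    static uint16_t run(const uint8_t *code, uint16_t a, uint16_t b) {
        for (;;) {
            switch (*code++) {
                case OP_ADD16: a = (uint16_t)(a + b); break;
                case OP_SUB16: a = (uint16_t)(a - b); break;
                default:       return a;              /* OP_HALT or unknown */
            }
        }
    }

    int main(void) {
        const uint8_t prog[] = { OP_ADD16, OP_ADD16, OP_HALT };  /* a += b; a += b */
        printf("%d\n", run(prog, 1000, 234));  /* prints 1468 */
        return 0;
    }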
Maybe some of that is less important these days, because the platform is largely running in emulation, or people are packing their machines with a lot of RAM because it's basically free. My IIGS (well, one of them, lol) has 8MB of RAM I added maybe 15 years ago, and I have a pile of 256K-1M RAM cards. Either way, code size is maybe not as big of a deal these days, because fitting into 512K of ROM is a lot easier than fitting into 64K. Nor, in some ways, is losing a multiple of the possible speed, because no one is seriously trying to use these computers day to day. Back in the late 1980s I wrote a text editor in BASIC (for editing assembly code for my assembler) and then spent a lot of time optimizing pieces of it just to make it usable while still being able to edit assembly programs with a few tens of K of source. In the end the line input ended up being entirely assembly, because simple things like blinking the cursor simply took too much time in Applesoft.
Thinking back, adding a linker might have been a better use of my time, but 64K at 1MHz put a limit on how big the resulting code could be.