Hacker News

A large part of making this work requires basically creating a virtual machine, à la SWEET16, which is what Woz needed to do to get parts of BASIC to work.

But that will _never_ be as fast as hand-coded 6502 assembly unless compilers/etc. get a _LOT_ smarter. I came to the realization many years ago that while the 6502 and the software written for it are amazing, it's fundamentally incompatible with modern software development. That's because in order to create a high-performance 6502 application, three major tenets of software development _must_ be violated.

1: No self-modifying code
2: Avoid global variables
3: Use structured programming rather than goto spaghetti

On the 6502 there are provable reasons why avoiding self-modifying code, globals (frequently on the zero page), and gotos in favor of function calls is slower. When looking at cycle times, sometimes it's just a cycle here or there being saved, but you have to realize that saving a cycle on an instruction can frequently make that instruction 20-50% faster. Sometimes one can afford to run 10x slower, but on a processor running at just a few MHz (or 1 MHz in many cases) that can be the difference between being able to realistically solve a problem in real time and creating a batch job.
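To make the call-overhead point concrete, here's a hedged C sketch (the names are mine, not from the thread): the same byte-copy loop written once with a per-byte helper call and once as a goto loop. On a 6502, each helper call adds a JSR (6 cycles) plus an RTS (6 cycles) on top of the actual work, while the goto version's back-edge compiles down to a single branch (2-4 cycles) — which is why tight hand-written 6502 loops tend to fall out of the structured style.

```c
#include <assert.h>
#include <string.h>

/* Structured version: one helper call per byte.
   On a 6502, every call costs a JSR (6 cycles) + RTS (6 cycles). */
static void copy_byte(unsigned char *dst, const unsigned char *src, int i) {
    dst[i] = src[i];
}

static void copy_structured(unsigned char *dst, const unsigned char *src, int n) {
    for (int i = 0; i < n; i++)
        copy_byte(dst, src, i);
}

/* Goto version: the loop back-edge is a single branch,
   roughly the shape of a tight hand-written 6502 loop. */
static void copy_goto(unsigned char *dst, const unsigned char *src, int n) {
    int i = 0;
top:
    if (i >= n) return;
    dst[i] = src[i];
    i++;
    goto top;
}
```

A good optimizer will inline the helper here anyway; the point is about what the source style costs when the compiler can't or won't.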



Que? We don't compile to a virtual machine, we compile to plain 6502 assembly. We do use 32 bytes of the zero page as a de facto register file, but that's not at all uncommon in hand-written assembly either; it's just usually less explicit.

The prevalence of self-modifying code also heavily depends on what community of 6502 developers you're part of. It's relatively niche on platforms where most code is in ROM, e.g. most game consoles. For example, the NES only has 2KiB of RAM, but can support upwards of 512KiB of ROM, so it's downright wasteful to place code in RAM.

As for 2 and 3, LLVM-MOS lifts local variables to global memory or the zero page wherever possible, and is able to use function aliasing information to allow the globalised versions of local variables in different functions to share the same memory.
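The lifting idea can be sketched in hand-written form (my own toy names; LLVM-MOS does this automatically from call-graph analysis): if two functions are provably never live at the same time, their "locals" can be lifted to the same static storage, which on the 6502 can then sit on the zero page.

```c
#include <assert.h>

/* Both helpers are leaf functions that never call each other,
   so their lifted locals can safely share one static slot.
   On a real 6502 target, imagine this byte placed on the zero page. */
static unsigned char scratch;

static unsigned char sum3(unsigned char a, unsigned char b, unsigned char c) {
    scratch = a;       /* "local" lifted to a global */
    scratch += b;
    scratch += c;
    return scratch;
}

static unsigned char double_it(unsigned char x) {
    scratch = x;       /* safe reuse: sum3 is never live here */
    scratch += scratch;
    return scratch;
}
```

The trade-off is that such functions are no longer reentrant, which is usually fine on a machine with no threads and a tiny hardware stack.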


I'm not trying to take anything away from that work, but I view what is going on there as a kind of VM. It may not have an interpreter loop, but it seems closer to a VM/AOT type compilation than the code gen you expect from a "normal" compiler. Largely because so much needs to be emulated.

And my comment is less about the compiler and more about how people write modern C code, usually under the assumption that various C abstractions are basically zero cost (because they tend to be); translating that into efficient 6502 code is difficult. And my impression is mostly affected by 3+ decade old memories of Apple ][ programming, which was frequently a mix of higher-level interpreted code (usually some kind of efficient bytecode via Applesoft, p-code, etc.) and assembly routines. A big part of the battle was fitting everything into 64-128K of RAM while keeping the speed up, so being able to do 16-bit operations and the like with a 1-byte opcode picked out by an interpreter was part of the advantage vs. writing assembly which called routines (or frequently parts of Applesoft itself) directly. The former could get a 2x+ code size savings.
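The density argument can be sketched in C (a toy of my own in the spirit of SWEET16, not SWEET16 itself): a dispatch loop where one-byte opcodes act on 16-bit registers, so a 16-bit add that takes roughly 13 bytes of native 6502 code (CLC plus six zero-page load/add/store instructions) is encoded in a single byte of bytecode.

```c
#include <assert.h>
#include <stdint.h>

/* Toy 16-bit VM: one-byte opcodes, two 16-bit registers.
   The opcode encoding is invented for this sketch. */
enum { OP_HALT = 0, OP_LDA_IMM, OP_LDB_IMM, OP_ADD, OP_SUB };

static uint16_t run(const uint8_t *code) {
    uint16_t a = 0, b = 0;
    for (;;) {
        switch (*code++) {
        case OP_LDA_IMM: a = (uint16_t)(code[0] | (code[1] << 8)); code += 2; break;
        case OP_LDB_IMM: b = (uint16_t)(code[0] | (code[1] << 8)); code += 2; break;
        case OP_ADD:     a = (uint16_t)(a + b); break; /* 1 byte vs. ~13 bytes native */
        case OP_SUB:     a = (uint16_t)(a - b); break;
        case OP_HALT:    return a;
        }
    }
}
```

The interpreter loop itself costs dozens of cycles per opcode, which is exactly the speed-for-size trade described above.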

Maybe some of that is less important these days, because the platform is largely running in emulation, or people are packing their machines with a lot of RAM because it's basically free. My IIGS (well, one of them, lol) has 8MB of RAM I added maybe 15 years ago, and I have a pile of 256K-1MB RAM cards. Either way, code size is maybe not as big of a deal these days, because 512K of ROM is a lot easier than 64K. Nor, in some ways, is losing a multiple of the possible speed, because no one is seriously trying to use these computers day to day. Back in the late 1980s I wrote a text editor in BASIC (for editing assembly code for my assembler) and then spent a lot of time optimizing pieces of it just to make it usable while still being able to edit assembly programs that were a few tens of K of source. In the end the line input ended up being entirely assembly, because simple things like blinking the cursor simply took too much time in Applesoft.

Thinking back, adding a linker might have been a better use of my time, but 64K at 1 MHz put a limit on how big the resulting code could be.


There's nothing wrong with self-modifying code as a compiler-implemented optimization technique. At the end of the day it's just a more powerful variant of goto; you're directly editing the continuation of a running program.
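Portable C can't patch its own instructions, but the continuation-editing idea can be hedged into a function-pointer sketch (my own example, not from the thread): instead of testing a flag on every step, the code path is rewritten once, much as 6502 code might overwrite the target of a JMP or the operand of an LDA in place.

```c
#include <assert.h>

/* "Patching" this pointer plays the role of overwriting a
   JMP target in 6502 self-modifying code. */
static int counter;
static void (*step)(void);  /* the editable continuation */

static void done(void) { /* no per-call flag test needed anymore */ }

static void counting(void) {
    if (++counter == 3)
        step = done;        /* edit the continuation once */
}
```

After `step = counting;`, repeated `step()` calls run the counting path until it retires itself; subsequent calls go straight to `done` with no branch on a mode flag.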





