Porting Sweet 16 (2004)

HarHarVeryFunny · 2024-02-26T13:02:12 1708952532

It's a somewhat odd instruction set, but it achieves the goal of code density by making the register instructions a single byte (opcode + 4-bit register), and through the odd pre/post-increment/decrement "load/store indirect" instructions, which do things like acc = *reg++, useful for copy loops and push/pop operations.
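To make the acc = *reg++ idea concrete, here is a rough Python sketch of a copy loop built from a post-incrementing load and store. It is a toy model, not the real interpreter: the names Machine, ld_ind, st_ind and copy are made up for illustration, and treating register 0 as the accumulator is an assumption of the sketch.

```python
# Toy model of "load/store indirect with post-increment".
# All names here are illustrative; this is not the SWEET16 ROM code.

class Machine:
    def __init__(self, size=65536):
        self.mem = bytearray(size)
        self.reg = [0] * 16  # sixteen 16-bit registers; reg[0] plays the accumulator

    def ld_ind(self, n):
        """acc = *Rn++ : load a byte, then bump the pointer register."""
        self.reg[0] = self.mem[self.reg[n]]
        self.reg[n] = (self.reg[n] + 1) & 0xFFFF

    def st_ind(self, n):
        """*Rn++ = acc : store the low byte, then bump the pointer register."""
        self.mem[self.reg[n]] = self.reg[0] & 0xFF
        self.reg[n] = (self.reg[n] + 1) & 0xFFFF


def copy(m, src, dst, count):
    """Copy `count` bytes using only the two post-increment operations."""
    m.reg[1], m.reg[2] = src, dst
    for _ in range(count):
        m.ld_ind(1)
        m.st_ind(2)


m = Machine()
m.mem[0x300:0x305] = b"HELLO"
copy(m, 0x300, 0x800, 5)
```

Each loop iteration is two one-byte instructions in the modelled encoding, since the pointer update comes for free with the load or store.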

It reminds me a bit of the Apollo Guidance Computer (AGC) which also had a specialized decrement counter and loop instruction (CCS), presumably also just motivated by code size.

Similar to having SWEET16 to augment the 6502, the AGC also used a virtual machine/interpreter to gain additional code density.

https://en.wikipedia.org/wiki/Apollo_Guidance_Computer

https://borja.medium.com/a-glimpse-into-the-apollo-guidance-...

howerj · 2024-02-26T17:10:02 1708967402

It is odd! It really likes its load/stores and you can branch on anything, but it appears to be missing any of the bitwise operators, which seems to be an odd deficiency. Perhaps that's (one of) the reasons this VM never caught on?

kragen · 2024-02-26T18:04:08 1708970648

throughout the 50s and 60s it was common for computers to not provide the bitwise operators at all; the first versions of mix (used in volume 1 of taocp) didn't have them. they take very little hardware to implement, and they're comparatively very slow if you have to implement them in software (though less so on an 8-bitter!) but most software doesn't use them at all. so it makes sense that you wouldn't include them in sweet-16; if you care about speed you'll call out from sweet-16 to 6502 code
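to give a feel for why software bitwise operations are slow, here is a rough python sketch of a 16-bit and built only from compare, add and subtract. the function name soft_and is made up, and this is just one possible way to do it, not how any particular machine's software library did it

```python
def soft_and(a, b, bits=16):
    """Bitwise AND using only comparison, addition and subtraction."""
    top = 2 ** (bits - 1)  # value of the most significant bit
    result = 0
    for _ in range(bits):
        a_bit = a >= top   # is the top bit of a set?
        b_bit = b >= top   # is the top bit of b set?
        # Shift the result left by doubling it, then add the new bit.
        result = result + result + (1 if a_bit and b_bit else 0)
        # Clear the top bit, then double to move the next bit into place.
        if a_bit:
            a -= top
        if b_bit:
            b -= top
        a = a + a
        b = b + b
    return result
```

that is one loop iteration per bit, each with several operations, versus a single instruction when the hardware provides the operator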

HarHarVeryFunny · 2024-02-26T18:57:59 1708973879

> if you care about speed you'll call out from sweet-16 to 6502 code

That's another somewhat odd omission from SWEET16 - the only way it lets you combine 6502 assembler and SWEET16 code is via this pattern:

<6502 code>

jsr SWEET16 // switch to inline SWEET16 code

rtn // SWEET16 instruction to switch back to inline 6502 code

<6502 code>

You can't do the opposite - switch to 6502 code for part of a SWEET16 function, or have 6502 code call (vs switch to) a SWEET16 subroutine (or vice versa).
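A rough Python sketch of that "switch to inline code" pattern: the interpreter starts reading bytecode at the address following the call, and the return instruction hands back the address where native code resumes. The function name sweet16 and the three-opcode subset are chosen for illustration, the opcode values are as I recall them and should be treated as assumptions, and the real 6502 detail that JSR pushes the return address minus one is ignored.

```python
# Assumed opcode values for a tiny subset (register number in the low nibble).
RTN = 0x00  # return to native code
SET = 0x10  # SET Rn, followed by a 16-bit constant (low byte first)
INR = 0xE0  # increment Rn


def sweet16(mem, return_addr, reg):
    """Interpret inline bytecode starting at return_addr.

    Returns the address just past RTN, where native code would resume.
    """
    pc = return_addr
    while True:
        op = mem[pc]
        pc += 1
        if op == RTN:
            return pc
        kind, n = op & 0xF0, op & 0x0F
        if kind == SET:
            reg[n] = mem[pc] | (mem[pc + 1] << 8)
            pc += 2
        elif kind == INR:
            reg[n] = (reg[n] + 1) & 0xFFFF
        else:
            raise ValueError(f"opcode {op:#04x} not modelled in this sketch")


# Bytes that would follow the call: SET R1,$1234 ; INR R1 ; RTN ; then native code.
mem = bytes([0x11, 0x34, 0x12, 0xE1, 0x00, 0xEA])
reg = [0] * 16
resume = sweet16(mem, 0, reg)
```

Because the only entry point consumes the caller's return address as its program counter, there is no natural way for native code to call a bytecode routine located elsewhere and get control back afterwards.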

kragen · 2024-02-26T19:44:32 1708976672

yeah

interestingly, this form of argument passing, where the arguments follow the call instruction, was the standard way to pass arguments on the pdp-8. though usually it was just, like, a fixed number of arguments, not an arbitrary-sized blob as in the sweet-16 case

it does seem like it would have been more convenient to orchestrate low-level 6502 subroutines by stringing them together with sweet-16 code than vice versa, but i guess the actual scripting language woz wrote for the apple 2 was integer basic

HarHarVeryFunny · 2024-02-26T17:42:09 1708969329

Yeah, not clear what Woz really expected it to be used for. Apparently the only thing he used it for himself was a later Apple BASIC renumber utility!

Memory and ROM space were so valuable back then that I wonder whether, in retrospect, he thought those 300 bytes (the size of the SWEET16 interpreter) were ROM space wisely used?!

kragen · 2024-02-26T18:04:29 1708970669

yes, because it saved more rom space than that

HarHarVeryFunny · 2024-02-26T18:38:46 1708972726

Well, it really didn't save any ROM space since nothing in the ROM used it! Woz has since said that he thinks that the Apple BASIC interpreter (in ROM) could have been shrunk by 1KB (5KB->4KB) without any loss of performance by selective use of SWEET 16, but that never actually happened.

https://www.apple2history.org/museum-a-l/articles/byte8501/

kragen · 2024-02-26T18:53:33 1708973613

oh, thank you for the correction!

monocasa · 2024-02-26T15:05:48 1708959948

Also, by providing "registers" the same width as a pointer in contrast to normal 6502 code, you save a lot of shuffling between registers and the zero page just to indirectly access other data.

kragen · 2024-02-26T18:01:21 1708970481

the 8086 also has a specialized decrement counter and loop instruction, but instead of calling it 'ccs' it's called 'loop'

specialized loop instructions are pretty common actually

tasty_freeze · 2024-02-26T19:11:46 1708974706

The Z80 has DJNZ (decrement and jump if not zero), but the counter register (B) was only 8 bits wide.
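A rough Python sketch of what a decrement-and-branch loop with an 8-bit counter does. The function name is made up for illustration; the point is the wraparound, where a starting count of 0 runs the body 256 times.

```python
def loop_iterations(b):
    """Count body executions for a loop closed by decrement-and-branch-if-not-zero."""
    count = 0
    while True:
        count += 1          # the loop body runs first
        b = (b - 1) & 0xFF  # 8-bit decrement with wraparound
        if b == 0:          # fall through once the counter reaches zero
            break
    return count
```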

HarHarVeryFunny · 2024-02-26T18:45:05 1708973105

I think in a large CISC instruction set it's less surprising to see specialized instructions like this, but it stands out as a rather specific design choice in something like SWEET16 that has such a minimal instruction set, and doesn't even have an indirect load/store without an increment/decrement (i.e., using C syntax, it lets you do a = *r++, but NOT a = *r).

wk_end · 2024-02-26T18:51:09 1708973469

The 68000 has it too (DBRA).

ngcc_hk · 2024-03-03T15:07:40 1709478460

“ Direct Page Register

One of the things that had made the 6502 so successful, was its extensive use of ‘zero page’, or memory locations 0 to 255. With a very limited number of registers, the 6502 relied on the use of ‘zero page’ to access frequently used values. Access to these values was faster as only a single address byte was needed to specify the address. The 6809 enhanced the use of the zero page by adding a new 8-bit ‘Direct Page’ register which allowed the ‘zero page’ to be moved to anywhere in the 16-bit address space.”

From https://thechipletter.substack.com/p/motorolas-6809-the-best...
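As a rough sketch of what the quoted passage describes, the effective address of a direct-page access is just the 8-bit page register glued on top of the 8-bit address byte from the instruction. The function name direct_address is made up for illustration.

```python
def direct_address(dp, offset):
    """Effective 16-bit address: page register as high byte, instruction byte as low byte."""
    return ((dp & 0xFF) << 8) | (offset & 0xFF)
```

With the page register set to 0 this behaves like the 6502's zero page; any other value moves the same short-address window elsewhere in memory.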

rob74 · 2024-02-26T12:02:15 1708948935

"Sweet 16 is not a teen-magazine, nor is it a brand name for candy." (from the article)

"It is a song by Billy Idol, but that's not what this article is about" (my addition)

kevindamm · 2024-02-26T14:28:38 1708957718

It's funny that we don't say "covering a [program]" but "porting..." when recreating something in the spirit of another's work.

yjftsjthsd-h · 2024-02-26T16:03:07 1708963387

I'd say porting isn't recreating; that's cloning or reimplementing. But yes, now that you point it out, it's curious that there are different words in different areas of interest.

flippy_flops · 2024-02-26T14:31:29 1708957889

...It is actually the NCAA regional semifinals, but again - not the topic of this article.

joezydeco · 2024-02-26T15:13:11 1708960391

I've always been curious if there was any application that used SWEET16. From a cursory look it seems like it was planned to rewrite Integer BASIC with this VM, but time ran out and then Microsoft sold them the floating point BASIC.

RENUMBER from Programmer's Aid #1 seems to be the only mainstream thing that used it. Anyone know of anything else?

dwheeler · 2024-02-26T19:34:20 1708976060

It was used elsewhere. E.g.: https://www.apple2history.org/museum-a-l/articles/byte8501/ says:

> BYTE: Isn’t Sweet-16 still used in Apple DOS and ProDOS editor/assemblers?

> WOZNIAK: Yes, it’s used in EDASM , mostly in the editor portion. Randy Wigginton wrote EDASM. He’s worked here since before we even had a company. Lately he’s written the Macintosh word processor – MacWrite. He’s done a lot for the company. and he’s used Sweet-16 in several things he’s done.

At least some 6502 assemblers included Sweet-16 opcodes, so I suspect many assembler programs included Sweet-16 in the parts where speed wasn't as pressing (simply because it was easy to invoke).

ngcc_hk · 2024-02-26T17:19:53 1708967993

One of the enablers is that zero page is really a register page on the 6502. That helps a lot, I think.

Not to mention his out-of-the-box thinking in building a tool around the issue, trading speed for space in some scenarios.

Conceptually it is close to Docker, in that it provides an API call for some uses without a full-blown VM.

renewedrebecca · 2024-02-26T18:45:33 1708973133

It's really not a register page though. There are all sorts of operations that you can do on the real registers that you can't do with a zero page address, at least not directly.

It's just a shortcut addressing mode.

throwaway81523 · 2024-02-26T17:43:28 1708969408

I thought the zero page lived in ordinary ram. It was more efficient to access than the rest of ram, just because the instruction encodings were designed such that you could refer to 0 page addresses directly. To reach arbitrary 16-bit addresses, I believe you had to use the HL registers or similar. This style of design was common on memory starved machines of the era, including the PDP-8 which predated most of the 8-bitters.

HarHarVeryFunny · 2024-02-26T23:10:50 1708989050

Zero page is just the first "page" (256 bytes) of the address space, which would map to normal RAM. The only thing that is special about zero page is that since the address "page" (high 8 bits of 16bit address) is implied (=0), you only need to use an 8bit (one byte) address to specify any zero page address.

The 6502 instruction set takes advantage of this by having additional instructions to load/store values to zero page as well as ones to load/store to generic 16 bit addresses. An instruction accessing zero page memory will be two bytes (op-code + one byte address), while an instruction accessing a generic address will be three bytes (op-code + two-byte address). The zero page access therefore takes less code space, and will run faster (even though the memory access speed is the same) since there are only 2 vs 3 instruction bytes to fetch.
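A small Python sketch of that size difference, using LDA as the example. The opcode bytes and cycle counts are the standard ones for LDA zero page and LDA absolute; the function name encode_lda is made up for illustration.

```python
def encode_lda(addr):
    """Return (instruction bytes, cycle count) for loading A from addr."""
    if addr < 0x100:
        # LDA zero page: opcode $A5 + one address byte, 3 cycles
        return bytes([0xA5, addr]), 3
    # LDA absolute: opcode $AD + address low byte + address high byte, 4 cycles
    return bytes([0xAD, addr & 0xFF, addr >> 8]), 4


zp_bytes, zp_cycles = encode_lda(0x0042)
abs_bytes, abs_cycles = encode_lda(0x1234)
```

The zero page form saves one byte and one cycle per access, which adds up in hand-written code that touches the same variables constantly.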

Given the 10x speed overhead of the SWEET16 interpreter, it'd really not make much difference if it chose to use non-zero page locations for its 16 registers. It makes more difference in hand written code where you are optimizing for every byte of code space and cycle of instruction timing.

ngcc_hk · 2024-02-27T11:15:33 1709032533

I should have said “in practice it may be used like a register page”, as I have seen a lot of work based on this practice.

As for the “not much difference”: would that be intentional, since it is trading speed for space? Given that this page costs less space and less time to access, would it still be important?

HarHarVeryFunny · 2024-02-27T16:03:34 1709049814

For SWEET16 the choice to place its registers in zero page, or not, would only affect speed of access, not code size. The SWEET16 code would still be using one-byte register instructions in either case.

As far as speed, given that the interpreter overhead of SWEET16 means it's about 10x slower than 6502 code, adding an extra clock cycle for non-ZP access would at worst make that 11x vs 10x. Would that really make a difference to the cases where it was useful? The trade-off is that NOT using ZP for SWEET16 would give those 32 bytes of ZP (16 registers x 2 bytes) back to the developer.

However, it seems that SWEET16 was not much used, and I doubt that any change like this - where to place its registers - would have made any difference to that. It's maybe interesting to consider whether a different interpreted code (with a different feature set from SWEET16) would have been any more widely used.

Yes, you can consider ZP as a page of registers, but it's not like there was any choice - seeing as the 6502 basically had no registers (just A, X, Y) you had to have your program variables in memory, and ZP was faster/smaller than the alternative!

ngcc_hk · 2024-02-28T16:58:22 1709139502

Thanks for the clarification.

kragen · 2024-02-26T18:05:23 1708970723

that is all correct except for the hl registers, which were an 8080 thing, not a 6502 thing

kragen · 2024-02-26T18:00:39 1708970439

it's probably worth noting that nowadays the standard term for what this article calls a "metaprocessor", "pseudo microprocessor", and a "non-existent 16 bit processor", is "virtual machine" (which strotmann does explain in his introductory paragraph)

ngcc_hk · 2024-02-27T11:19:45 1709032785

I'm surprised to learn (it wasn't noted over there) that the PDP-8, Z80 and 68000 all have a similar arrangement, i.e. smaller code for accessing the zero page or its equivalent. Is it widely used on those machines? It isn't talked about as much as zero page when learning retro computing.