
x86 is the worst ISA. If you want to play with assembler without feeling a desire to stab yourself and end it all, I recommend ARM.

Or go learn Z80, x86's weird 8-bit cousin (it had a 16-bit version, but it sold poorly), which had a greater emphasis on backwards compatibility (you can run code from the original 8080 on a Z80, unchanged), and is nicer to work with, because it wasn't extended in unanticipated directions far beyond its original capabilities while keeping fetishistic backwards compatibility by stacking hack on top of hack on top of hack. (It also didn't have memory segmentation, otherwise known as The Worst Thing.)

There are only two common reasons to learn Z80 assembler, though: to program the Game Boy (which runs on a modified Z80 with all the cool instructions removed), and to program a TI calculator, thus making all the high schoolers in your area immensely happy.

TI calculators are a comically overpriced scam that has only survived because of the College Board, but that's another story.



This is about learning to read, not learning to write.

If you're ever in a situation where you need to read assembly, it's typically not up to you what ISA it'll be in. There are two situations where I've had to read assembly: either reverse-engineering a compiled binary or trying to understand the compiler output for a small piece of a program I'm working on in a higher-level language.


Oh. Well then, yeah.

Besides, you don't have to worry about the worst of x86 in those cases.

I'm pretty sure everyone has to learn at least that much x86 at some point, like it or not.


I'm not sure everyone has to learn even that much. I've managed to avoid it until very recently, and I'm more willing to dive into underlying code than most people I know (I've contributed to a FreeBSD kernel patch in TCP and an OpenSSL key exchange interoperability fix).


I haven't jumped into the kernel yet.

But I'm only 15. I still have time.


This is not a piss contest.


Never said it was.

Actually, it was more along the lines of that being a thing I should do at some point.


I don't think that's true. I work at a major software company and certainly not all of my coworkers are interested enough in low-level stuff to ever need to read assembly.

It's a good skill to have, and since not everyone has it you will often be able to solve problems that no one else can.


My use case is inspecting a backtrace from a core dump.


> x86 is the worst ISA.

x86 is clearly not a beautiful ISA, but it is not as black as it is painted. The first thing that one should understand is that large parts of the encoding of the instructions make a lot more sense once one writes them down in octal instead of hexadecimal (something even the Intel people who wrote the reference manual seem to have missed):

> http://www.dabo.de/ccc99/www.camp.ccc.de/radio/help.txt
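To see what the octal view buys you, here's a quick Python sketch (my own illustration, not from the linked text) decoding the ModR/M byte of `mov eax, ebx` (hex 89 D8). The byte splits into 2+3+3 bit fields, so in octal the fields fall directly onto the digits.

```python
# ModR/M byte of "mov eax, ebx" (89 D8): fields are mod (2 bits),
# reg (3 bits), r/m (3 bits) -- one octal digit each.
modrm = 0xD8

mod = (modrm >> 6) & 0b11   # addressing mode: 3 = register-direct
reg = (modrm >> 3) & 0b111  # reg field: 3 = ebx (source here)
rm  = modrm & 0b111         # r/m field: 0 = eax (destination here)

print(oct(modrm))      # '0o330' -- read off mod=3, reg=3, rm=0
print((mod, reg, rm))  # (3, 3, 0)
```

In hex, 0xD8 tells you nothing at a glance; in octal, the register numbers are right there in the digits.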


...that really doesn't help with the issues I have: the instruction set is an absolutely ugly jumble of almost 40 years of religious backwards compatibility, odd hacks, and various extensions.


> an absolutely ugly jumble of [...] religious backwards compatibility, odd hacks, and various extensions

Like POSIX, MS-Windows, X11, WinAPI, C++, HTML5, ... ;-)


Hey, I didn't say it was the only one.

Although POSIX and HTML5 hold up a bit better than Win32 and C++, IMHO. Especially POSIX (it's not great, but it works pretty well).

But I've heard X11 is absolutely miserable, so the Windows folks don't have a monopoly on satanically evil APIs with religious backwards compatibility.


It helps to have actual real life experience under your belt when making such claims. You seem to be parroting what countless rants have already repeated without much content.


That stung.

But fair enough.

Maybe I do need to do more research. Maybe I need to try new things.

I like to think I don't senselessly parrot, but that doesn't mean that I'm right.


The GP didn't have much content either, merely listing off other obsessively backwards compatible things. Your reply might be suitable in a formal debate setting (as would "fallacy!" claims be suitable when challenging faulty deductive logic) but this isn't a debate, it's a conversation. The source of one's claims doesn't matter, you only know they're probably not from experience because the Parent was kind enough to share their age. The claims are used to drive the conversation and establish a shared context for further conversation (or at least ranting about the shape of our industry), not to debate.

Also reading the hard-won experience of others is much more efficient than trying to get it yourself. Books are wonderful. With enough books you can advance beyond the authors without having to tread the same paths, you parrot their findings as a base for your own. Or another case, you can at least be aware of common pitfalls your senior coworkers are constantly falling into because of an aversion to reading. Why do you know the pitfall is there? You're just parroting back what a book said. That doesn't make it untrue, or not useful to know, or not useful to share with other people, or even not useful to bring up to show there's a shared context.


> The source of one's claims doesn't matter, you only know they're probably not from experience because the Parent was kind enough to share their age.

I coded my first assembler program (a link relocation routine) when I was 13, so age is pretty meaningless in the context of assembler coding.


Granted, I'm fairly weak in assembler (I've been learning), so I may not be the most reliable resource: GP has a point about my lack of experience, just not for the reason they think.


How many times must common knowledge be supported by evidence? Every time? This has been beaten to death.


Probably because not breaking programs registers a lot higher on most people's priority list than making it easier to write -- especially when we're talking about assembly, which hardly anyone writes in the first place.


True enough. And this is the real reason x86 won.


Everything you wrote is actually correct, so I'm really not sure why you're being voted down, but it's a behavioral pattern I've noticed on HN in general: write anything that's not high praise or in any way disagrees with what's popular and expect to be brutally censored. It really has me contemplating ditching HN altogether, if all we're ever going to do here is stroke each other's egos and pander to popular trends. And here I thought the point of HN was stimulating discussion.


It's not the worst. There are far worse places. And this sort of problem appears in all forums, but especially voting-based systems like HN (the "internet points" problem).

As forums go, HN is far from the worst. But it could be better.


In my day I wrote a fair amount of x86 assembly language. I found it fairly easy and straightforward. There are probably even fewer reasons to learn it these days, but nothing gives you a better idea of how a computer works (arguably, besides microcode, but that is a different kettle of fish).


The simple stuff's alright, but if you want to do anything performant, it gets hairy pretty fast, as you wade through almost 40 years of expansions and ugly hacks.


Could you elaborate in which situation you faced this issue?


Trying to learn it, at least beyond the basics.

Maybe I'm crazy, and I'm definitely inexperienced. So maybe take my opinion with a grain of salt?

But yes, so far as I've seen, trying to understand x86 in its present state is a painful experience.


> It also didn't have memory segmentation, otherwise known as The Worst Thing.

The examples in the post are clearly x86-64 running in 64-bit mode, i.e. it's running in the flat model... there is no segmentation to worry about.


The segment registers fs and gs still exist in x86-64 and are used.

Under Windows gs (under x86-32 fs) points to the Thread Information Block (TIB):

> https://en.wikipedia.org/wiki/Win32_Thread_Information_Block

Under Linux gs is used for thread-local storage (TLS).

But "as a typical programmer" you indeed only have to worry about the internal details of this if you are an OS developer; otherwise you can simply use the appropriate segment override prefix and not care about it.


Google's Native Client uses segmentation to enforce sandboxing.

http://static.googleusercontent.com/media/research.google.co...


Oh. Handy. I didn't know that.

Stupid outdated resources...


I never felt more like stabbing myself than when trying to cipher out exactly which immediate values are possible on ARM, and which are not. X86 I happen to enjoy. It is not "the worst ISA" by any means. It has wonderful code density, which turns out to be very important. There's a reason that x86 won and continues to win.


> I never felt more like stabbing myself than when trying to cipher out exactly which immediate values are possible on ARM, and which are not.

It depends on whether you're compiling for ARM or Thumb. The main rules are:

ARM and 32-bit Thumb instructions: 12 bits for arithmetic data processing instructions and load/store, 8 bits with even rotate right for bitwise data processing instructions + MOV and MVN

16-bit Thumb: 3 bits for arithmetic instructions with Rd != Rn, 5 bits for shifts (so the whole range is covered) and load/store (but shifted left by the size of the data, i.e. the maximum offset for this version of LDR is (0x1F << 2) = 124), 8 bits for arithmetic instructions with Rd == Rn, SP-relative loads/stores and literal loads.
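The classic A32 rule in particular is easy to check mechanically. Here's a small Python sketch (my own, implementing the 8-bit-value-with-even-rotate rule described above):

```python
def rotr32(x, n):
    """Rotate a 32-bit value right by n bits."""
    n &= 31
    return ((x >> n) | (x << (32 - n))) & 0xFFFFFFFF

def arm_encodable(value):
    """True if value fits a classic A32 data-processing immediate:
    an 8-bit constant rotated right by an even amount (0..30)."""
    value &= 0xFFFFFFFF
    return any(rotr32(byte, rot) == value
               for rot in range(0, 32, 2)
               for byte in range(256))

print(arm_encodable(0xFF))        # True: fits in 8 bits directly
print(arm_encodable(0xFF000000))  # True: 0xFF rotated right by 8
print(arm_encodable(0x101))       # False: set bits span 9 positions
```

Brute force over all 16 rotations x 256 byte values is only 4096 checks, so this is also a perfectly practical way for an assembler to validate an immediate.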

I doubt x86's popularity has much to do with its instruction set at this point. In particular, the variable length instructions are a pain for decoders (both hardware and software).


When originally invented, the x86 instruction set was efficient - the most-used instructions had shorter bytecode sequences. But eventually some instructions got 'left behind' by the compilers. There are a whole host of single-byte instructions that are never, ever used by a compiler - the register exchange instructions for instance (xchg eax, ebx). Compilers just schedule destination registers carefully, never need to swap them around.

Also, the whole set of exchange-register-with-itself instructions was defined but never used. E.g. xchg ax,ax, which does nothing, in one byte. In fact that one was considered useful: it's used as the 'no-op' instruction (0x90), right? But what about xchg bx,bx, xchg cx,cx and so on? Just wasted single-byte opcodes, leaving actual common instructions to use longer bytecode sequences.

So maybe an executable should begin with an opcode-decode table that is loaded with the code, that tells the hardware what byte sequences mean what instructions. So each executable can be essentially compressed, using optimum coding for exactly the instructions that code uses most often. Just thinking out loud.
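A toy version of that idea in Python (entirely hypothetical, just to make the thought concrete): ship a per-executable table mapping the program's most frequent instructions to the shortest codes, and decode through it.

```python
# Hypothetical per-executable decode table: this particular binary
# uses 'mov' most often, so it claims the shortest code for it.
table = {0x0: "mov", 0x1: "add", 0x2: "jmp"}

code = bytes([0x0, 0x1, 0x0, 0x2])  # this executable's instruction stream
decoded = [table[b] for b in code]  # the lookup hardware would perform

print(decoded)  # ['mov', 'add', 'mov', 'jmp']
```

Real hardware would need the table resident in the decoder and a fixed escape for instructions not in the table, but the compression principle is the same as Huffman-coding the opcode space per program.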


> E.g. xchg ax,ax which does nothing in one byte.

Luckily in x86-16

  xchg ax,ax
is simply encoded as 0x90, which is the same as nop (the same holds for

  xchg eax, eax
in x86-32).

> But what about xchg bx,bx, xchg cx,cx and so on?

This (cleverly?) cannot be encoded in one byte. Here you have to use at least two bytes (0x87 followed by the ModR/M byte in x86-16; the same holds for their 32 bit counterparts xchg ebx, ebx; xchg ecx, ecx etc. in x86-32):

> http://x86.renejeschke.de/html/file_module_x86_id_328.html
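A short Python sketch (my own, based on the 0x90+r and 0x87 /r encodings described above) makes the asymmetry visible: the one-byte form always exchanges with eax, so only exchanges involving eax get the short encoding.

```python
# Register numbers: eax=0, ecx=1, edx=2, ebx=3, esp=4, ebp=5, esi=6, edi=7
def xchg_with_eax(r):
    """One-byte form: opcode 0x90 + r encodes 'xchg eax, r32'."""
    return bytes([0x90 + r])

def xchg_rr(dst, src):
    """General form: 0x87 plus a ModR/M byte (mod=11 for registers)."""
    return bytes([0x87, 0b11000000 | (src << 3) | dst])

print(xchg_with_eax(0).hex())  # '90': xchg eax, eax, i.e. nop
print(xchg_with_eax(3).hex())  # '93': xchg eax, ebx, still one byte
print(xchg_rr(3, 3).hex())     # '87db': xchg ebx, ebx needs two bytes
```

So the "wasted" single-byte slots are really the eight xchg-with-eax encodings, only one of which (0x90) does nothing.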

---

ADDITION:

> So maybe an executable should begin with an opcode-decode-table that is loaded with the code, that tells the hardware what byte sequences mean what instructions.

The engineers of the Transmeta Crusoe/Efficion processor tried something similar:

> https://en.wikipedia.org/w/index.php?title=Transmeta_Crusoe&...

"Crusoe was notable for its method of achieving x86 compatibility. Instead of the instruction set architecture being implemented in hardware, or translated by specialized hardware, the Crusoe runs a software abstraction layer, or a virtual machine, known as the Code Morphing Software (CMS). The CMS translates machine code instructions received from programs into native instructions for the microprocessor. In this way, the Crusoe can emulate other instruction set architectures (ISAs).

This is used to allow the microprocessors to emulate the Intel x86 instruction set. In theory, it is possible for the CMS to be modified to emulate other ISAs."


It is still efficient for general-purpose code compared to other ISAs; the majority of register-register and register-memory ops are 2-3 bytes, while for something like MIPS no instruction is ever shorter than 4 bytes.

> Compilers just schedule destination registers carefully, never need to swap them around.

Actually, it can happen with certain instructions that need fixed register constraints (multiply, divide, string ops) --- I've encountered a few cases where, had the compiler known about the exchanges, it could've avoided using another register or spilling to memory. As far as I know, in modern x86 cores the reg-reg exchanges are handled in the register renamer, so they aren't slower than using an extra register and definitely faster than spilling to memory (which might happen anyway for something else if it needed the extra register.)

To wit, here is something no compiler I know of can generate, even when given code that could generate it:

    theloop:
        xchg eax, edx
        add eax, edx
        loop theloop
Never say never ;-)


Been done (somewhat). The PERQ (https://en.wikipedia.org/wiki/PERQ) had a writable instruction set. And I think Transmeta was trying to do something similar with translating x86 code to an internal format for execution.


That would be cool, but I would not want to hand-code assembly on such a platform. That makes memory segmentation look fun.


The table could be created from your assembly automatically? I would never want to code the hex directly in any case.


That could work...

I thought it couldn't a second ago. I don't know why. Makes no sense to me now.


Are you using the past tense because you are going to tell us about another contemporary ISA that has more code per byte? X86 _is_ compact and this continues to be relevant to performance today.


Well, yes. That part sucks. But on x86, everything is like that. I'd rather have one weird, inconsistent thing than an amorphous, ever-shifting mass of them. Which is x86 in a nutshell.

Here's a handy heuristic: if somebody claims to know every x86-64 instruction (or even every x86-32 instruction), you can be at least 90% sure they're lying.


> Here's a handy heuristic: if somebody claims to know every x86-64 instruction (or even every x86-32 instruction), you can be at least 90% sure they're lying.

I very much prefer ARM myself, but you can probably apply the same rule of thumb to it. There are upwards of 400 instructions in both AArch32 and AArch64, with a fair number of differences between the two.

Edit: I've also posted a breakdown of the immediate limitations for ARM in this thread. It's not that complicated when sticking to the standard instructions.


...But how many of those instructions are just variants of other instructions, but with a different addressing mode?


Fair enough, some instructions have many variants. But I work with ARM assembly almost daily and I still wouldn't remember that, for example, 'VQRDMULH' is a real instruction and it stands for 'Vector Saturating Rounding Doubling Multiply Returning High Half'.


Ah, yes. A fun one. Those exist in x86, too.

I should probably count up x86 against ARM, so I have more than guesses to go on here. Maybe that part of x86 is actually better.


> I should probably count up x86 against ARM

It is not easy to tell how many instructions there actually are on x86:

> https://fgiesen.wordpress.com/2016/08/25/how-many-x86-instru...


> There's a reason that x86 won and continues to win.

It is definitely, definitely not the ISA.


Ok.

It is though.


> There are only two common reasons to learn Z80 assembler, though: to program the Gameboy [...] and to program a TI calculator

What about the myriad of other (mostly vintage) computer systems and video game consoles out there? ;)

Sega Master System and Game Gear, for example.


The Game Boy doesn't quite use a Z80 --- like the Z80 it's based on the 8080, but in a different way. So you don't get things like the IX and IY or the alternate register banks, but you do get things like (a very crude set of) stack-relative addressing modes, which makes it a better fit for modern programming languages than the Z80.

http://gbdev.gg8.se/wiki/articles/CPU_Comparision_with_Z80

As an aside: most of the old, classic 8-bit micros are complete pains to write modern code for, because modern programming languages all assume fast stacks. The Z80 has no stack-relative addressing, which means you need to reserve a precious index register as a frame pointer at the top of every function, and then indirect off that --- but the Z80 designers didn't realise that people would want to do it so often, and as a result it's verbose, dead slow, and doesn't handle 16-bit values. So you need to do:

    ld l, [iy+8]
    ld h, [iy+9]
...for a total of 6 bytes of code and lots of cycles.

The Game Boy processor (which doesn't have a snappy name) allows this:

    ld hl, sp+8
    ldi a, [hl] // load low byte and increment
    ld h, [hl]  // load high byte
    ld l, a
...which is (IIRC) five bytes. Still not great, but shorter, and also loads faster.

If you look at the instruction encodings, the Z80's actually a pile of nasty hacks. The original 8080 is way more elegant; and there's lots of software and tooling for it, too. (But it still can't run C efficiently.)


As I understood it, the GB is based on the Z80, not the 8080: That's why its nickname is the GBZ80.

>If you look at the instruction encodings, the Z80's actually a pile of nasty hacks. The original 8080 is way more elegant; and there's lots of software and tooling for it, too. (But it still can't run C efficiently.)

I don't know about what makes an instruction encoding elegant or inelegant, so can't help you there.

Yes, the 8080 is probably more elegant, but the extra features on the Z80 are incredibly useful (especially register exchange: The Z80 had two sets of registers, which you can exchange. No, Zachtronics didn't make that up: that was a real thing, on the Z80 at least). Also, the Z80 tooling is quite nice: asxxxx and WLA-DX are fine assemblers, and SDCC is a pretty good C compiler. It sure as heck beats cc65, in any case.


They're not as common reasons.

But yes, if you want to program your TRS-80 (but only the original: later ones were 6502), or your ZX* (How many of you lot know the ZX line? Spectrum? No?), or your Game Gear, or your Master System, or any of the various CP/M machines, you have to learn Z80.


> How many of you lot know the ZX line? Spectrum? No?

The ZX Spectrum and its clones were very popular in the UK, Eastern Europe, and the former USSR.


They also ruled in Portugal and Spain. Almost no C64 in sight beyond a few computer magazines.


... and for a brief period of time in the late '80s, "popular" in India as well. (I quoted popular because they were bloody expensive. In 7th grade, a kid in my class had one. He was the only one in all of the school to have a computer.).


Huh. The Spectrum was popular elsewhere because it was so comically cheap.


Relatively speaking. It was definitely much cheaper than a PC Clone with an 80286 processor, but expensive enough to be a luxury, I think.


Yeah. AFAICT, most of HN (myself included) is US based. So I was curious.


Another good one for learning is the 6502. Don't forget to buy a copy of Lance A. Leventhal's "6502 Assembly Language Programming". Great book for learning not only assembly, but also the fundamentals leading up to assembly.


Amen.

I learned to program on the 6502 based VIC-20 using Lance Leventhal's "6502 Assembly Language Subroutines" as a guide by manually assembling and poking into memory. What a fun time.


Of course. Who hasn't wanted to write their own chiptunes? Grab your (emulated) Ricoh 2A03 (and your emulated Konami VRC VI, for some extra fun), and get hacking.

And thanks for the book recommendation.


> x86 is the worst ISA.

I can only assume you've never had to program a Burroughs B90 in assembly language.


I haven't. It's probably not fun.


It wasn't easy. Very asymmetric, not many registers, and tiny stack. The only things that were written directly in it were the operating system, and the virtual machines for higher level languages such as COBOL and MPL, because it was too hard to compile to. I worked on the virtual machines.

After programming in it, I can assure you that programming in other assembly languages (including x86) is a breeze.

I should put my cheat-sheet on the web, if I can still find it.


Isn't the high-level Burroughs assembly a completely insane, almost-high-level language? I looked into implementing it in my assembler and just ran in the other direction when I saw how much unnecessary complexity there was in the language.


B90 was an 8 bit machine and was built at Cumbernauld in Scotland, where I worked. The B900 had similar architecture.

There was also a B1900 built in Liège in Belgium, which was a 24 bit machine whose instruction set was designed to run virtual machines (i.e. interpreters). Those systems had a reputation for being slow. I don't know much about them.

The Liège plant closed around 1982 and the Cumbernauld plant closed around 1985.

Burroughs mainframes (B5000 onwards to A series) may be the ones you're thinking of. These are justifiably praised for being ahead of their time. They were high level stack based machines with 48 bit words + 3 tag bits, and programmed directly in an Algol 60 variant, with additional instructions to enable COBOL to execute efficiently. There was no assembly language needed.


On the other hand it allowed for ESPOL and NEWP, two memory safe systems programming languages, with unsafe memory access having to be marked as such.


> x86 is the worst ISA. If you want to play with assembler without feeling a desire to stab yourself and end it all, I recommend ARM.

Yes, Intel is really bad, especially for learning, and while ARM is certainly better, it's pretty esoteric, and also backwards (right to left) like Intel.

If you want a nice, orthogonal ISA to learn assembler on, MC68000 family is a song. The instructions are human readable, the processor is big endian, and the moves are src, dst. It's almost like a high level programming language.


I would argue for the 6502 over the 68000.

The 6502's a bit less simple to learn, but I'd say it's worth it. It worked its way into many important computers, and is arguably one of the most emulated and most used processors in existence.


I agree that the x86 ISA is pretty warty (though I have a strange fondness for it), but I'd recommend 6502 rather than Z80. There are a lot of fun retro-computer platforms that are 6502-based. Thinking of the zero-page functioning as a register-bank is really fun, too.


For 8-bits I'd recommend the 6809. Two 8-bit accumulators that can be used as a 16-bit accumulator, 4 16-bit index registers (that can largely be interchanged, except for S which is also the stack register) and you can generate pure relocatable code. And the zero-page isn't restricted to address $0000.


This sounds really interesting. I've never taken a look at the 6809 before. I happen to have a working Tandy CoCo I scored in a vintage computer haul. I'll definitely check out 6809 assembler.


I know that asxxxx will assemble for the 6809, if you need an assembler.


Yeah, but those are hard to get.


>It also didn't have memory segmentation

The Z80 only has a 16-bit address space.

My favorite CISC architecture is the 68000 series.


Yes. But you could install an MMU.

The 68k (and its 8-bit semi-cousin, the 6809) were very nice. However, unlike the Z80, the 6502, and x86, they're no longer being made, and are increasingly rare.


The 68k is still being made in the form of the 680x0-compatible line from NXP. http://www.nxp.com/products/microcontrollers-and-processors/...

They're in the "legacy" line and labelled "not recommended for new design" but I think that's just design opinion and doesn't imply it's no longer made.

I'm sure this video has done the HN rounds before. It's a slow but fascinating watch. "Motorola 68000 Oral History Panel" from original Motorola team members. https://www.youtube.com/watch?v=UaHtGf4aRLs


68k Macs are easy enough to find on eBay. Grab one of the old Powerbook 1xx series laptops and you have a self-contained development station that's less expensive than many modern microcontroller dev kits.


I'd recommend learning 68k instead. Or MIPS.

But definitely Z80 over x86. x86 is pain.


I wish 68k, but it's out of production (as is the 6809, its equally awesome 8-bit cousin).

MIPS and ARM are fun, though, AFAICT. As is 6502.


If you want something friendly and CISC-y and modern, check out the Renesas RX600 and friends. They have a nice instruction set and zero-wait-state RAM and ROM, so writing assembly by hand with predictable timing ought to be easy.

https://www.renesas.com/en-us/doc/products/mpumcu/doc/rx_fam...


Hey. Neat. Thanks for the link.


(author here) ...FYI I actually learned assembly language for the first time back in the 1970s and 80s using a 6809 inside a Radio Shack "Color Computer." It was super-fun at the time. I don't remember much of it now, but I'm sure x86 isn't as clean or fun as 6809 assembly was.


Spoiler: it wasn't. I've looked over both, and that's pretty immediately clear.



