
I will recount one mystery that I personally wondered about for decades, and that coincidentally the author of this article might have shed most light on:

The 386 was the first x86 processor to introduce four "Control Registers", CR0-CR3. The first one, CR0, was technically already introduced with the 286, but was then still named "Machine Status Word" and manipulated with its own set of different, explicit instructions (which still survive to this day).

Now, if you look at Intel's documentation of the 386, or any later CPU in the line--including today's Intel CPUs in laptops, desktops, and servers--you will see that the second of those registers, CR1, is entirely "reserved". Not a single bit in it can be accessed, neither for reading nor for writing.

This is bizarre not only because CR1 was introduced by the 386 along with CR2 and CR3 (which are defined and common), but also because the successor, the 486, introduced a new Control Register, CR4, instead of starting to use reserved bits in CR1. This is despite CR4 sharing its characteristics with CR0 (and unlike e.g. CR2): It's mostly a bit field for global processor state. So while you could have assumed that CR1 existed as planned "overflow" to add new control bits to once CR0 became full, the seemingly simple addition of CR4 in the immediate successor goes against that theory.

Decades ago, I even wrote to the 386's chief architect to try to settle that question... he must not have understood my question, because he just replied that according to Intel documentation, that register is "reserved".

But incidentally, a while ago I talked about that mystery with the author of this article, and received the most plausible theory so far: It turns out that an early pre-release document shows that the 386 was originally planned to contain an on-chip cache. Evidently, that part of the plan must have been scrapped, because the 386 shipped without an on-chip cache, which was only added with the 486 (maybe they ran into problems with implementing a cache controller, or maybe at the time so much on-chip SRAM would have made for a prohibitive price point).

It is therefore not unlikely that CR1 was once meant to, and in prototypes maybe even did, control the cache. Once that cache was removed, CR1 was just made "reserved", and not repurposed in the 486 and later CPUs out of an abundance of caution for compatibility: Accessing CR1 reliably causes an Undefined Opcode trap, and maybe some important software at the time relied on that in a bad way.




Now I wonder if that register actually exists inside the modern Intel processor silicon (and what it's actually plugged into?)


Yeah, I have wondered the same about the register in the original 386, it's part of the fun of this mystery. Unfortunately, while simpler CPUs like the 6502 and even the 8086 itself have been reverse engineered to a great extent (the 6502 pretty much fully, the 8086 at least very extensively), briefly talking to members of the community that achieved those feats (some of whom are also active here by the way, like Ken Shirriff) made it apparent that, at least for now, reverse engineering a 386 is still out of reach.

I would guess that reality is pretty boring and that MOVing from and to the register is more or less hardcoded to the Undefined Opcode trap. For anything modern, I'm certain beyond reasonable doubt that that's the case.

But for the 386 we cannot be sure, and if the cache control theory for example is correct, I wouldn't be surprised at all if there are remnants of the former design.

The best outcome would of course be that the register exists in the 386, does something amazing, and just needs to be enabled somehow. Chances for that are rather slim.

In the meantime I have an eBay search for 80386 engineering samples saved. But not even the searches for specific serial numbers of known-buggy, very early 386 CPUs (which might be samples as well), which were prevalent enough that some publications warned against them[1], have turned up anything at all in years.

[1] https://www.pcjs.org/blog/2015/02/23/


Yeah, reverse-engineering a 386 is way beyond what I can do. The 6502 has about 3500 transistors, the 8086 has about 29,000, and the 80386 has about 275,000 transistors. So it's almost an order of magnitude of complexity each time. Not to mention that the 80386 has two metal layers, which makes it a whole lot harder to see what's going on.
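As a quick sanity check of the "almost an order of magnitude each time" remark, here is a small sketch that just divides the approximate transistor counts quoted above. The dictionary and variable names are made up for illustration; only the three counts come from the comment.

```python
# Approximate transistor counts as quoted in the comment above.
transistors = {"6502": 3_500, "8086": 29_000, "80386": 275_000}

chips = list(transistors)
# Growth factor from each chip to the next one in the list.
ratios = [
    transistors[later] / transistors[earlier]
    for earlier, later in zip(chips, chips[1:])
]

for (earlier, later), ratio in zip(zip(chips, chips[1:]), ratios):
    print(f"{earlier} -> {later}: about {ratio:.1f}x")
```

Both steps come out between 8x and 10x, so "almost an order of magnitude" per step holds for these figures.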


386 may be a bit too far, but there was a high resolution shot of an 80286 on the visual6502.org wiki (now sadly offline), where you could almost read the microcode ROM visually.

I spent a lot of time staring at that and got at least a good part of the opcode pattern PLA. Many bits were so blurry that I could only guess what they are, based on what other opcodes I already had. The actual microcode is 53760 bits with a very high density, where the difference between a 0/1 comes down to a fraction of a pixel. I think with a better image, it might be possible to read those bits and even automate it.

Not sure if there are any mysteries left to solve though. In the PLA there are patterns with "don't care" bits to match opcodes 64-67 and 0F 07 to 0F FF, so all of these must be invalid (they go to the same address in microcode).
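For readers unfamiliar with "don't care" patterns, here is a minimal sketch of how such a match works, assuming a simple mask/value model. The `Pattern` type, the bit strings, and the split of the 0F 07 to 0F FF range into six terms are my own illustration and not the actual 80286 PLA contents.

```python
from typing import NamedTuple


class Pattern(NamedTuple):
    mask: int   # 1 bits are compared, 0 bits are "don't care"
    value: int  # required values of the compared bits

    def matches(self, byte: int) -> bool:
        return (byte & self.mask) == self.value


def from_string(bits: str) -> Pattern:
    """Build a Pattern from a string like '011001xx' (x = don't care)."""
    mask = value = 0
    for ch in bits:
        mask = (mask << 1) | (ch != "x")
        value = (value << 1) | (ch == "1")
    return Pattern(mask, value)


# One term covers opcodes 64-67: the low two bits are ignored.
OPCODES_64_67 = from_string("011001xx")

# Hypothetical decomposition of second bytes 07..FF (after a 0F prefix)
# into terms with don't-care bits; the real PLA may group them differently.
SECOND_BYTE_07_FF = [
    from_string(s)
    for s in ("00000111", "00001xxx", "0001xxxx",
              "001xxxxx", "01xxxxxx", "1xxxxxxx")
]


def is_invalid_second_byte(byte: int) -> bool:
    return any(p.matches(byte) for p in SECOND_BYTE_07_FF)
```

The point is that a handful of terms can send a whole opcode range to the same microcode address, which is why a shared target suggests all of those opcodes are treated identically.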

Some time ago I found out what the last undocumented opcode 0F 04 - and the F1 prefix - do by experimenting[1]. One slightly odd thing about that instruction is that the CPU seems to grant bus requests for DMA/refresh, but then completely ignores them and writes at the same time.

On the first machine I tested this on, this caused some crashes and spurious I/O that I didn't notice at first, but on another one it absolutely did not work unless DRAM refresh was disabled. Maybe there is something in the microcode that explains this, but more likely it has to do with other parts of the hardware?

[1] https://www.vcfed.org/forum/forum/technical-support/vintage-...


Thanks for the confirmation! Maybe instead some day the 80386 will be considered so old and historic that Intel makes some design documents available, but that practice does not seem terribly common...


The 286 was an RTL model, hand-converted module by module to a transistor/gate-level schematic. AFAIR the 386 was the first Intel CPU where they fully used synthesis (work out of UC Berkeley https://vcresearch.berkeley.edu/faculty/alberto-sangiovanni-...) instead of manual routing. Everything went through logic optimizers (multi-level logic synthesis) and will most likely be unrecognizable.


That’s indeed interesting. Assuming detailed pictures, it sounds like finding the original purpose might only be feasible if there are any very significant remnants of the register's purpose… (And then it seems much harder, but not impossible, a bit akin to reading very optimized compiler-generated machine code).


I found a paper on this, 'Coping with the Complexity of Microprocessor Design at Intel – A CAD History' https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.22... There was another one dedicated to the 386, but I don't remember the title :(

On the other hand, there are insane people able to RE the PlayStation CPU/GTE back from a standard cell array chip http://www.psxdev.net/forum/viewtopic.php?t=551&start=60



