We briefly note that the entire privileged-level design described in this document could be replaced with an entirely different privileged-level design without changing the user-level ISA, and possibly without even changing the ABI. In particular, this privileged specification was designed to run existing popular operating systems, and so embodies the conventional level-based protection model. Alternate privileged specifications could embody other more flexible protection-domain models.
Do many/most other architectures have this attribute? The idea of replacing only one "side" of an architecture has never crossed my mind before and seems pretty cool.
To my knowledge, no other _widely deployed_ ISA has had this level of flexibility and foresight. In fact it's usually the opposite: the ISA is tightly coupled to the processor implementation that introduced it, and the survival (= market success) of the immediate product is all that matters. However, if it's successful, the successor(s) are then locked into supporting all the legacy of their predecessors.
I can only think of two other examples that successfully planned for the future: the IBM 360 (extremely CISC, but still alive today) and the DEC Alpha (beautiful design, but now mostly dead).
The Alpha disappeared because DEC went under and HP buried the CPU in favor of the Itanium, which is now also dead (mostly because of its technical shortcomings).
Not really. The Itanium VLIW architecture bet heavily on instruction-level parallelism as opposed to thread-level parallelism. In theory, the Itanium could issue and retire 3 instructions per cycle, thereby making it competitive with x86 even at modest clock speeds.
The main problem was that not many programs could sustain 3 parallel instructions in their critical path, which meant that the compiler would often generate NOPs to fill the empty instruction slots. IIRC the Itanium typically achieved around 40% of its theoretical performance on conventional workloads. The term "NOP density" was coined specifically to research this problem.
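To make the NOP-filling problem concrete, here is a rough toy sketch (not a model of the real Itanium bundle format or of any actual compiler). It greedily packs instructions into 3-slot bundles, where an instruction can only issue in a bundle after all of its dependencies, and pads the empty slots with NOPs. The names `pack_bundles` and `nop_density` and the dependency-map input format are made up for illustration.

```python
def pack_bundles(deps, width=3):
    """Greedily pack instructions into fixed-width bundles.

    deps maps each instruction name to the set of instructions it depends on.
    An instruction may only issue in a bundle that comes after the bundles
    containing all of its dependencies. Empty slots are padded with "nop".
    """
    remaining = dict(deps)
    done = set()
    bundles = []
    while remaining:
        # Instructions whose dependencies were all issued in earlier bundles.
        ready = [name for name, needs in remaining.items() if needs <= done]
        if not ready:
            raise ValueError("dependency cycle")
        issued = ready[:width]
        bundles.append(issued + ["nop"] * (width - len(issued)))
        for name in issued:
            del remaining[name]
        done.update(issued)
    return bundles


def nop_density(bundles):
    """Fraction of all issue slots that ended up holding a NOP."""
    slots = sum(len(b) for b in bundles)
    nops = sum(b.count("nop") for b in bundles)
    return nops / slots


# A dependency chain a -> b -> c plus one independent instruction d.
deps = {"a": set(), "b": {"a"}, "c": {"b"}, "d": set()}
bundles = pack_bundles(deps)
for b in bundles:
    print(b)
print(f"NOP density: {nop_density(bundles):.2f}")
```

With a serial chain like this one, most slots go unused: the critical path dictates the number of bundles, and the extra issue width only helps when independent work is available.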
There is another interesting observation in [1] that I hadn't realized before: even if the compiler were to successfully generate 3 instructions per cycle, the processor might then have to fetch 3 memory locations in that instruction cycle. If two of those were already in cache, the instruction would still stall on the third memory fetch. Contrast this with the implicit parallelism of hyperthreading, where the processor can continue executing a different thread when the current thread encounters a memory stall.
If Intel had decided they would only produce Itaniums, without an AMD around to come up with the idea of AMD64, we wouldn't have had any option other than to live with those shortcomings and eventually get improved designs.