From the RISC-V Privileged ISA Specification [https://riscv.org/specifications/p...

FullyFunctional · on Feb 15, 2019

To my knowledge, no other _widely deployed_ ISA has had this level of flexibility and foresight. In fact, it's usually the opposite: the ISA is tightly coupled to the processor implementation that introduced it, and the survival (= market success) of the immediate product is all that matters. However, if it's successful, the successor(s) are then locked into supporting all the legacy of their predecessors.

I can only think of two other examples that successfully planned for the future: the IBM 360 (extremely CISC, but still alive today) and the DEC Alpha (beautiful design, but now mostly dead).

gmueckl · on Feb 15, 2019

The Alpha disappeared because DEC went under and HP buried the CPU in favor of the Itanium, which is now also dead (mostly because of its technical shortcomings).

pjmlp · on Feb 15, 2019

Itanium is dead thanks to cross-licensing between Intel and AMD.

If the choice had been Itanium or bust, the outcome would have been very different.

tremon · on Feb 15, 2019

Not really. The Itanium VLIW architecture bet heavily on instruction-level parallelism as opposed to thread-level parallelism. In theory, the Itanium could issue and retire 3 instructions per cycle, thereby making it competitive with x86 even at modest clock speeds.

The main problem was that not many programs could sustain 3 parallel instructions in their critical path, which meant that the compiler would often generate NOPs to fill the empty instruction slots. IIRC the Itanium typically achieved around 40% of its theoretical performance on conventional workloads. The term "NOP density" was coined specifically to research this problem.
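As a rough illustration of the NOP problem described above, here is a toy sketch that packs instructions into 3-slot bundles and measures the fraction of slots filled with NOPs. It is not a model of real Itanium bundle templates or stop bits; the function names `pack_bundles` and `nop_density`, the greedy scheduling, and the example dependency graphs are all assumptions made for illustration.

```python
from typing import Dict, List, Set

BUNDLE_WIDTH = 3  # slots per bundle, matching the "3 instructions per cycle" figure above


def pack_bundles(deps: Dict[str, Set[str]], width: int = BUNDLE_WIDTH) -> List[List[str]]:
    """Greedily pack instructions into fixed-width bundles (hypothetical helper).

    deps maps each instruction name to the set of instructions it depends on.
    An instruction can only be placed once all of its dependencies sit in an
    earlier bundle. Slots that cannot be filled are padded with "nop".
    """
    done: Set[str] = set()
    pending = list(deps)  # dict order is program order
    bundles: List[List[str]] = []
    while pending:
        ready = [i for i in pending if deps[i] <= done][:width]
        if not ready:
            raise ValueError("dependency cycle or unknown dependency")
        bundles.append(ready + ["nop"] * (width - len(ready)))
        done.update(ready)
        pending = [i for i in pending if i not in ready]
    return bundles


def nop_density(bundles: List[List[str]]) -> float:
    """Fraction of all bundle slots that hold a NOP."""
    slots = sum(len(b) for b in bundles)
    nops = sum(b.count("nop") for b in bundles)
    return nops / slots


# A fully serial chain: every instruction needs the previous result.
chain = {"a": set(), "b": {"a"}, "c": {"b"}, "d": {"c"}}

# Three independent instructions: the ideal case for a 3-wide machine.
independent = {"x": set(), "y": set(), "z": set()}

chain_bundles = pack_bundles(chain)
independent_bundles = pack_bundles(independent)

print(chain_bundles, nop_density(chain_bundles))
print(independent_bundles, nop_density(independent_bundles))
```

In this toy model the serial chain needs four bundles with one useful instruction each, so two thirds of the slots are NOPs, while the independent instructions fit in a single bundle with none. Real code falls somewhere in between, which is the gap between theoretical and achieved throughput that the comment describes.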

There is another interesting observation in [1] that I hadn't realized before: even if the compiler were to successfully generate 3 instructions per cycle, the processor might then have to fetch 3 memory locations in that instruction cycle. If two of those were already in cache, the instruction would still stall on the third memory fetch. Contrast this with the implicit parallelism of hyperthreading, where the processor can continue executing a different thread when the current thread encounters a memory stall.

[1] https://softwareengineering.stackexchange.com/questions/2793...
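The contrast with hyperthreading drawn above can be sketched with a very small simulation. This is a simplified switch-on-stall model, not how real SMT hardware schedules instructions: the function name `run`, the single issue slot, and the 10-cycle miss latency are assumptions chosen only to make the effect visible.

```python
from typing import List


def run(threads: List[List[str]], miss_latency: int = 10) -> int:
    """Return the cycle count needed to finish all threads (toy model).

    Each thread is a list of "hit" or "miss" memory operations. Every cycle
    the core issues one operation from the first thread that is not waiting
    on memory. A "hit" takes 1 cycle, a "miss" takes miss_latency cycles,
    and a thread cannot issue again until its previous operation completes.
    """
    pcs = [0] * len(threads)       # next operation index per thread
    ready_at = [0] * len(threads)  # cycle at which each thread may issue again
    cycle = 0
    while any(pc < len(t) for pc, t in zip(pcs, threads)):
        for i, t in enumerate(threads):
            if pcs[i] < len(t) and ready_at[i] <= cycle:
                op = t[pcs[i]]
                pcs[i] += 1
                ready_at[i] = cycle + (miss_latency if op == "miss" else 1)
                break  # only one issue slot per cycle in this model
        cycle += 1
    return max(ready_at)


work = ["miss", "hit"]

# One thread at a time: the core idles during each miss.
serial_cycles = run([work]) + run([work])

# Two threads sharing the core: one thread's miss overlaps the other's work.
shared_cycles = run([work, work])

print(serial_cycles, shared_cycles)
```

With these assumed numbers, running the two threads back to back takes 22 cycles, while letting them share the core takes 12, because the second thread's miss is issued while the first is still waiting. A single instruction stream has no such second source of work when one of its loads misses.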

pjmlp · on Feb 15, 2019

If Intel had decided to produce only Itaniums, without an AMD around to come up with the idea of AMD64, we wouldn't have had any option other than to live with those shortcomings and eventually get improved designs.