Maybe a bit cynical, but given their churn in the lower-level specifications (supervisor mode and up), my guess is:
Main goal: write lots of theses and papers and obtain degrees.
Secondary goal: have an unencumbered ISA to help with the main goal.
Everything else depends on those who adopt the ISA. If some org is willing to spend the multi-million $$$ necessary to design a high-performance desktop class CPU (and more to make it a reality), it will happen.
Those low-end chips are easier to do, and an easier market to serve: The ISA's license fees are a bigger chunk of the chip's price, so RISC-V dropping that to 0 helps more. You also don't need to build a high-end chip that blasts everything else out of the water before your newcomer is even considered, since in the microcontroller market there are several pockets of "good enough".
If that works out, there might be attempts to scale up later (when they have experience and financial resources to work with).
> Everything else depends on those who adopt the ISA. If some org is willing to spend the multi-million $$$ necessary to design a high-performance desktop class CPU (and more to make it a reality), it will happen.
It's a good sign that NVIDIA and Qualcomm are both integrating RISC-V for embedded controllers on their chipsets. This means that there will be experience and maybe R&D going into RISC-V cores at these two massive vendors of embedded and mobile CPUs. It is not inconceivable that RISC-V could... cross-contaminate to their high-performance application processor business.
Vendors benefit from standardization when it comes to the ISA, especially in the commodity/interchangeable market. Qualcomm is currently competing with other vendors to run the same software (Android or Windows). If they make ISA extensions which are not likely to be used by that software, then they're probably wasting their time, or those extensions were not that useful for the general community to begin with.
Useful ISA extensions are likely to remain royalty free at least, if not standardized at the level of the foundation; but if they don't, it's not the end of the world.
lowRISC.org is worth checking out if you're hoping for something that could run Linux. I like their concept of minion cores, as it makes them more competitive in the BeagleBone space than the Raspberry Pi space. The ability to do real-time-ish IO via offloaded processors is a win. And it doesn't need to end up in a $5 board to succeed.
Something I was thinking about the other day: what happens when we hit mass-market process limits? Intel has always stayed ahead of its competitors outside of x86 land, even when its architectures were not competitive, by having the best process. But when the rest of the world catches up and I can suddenly order a 20bn-transistor 3nm (or whatever) chip with RISC-V cores, things start to look a lot different.
Throw in that we already have a mainstream (if you squint) desktop OS that is a compile away (and has driver support for a lot of hardware), and if I were Intel I'd be worried in the mid-term.
Hell, for the first time since the first AMD64s came out, Chipzilla is looking vulnerable from the AMD direction, a surprise to a lot of people (including myself: I bought a Ryzen 1700 for work back in late May and I've been astounded by the performance per £ on my workloads).
Intel has a lot of threats lately. Things seem to be moving towards a small-die, MCM strategy to get better yields and move investment towards the package and FSB. AMD made a huge investment in that direction with the Zen arch and Infinity Fabric: Threadripper's big package and high core count has opened a new tier of enthusiast desktops, and they may have plans for furthering that on the GPU side of things (Vega is positioned for efficiency at lower wattage, which is poorly reflected in the "hot-and-loud" flagship releases. It's expected to pair well with Zen in the forthcoming Raven Ridge APUs, though.)
Intel shares with Nvidia (also currently using large dies) a limited form of performance leadership at the high end that may be eroded in the face of these shifts. Most of the market is going to see more benefit from more cores instead of faster cores with the current conditions. That might be less true once we're talking about a baseline of 16 or 32 cores, but 2 is proving to be too few for current workloads.
And that's just their nearest competitor. ARM SOCs have consumed the mobile market and are creeping upwards. My ARM Chromebook makes a pretty good Linux desktop too, and you can practically trip over RaspPi devices.
x86 still has a lot of room left for microarchitectural optimisation, precisely because it's a CISC --- complex instructions can be decoded and scheduled internally in different ways, microcode itself can be optimised, and complex operations (like AES or SHA) can ultimately be dispatched to dedicated hardware. The cycle count of instructions can be reduced in this manner. Besides the aforementioned AES and SHA extensions, other examples of x86 microarchitectural optimisations include REP MOVS/STOS, DIV, and MUL.
In contrast, the simplicity of RISC means such optimisations are either not possible or much harder. If by definition all instructions already run in one clock cycle on one execution unit, then the only choices are to add more execution units (something the CISCs can also do with the same amount of difficulty) or to recognise sequences of instructions and combine them for execution on dedicated hardware. The key difficulty is that recognising and combining instruction sequences is much harder than decoding a single instruction into a sequence of uops or dispatching it to hardware internally.
Mul/Div are part of the M extension, which any non-trivial embedded implementation has. The chip in this link, for instance, supports that extension. Likewise, AES and SHA could be supported in a crypto extension.
REP MOVS/STOS is not optimized at all. Intel's Architectures Optimization Reference Manual says that SIMD instructions are faster. The only reasons to use REP MOVS/STOS are to deal with unaligned parts of the data, and to avoid the startup cost of checking via CPUID whether SIMD is available when you have only a few bytes. Further, the Intel manual tells you not to use the other string instructions at all.
> if by definition all instructions already run in one clock cycle
Then you are using a different definition of RISC than RISC-V does.
> the key difficulty here being that recognising and combining instruction sequences is much harder than decoding a single instruction to a sequence of uops or dispatching it to hardware internally.
Why is that supposed to be more difficult? Matching combinable instructions is trivial.
That might've been true for a period some time ago, but before and after that it has been the preferred way to do memcpy() or memset() since it can operate on entire cachelines at a time. It's one of the fastest, if not the fastest, and also the absolute smallest (which helps with icache consumption), while leaving the SIMD registers free to do more... useful things than shuffling data around.
> Why is that supposed to be more difficult? Matching combinable instructions is trivial.
How is it "trivial" to match a round of AES or SHA, or a multiplication or division algorithm, or even a memory copy/store loop, so they can be replaced with the optimal hardware implementation...?
> The ISA's license fees are a bigger chunk of the chip's price
I don’t know much of CPU IP licensing, but I would think one pays more for the design of the CPU than for the ISA.
And that’s probably for the better, because if one pays for the ISA but the RISC-V ISA is free, how are ‘they’ going to get “financial resources to work with”?
> I don’t know much of CPU IP licensing, but I would think one pays more for the design of the CPU than for the ISA.
You have a couple of options:
- Use a ready made CPU design (eg. Cortex-M) and pay license $$ for each chip sold.
- Use an existing ISA (eg. ARMv4), build your own CPU around it and pay license $ for each chip sold.
- Use RISC-V, build your own CPU around it and pocket those $ you'd otherwise pay for the ISA license (or reduce your price to become more competitive, or something in between)
(And then there's #4: design your own ISA.
Now you also have to deal with standard software: compilers, kernels, some libraries. It still may make sense in some cases, but why go through the trouble if all you want is to save some pennies per chip on the ISA?)
In all but the first case, your CPU design is a one-time cost that becomes marginal over the (hopefully many) chips you sell. Especially microcontrollers can be _very_ high volume where this makes sense (more so with the IoT hype that's going on).
Which is why I think it's a wise idea for them to aim for that market first and build up experience and a war chest.