Hacker News
SiFive U8-Series Core IP (sifive.com)
96 points by ingve on Oct 24, 2019 | 26 comments



This is a press release. There is a blog post that goes into a bit more technical detail here: https://www.sifive.com/blog/incredibly-scalable-high-perform...


The floating point unit is optional on the U8, which is a non-standard choice for an OoO design. Normally if you've got an OoO frontend, sticking in an FPU is a drop in the ocean (or can be).

Is this an artifact of them using BOOM (which is just an assumption on my part), since BOOM makes it ridiculously easy to add or remove backend execution units?


There are an absolute ton of customization options available. In the context of the decisions that can be made, making FP optional isn't surprising.

I do have to wonder about whether they've got tools for real customers that can effectively test the tradeoffs being made, at least for performance options like cache size & associativity. Even with all the options at your disposal and an intimate knowledge of your problem domain, beating a handful of well made and broadly tested off the shelf designs won't be easy.
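To make the tradeoff concrete, even something as "simple" as cache associativity can flip from irrelevant to decisive depending on the access pattern, which is why tooling that replays real workloads matters. A toy set-associative cache model (entirely hypothetical, for illustration only — `hit_rate` and the trace below are made up, not a SiFive tool):

```python
# Toy set-associative cache model with LRU replacement (hypothetical,
# for illustration only): replay an address trace against different
# associativity choices to see why the "right" parameters depend on
# the actual workload, not on a generic benchmark.
from collections import OrderedDict

def hit_rate(trace, size_bytes, ways, line_bytes=64):
    n_sets = size_bytes // (line_bytes * ways)
    sets = [OrderedDict() for _ in range(n_sets)]  # per-set LRU order
    hits = 0
    for addr in trace:
        tag = addr // line_bytes
        lru = sets[tag % n_sets]
        if tag in lru:
            hits += 1
            lru.move_to_end(tag)         # mark most-recently-used
        else:
            if len(lru) >= ways:
                lru.popitem(last=False)  # evict least-recently-used
            lru[tag] = True
    return hits / len(trace)

# Four addresses that all collide in the same set of a 32 KiB cache:
trace = [i * 32 * 1024 for i in range(4)] * 1000
print(hit_rate(trace, 32 * 1024, ways=1))  # → 0.0 (pure conflict misses)
print(hit_rate(trace, 32 * 1024, ways=4))  # → 0.999 (all four lines fit)
```

The same trace goes from a 0% to a ~100% hit rate purely by changing associativity, while a different trace could show no difference at all.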

I love this stuff, but I'm a bit skeptical about whether it's actually something the market is asking for. If there were more substantial options like crypto accelerators, GPGPU, video encoding/decoding, FPGA, mixing big and lots of small cores, vector extensions, being able to size up the ALU for specific operations... then I could see the pain of custom silicon paying off.

Having accelerators sharing the cache hierarchy is not something the big players have really done much of yet and probably won't until there's so much dark silicon they can more or less throw the kitchen sink in.

https://scs.sifive.com/core-designer/customize/f9c3a7e1-08b7...


One benefit of having customers customize their own cores is that they can iterate their settings (on FPGA) in hours and test the result on their actual workload, rather than guessing how some semi-standard core might perform on their workload based on SPEC or CoreMark or something.

The announcement says the U87 with (Cray-like) vector processing will be available in H2 2020. (It's also similar to ARM SVE, but ARM hasn't shipped anything with SVE yet, and I don't think has even announced anything with it; Fujitsu has, in a supercomputer.) Meanwhile, the U84 has only scalar integer and FP, with nothing equivalent to NEON on ARM. That might or might not matter to you.
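What "Cray-like" means in practice is vector-length-agnostic code: software strip-mines the loop, asking the hardware on each iteration how many elements it will process, so the same binary runs on any implementation width. A Python sketch of the pattern (`VLMAX` and `vsetvl` are hypothetical stand-ins for the hardware vector register width and the RISC-V `vsetvli` instruction):

```python
# Sketch of the strip-mining pattern used by Cray-style / RISC-V "V"
# vector code: each loop iteration asks the hardware for a vector
# length, so the code never hard-codes the implementation's width.
VLMAX = 4  # hypothetical hardware vector length, in elements

def vsetvl(remaining, vlmax=VLMAX):
    # Models vsetvli: grant up to VLMAX elements per strip.
    return min(remaining, vlmax)

def vec_add(a, b):
    out, i, n = [], 0, len(a)
    while i < n:
        vl = vsetvl(n - i)  # elements processed in this strip
        out.extend(x + y for x, y in zip(a[i:i+vl], b[i:i+vl]))
        i += vl
    return out

print(vec_add([1, 2, 3, 4, 5], [10, 20, 30, 40, 50]))
# → [11, 22, 33, 44, 55]  (two strips: 4 elements, then 1)
```

A wider implementation just changes `VLMAX`; the loop, like NEON-free RVV binaries, stays identical.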

Mixing a reasonable number of cores (up to nine) in the automated SoC configuration tool was announced alongside the 7-series (dual-issue, in-order) cores in October last year. That continues. If you want more cores than that, some engineering time would be required to verify it.

https://sifive.cdn.prismic.io/sifive%2F5ec09861-351b-420c-b6...

Adding QuickLogic eFPGA to a SiFive core has been announced:

https://www.prnewswire.com/news-releases/quicklogic-teams-wi...


Very cool, thank you!

While you can iterate pretty quickly on an FPGA, I'd still be worried that cutting down a chip to optimize for $/perf works until it doesn't. Something off the shelf would be pretty resilient to algorithm tweaks (ad absurdum you have an Intel chip which performs black magic to run almost anything you throw at it well).

It's pretty hard to overcome the economy of scale of just buying a mass produced chip that's twice what you need but ten times cheaper because they're made in batches of a million.
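The economy-of-scale point is just NRE amortization: fixed design/mask costs divided over volume dominate unit cost at small runs. A back-of-the-envelope sketch (all dollar figures are made up for illustration, not real mask-set or wafer prices):

```python
# Back-of-the-envelope NRE amortization behind the "mass-produced chip
# wins" argument. All numbers below are invented for illustration.
def unit_cost(nre_dollars, volume, marginal_dollars):
    # Fixed engineering/mask cost spread over volume, plus per-die cost.
    return nre_dollars / volume + marginal_dollars

custom = unit_cost(5_000_000, 10_000, 4.0)         # small custom run
commodity = unit_cost(5_000_000, 1_000_000, 4.0)   # million-unit batch
print(custom, commodity)  # → 504.0 9.0
```

At these (made-up) numbers the commodity part is ~50x cheaper per unit even if half its silicon goes unused, which is the "twice what you need but ten times cheaper" dynamic.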

I should note that my skepticism is coming from a place of "I want this to be a thing". Using the web page to design a chip is an instance of "live in the future and build what's missing" if I've ever seen one.


This is not to beat commodity processor/controller silicon. If you are going to build a SoC for whatever reason (peripherals, integration, etc) -- you might as well make the integrated processing exactly what you want.


An article on golem.de cites (Google translated) "According to Sifive, the U84 supports a heterogeneous core complex: four U84, four U74 and one S2 core can be combined."

The U84 is OoO, comparable to a Cortex-A72. The U74 is comparable to an A55. The S2 is comparable to a Cortex-M0/M3/M4 (depending on configuration) but with 64-bit registers and addressing.

https://scr3.golem.de/screenshots/1910/SiFive-U84-U87/thumb6...

https://www.golem.de/news/risc-v-sifives-u87-kern-nutzt-vect...


Depending on your use case, a worst-case execution time (WCET) analysis can yield these numbers. Such analyses are usually deployed for safety-critical systems (e.g. for ECUs in cars or industrial controllers), but they could be (ab)used for this case as well. Of course the analysis needs to support both your ISA (RISC-V) and the exact core implementing it, plus the adjusted parameters; this isn't exactly cheap, since it requires a lot of very specific know-how, and I suspect SiFive won't produce something like this. (Disclaimer: I work in the field.)
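The path-analysis half of WCET can be sketched simply: given per-basic-block worst-case cycle bounds (deriving those from the exact core's pipeline and cache behavior is the genuinely hard, tool-specific part), bound the program by the longest path through its control-flow graph. A toy sketch with an invented, acyclic CFG:

```python
# Toy sketch of WCET path analysis: longest path through an acyclic
# control-flow graph, weighted by per-block worst-case cycle counts.
# The CFG and cycle numbers below are entirely hypothetical.
from functools import lru_cache

# block -> (worst-case cycles for the block, successor blocks)
cfg = {
    "entry": (5,  ["cond"]),
    "cond":  (2,  ["then", "else"]),
    "then":  (40, ["exit"]),   # e.g. a path containing a slow division
    "else":  (7,  ["exit"]),
    "exit":  (3,  []),
}

@lru_cache(maxsize=None)
def wcet(block):
    cycles, succs = cfg[block]
    return cycles + max((wcet(s) for s in succs), default=0)

print(wcet("entry"))  # → 50 cycles: entry + cond + then + exit
```

Real tools additionally need loop bounds, infeasible-path pruning, and a cycle-accurate model of the specific core — which is where the ISA- and implementation-specific cost mentioned above comes from.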


I don't think the U8 is BOOM-based, but RISC-V decoders are pretty trivial, so they probably had some extra time and patience to make the rest of the frontend more flexible.


It certainly reads like it's based on BOOM; they're describing BOOM's gate-count niche to a tee. And their previous designs were based on Rocket.


So, is the U8 core based on BOOM, or is it a ground up design? Or if neither, then what is it?

And the HBM2E+ is some doohickey so customers can combine a RISC-V core with HBM2 memory?

(and if somebody from SiFive happens to read this, there's a typo, presumably HMB2E+ means HBM2E+)


Fixed now, thanks.


I still see HMB2E+.


Really fixed :-)


How can it support the vector extension? I didn't think the spec was done yet. Is it?


The U84, available to lead customers now, doesn't have the vector extensions. The U87, announced to be made available in H2 2020, will.

The vector extension spec is 99% done. There shouldn't be anything more than minor tweaks from now.

Note that part of the RISC-V extensions philosophy is that specs are not finalized until (preferably) multiple implementations have been made and shown to work as expected in the real world.


Is there another implementation of vector extension?


AFAIU, one of the nice things about the SiFive IP cores is that they were distributed under some kind of free license (although they still required other non-free IP to actually be used in a chip). Is this still (or was it ever) true?


AIUI, the U5 series is BSD/open-sourced as "Rocket", and a version of their shared L2 cache is available as the sifive-inclusive cache. I believe Berkeley uses these artifacts as part of their FireSim (https://fires.im) FPGA-based infrastructure.


Curious what about this technology makes it front page noteworthy? Has the deep learning community been hotly anticipating the arrival of this technology, or is it more along the lines of stuffing the ballot box?


It's the announcement of the highest perf RISC-V core that you'll be able to put your hands on so far.


Any stats for that? Being the highest implies there are lower ones. Who are the other RISC-V players?


Of what you can buy today: Kendryte K210, GigaDevice GD32V, and SiFive has FE310 and U54MC.


"you'll be able to put your hands on" ... the million dollar question: When???


If you're a company making your own SoCs already and want a high performance RISC-V core for a mid-range tablet or phone or an SBC equivalent to the Raspberry Pi 4 or something like that then the answer is "now".

If you're someone wanting to buy a quantity one chip from Digikey or Mouser then it'll be ... when someone decides to make and sell a retail chip. Which might well be someone making their own chip for a tablet or set-top box etc and selling the surplus production.


For sure. SiFive has been pretty good at delivering so far.



