In addition to the ZipCPU itself, the zipcpu.com website has many useful blog posts for the beginning FPGA developer. I've found it to be a great resource. Mr Gisselquist (a.k.a. ZipCPU) is an active contributor to a number of FPGA forums and, in my experience, is always helpful to newcomers.
I'm curious, is there any commercial application for running Linux on a CPU implemented in an FPGA, as opposed to a (presumably cheaper and faster) normal CPU? Is there some kind of custom modification this would allow you to make?
I've always thought of FPGAs as being for custom, computationally intensive parallel tasks that run side-by-side with a normal CPU. So I'm just curious if ZipCPU is just for fun, or something more?
(Or is there even something security-related about being able to verify not just the code, but the CPU it runs on, in this case?)
Sure, it happens all the time if you want a full OS on a core that's tightly integrated with some custom hardware, but you don't want an ASIC (for reconfigurability, or just low volumes).
I'm not a hardware engineer, but my understanding is that some FPGAs include an entire CPU core as part of the chip? This seems much more efficient than designing your own.
Super small CPU cores are fantastic for complex FSM replacement, especially on an FPGA that has unused block RAMs (so using them is essentially free).
I've seen cases where a relatively complex pure-HW FSM was replaced by a small CPU and 1KB of RAM, and the logic used by the CPU ended up smaller than the FSM.
But now the CPU is programmable, so you can iterate much more quickly when there are bugs, without needing to resynthesize.
A good example is SDRAM controllers: almost all of them have a small CPU inside the controller that is used for calibration training.
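To make that concrete, here's a rough C sketch of the kind of read-delay training loop such a calibration CPU might run (the register names and addresses are made up purely for illustration): sweep the capture-delay taps, note which ones return the training pattern cleanly, and park in the middle of the passing window. Since it's firmware, a bad guess costs a recompile rather than a resynthesis.

    /* Hypothetical memory-mapped registers for the read-delay sweep. */
    #include <stdint.h>

    #define DELAY_TAP_REG  (*(volatile uint32_t *)0x40000000u)  /* made-up address */
    #define READ_DATA_REG  (*(volatile uint32_t *)0x40000004u)  /* made-up address */
    #define TRAIN_PATTERN  0xA5A5A5A5u
    #define NUM_TAPS       32

    /* Returns 0 on success, -1 if no delay tap captured the pattern. */
    int calibrate_read_delay(void)
    {
        int first_pass = -1, last_pass = -1;

        for (int tap = 0; tap < NUM_TAPS; tap++) {
            DELAY_TAP_REG = (uint32_t)tap;         /* move the capture point */
            if (READ_DATA_REG == TRAIN_PATTERN) {  /* did the burst come back clean? */
                if (first_pass < 0)
                    first_pass = tap;
                last_pass = tap;
            }
        }
        if (first_pass < 0)
            return -1;                             /* no working tap at all */

        /* Park in the middle of the passing window. */
        DELAY_TAP_REG = (uint32_t)((first_pass + last_pass) / 2);
        return 0;
    }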
It's almost always more efficient to use an existing CPU, if it fits your use case, than to design your own. Even if it doesn't, it is often more efficient (time-wise) to adapt your use case to the hardware you have available, or to write software to provide the feature you want in hardware but don't have. Only the especially nutty build their own CPUs (I'm especially nutty too).
The cheapest FPGAs are under $10 and many can fit a small softcore - usually not a fully pipelined, cached RISC-V core with an MMU at 100+ MHz, but enough to drastically simplify high-level logic in FPGA projects. The Zynq series with onboard CPUs starts at $40-50 in low volumes and has 1-4 cores at 600 MHz, which is complete overkill for many FPGA projects.
Slow, $2000-a-unit hardware might be worth it if it makes your OS close to unhackable. I'd just have to make sure apps are lean and efficient. I already favor those, though. :)
Also, there's been work in "instruction set randomization" that obfuscates the CPU's instruction set itself to reduce the odds that malware will work. It mostly stops widely-deployed, non-custom attacks, but it might stop some custom attackers, too.
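As a toy illustration of the idea (not any particular product's scheme), here's instruction set randomization in a few lines of C: the legitimate program's opcodes are XORed with a per-program key when it's loaded and un-XORed at fetch, so injected code written without the key decodes to garbage and traps instead of running.

    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Tiny stack machine; the "randomized" instruction set is just XOR with a key. */
    enum { OP_PUSH = 1, OP_ADD = 2, OP_PRINT = 3, OP_HALT = 4 };

    static void run(const uint8_t *code, size_t len, uint8_t key)
    {
        int32_t stack[16];
        int sp = 0;
        for (size_t pc = 0; pc < len; ) {
            uint8_t op = code[pc++] ^ key;            /* de-randomize at fetch */
            switch (op) {
            case OP_PUSH:  stack[sp++] = (int8_t)(code[pc++] ^ key); break;
            case OP_ADD:   sp--; stack[sp - 1] += stack[sp]; break;
            case OP_PRINT: printf("%d\n", stack[sp - 1]); break;
            case OP_HALT:  return;
            default:
                fprintf(stderr, "illegal opcode %u: likely injected code\n", (unsigned)op);
                exit(1);
            }
        }
    }

    int main(void)
    {
        const uint8_t key = 0x5a;  /* per-program random key */

        /* Legitimate program, encoded with the key: push 2, push 3, add, print, halt. */
        uint8_t prog[] = { OP_PUSH ^ key, 2 ^ key, OP_PUSH ^ key, 3 ^ key,
                           OP_ADD ^ key, OP_PRINT ^ key, OP_HALT ^ key };
        run(prog, sizeof prog, key);                  /* prints 5 */

        /* "Injected" code written without the key decodes to garbage opcodes. */
        uint8_t injected[] = { OP_PUSH, 7, OP_PRINT, OP_HALT };
        run(injected, sizeof injected, key);          /* hits the illegal-opcode path */
        return 0;
    }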
Older work was useful for letting RAM and other components be untrusted, or less trusted, using crypto setups. Joshua Edmison did one of those with a nice survey of others:
Finally, Meltdown/Spectre have recently prompted lots of designs to plug timing channels. Too many to even track. Here's the first one I found from my submissions:
Considering the implications of a widely available unhackable computer on real security, crime, and politics, and that many powerful people/organizations are working to prevent distribution of this technology, do you really think it would become available? How?
Computers that are ridiculously hard to hack, and methods for building them, have been available for some time. NSA used to evaluate them for years on end trying to hack them. If they passed the test, they got the TCSEC A1-class label for verified[-as-possible] security. SCOMP/STOP, GEMSOS, Boeing SNS Server, LOCK, and KeyKOS w/ KeySAFE are examples of systems designed with such techniques or certified. I think the STOP-based XTS-400 and Boeing SNS Servers have been in deployment for decades now without any reported hacks. They were on the munitions list for export restriction (idk if enforced), some companies sell high-security products to defense only, and they all cost a ton of money if proprietary.
Under newer standards and techniques, examples include INTEGRITY-178B, the Perseus Security Framework (e.g. Turaya Desktop), seL4, the Muen hypervisor, and ProvenCore. Two of those are open source, with one more having had pieces released. On the hardware side, the groups developing it often openly describe the stuff so that anyone with money can implement it. Some openly release it, like the CHERI CPU w/ FreeBSD. Some are commercial products that make every instruction safe, like CoreGuard.
So, it's not a question of whether people will develop and sell this stuff. They've been developing it since computer security was invented back in the 1970s. It was sometimes the inventors of INFOSEC themselves developing and evangelizing it. Businesses didn't buy the stuff for a variety of reasons that had nothing to do with its security. Sometimes, especially with companies like Apple or Google, they could clearly afford the hardware or software, it was usable for some to all of their use cases, and they just didn't build or buy it for arbitrary reasons of management. Most stuff they do in-house is worse than published, non-patented designs, which is just more ridiculous.
DARPA, NSF, and other US government agencies continue to fund the majority of high-security/reliability tech that gets produced commercially and/or for FOSS. These are different groups than the SIGINT people (i.e. BULLRUN) that want to hack everything. Also, they might be putting one backdoor in the closed ones for themselves while otherwise leaving it ultra-secure against everyone else. That's what I've always figured. Lots of it is OSS/FOSS, though, so that's easier to look at.
Infrant used to sell a NAS product that ran Linux on a SPARC implemented in an FPGA. No idea why, but it was a commercial success. They moved to ARM or x86 around the time Netgear acquired the company.
The project was started in September 2016, which was also the month I first started porting Fedora to RISC-V. Given the momentum behind RISC-V, and the huge amount of software now available, it would obviously make sense to use a RISC-V-compatible design. However, hobbyists can do what they want; there's no reason to force someone who wants to design their own CPU to do it in any particular way.
The specification actually mentions RISC-V, and why I didn't use it. In hindsight, it's harder to decode the RISC-V instruction set than the one for the ZipCPU--too many holes in strange places.
Why are you being obtuse, rather than trying to be clear?
RISC-V is the major Free and Open CPU. RISC-V can run on FPGA. It's natural to want a comparison of this project against RISC-V, especially considering the write-up mentions the less successful OpenRISC project.
This page appears to have been written in summer of 2016. Back then RISC-V didn't even have a stable privileged instruction set yet, so there wasn't really a RISC-V to compare it to in a meaningful sense.
PicoRV32[0] targets some extremely low spec FPGAs. I have it running on a ridiculously small and cheap FPGA[1] that has 8k LUTs and cost me £41 including tax and next-day delivery.
However, I repeat what I said above: hobbyists are free to design their own instruction sets, and good on them!
The ZipCPU and several peripherals can fit within an iCE40 8k using only about 5k LUTs--and that's with the multiply unit, the divide unit, the pipelined fetch unit, and a debugging bus infrastructure. While that isn't the fully pipelined implementation, and the caches don't fit either, it is sufficient to be competitive with the PicoRV32.
The PicoRV32 is a nice little processor, but it's surprisingly inefficient in its use of FPGA resources relative to the performance it offers (which is very low). For small FPGAs, it's far from being the best solution.
That said: there are a bunch of SOC construction tools for it, so that's definitely a redeeming factor.
But then again, nearly every CPU runs on an FPGA these days: the 6502, the Z80, various mainframe CPUs, x86, and more modern CPUs. Modern CPU development often goes via FPGA - after all, the CPU development tools are the same: Verilog, for example, which you first synthesize to an FPGA and, after some rounds of testing, to silicon.
You may be able to run certain modern CPU units on an FPGA, maybe even one modern core, but compare the 4 million LUTs of the biggest FPGA to the 19 billion transistors of a 32-core AMD EPYC. With a LUT being roughly equivalent to anywhere from 1 to 50 ASIC transistors, depending on the logic implemented, there's still an order-of-magnitude gap or more between them, even in the best case.
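Back-of-the-envelope with those (admittedly rough) numbers: 4 million LUTs × 50 transistors per LUT ≈ 200 million transistor-equivalents, against ~19 billion transistors, which is roughly a 95x gap even granting the FPGA its best-case equivalence.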
In practice, Verilog simulators are used instead; they're as slow as a dead horse on smack, but it's the only way to get full (logic) system integration of such big designs.
There's a halfway point, confusingly called emulators. You get a whole rack of FPGAs (or sometimes giant arrays of tiny little CPUs that only run boolean ops) connected to each other, running at tens of MHz with the signals between them multiplexed.
> we verify a fairly realistic implementation of a multicore shared-memory system with two types of components: memory system and processor. Both components include nontrivial optimizations, with the memory system employing an arbitrary hierarchy of cache nodes that communicate with each other concurrently, and with the processor doing speculative execution of many concurrent read operations. Nonetheless, we prove that the combined system implements sequential consistency. To our knowledge, our memory-system proof is the first machine verification of a cache-coherence protocol parameterized over an arbitrary cache hierarchy, and our full-system proof is the first machine verification of sequential consistency for a multicore hardware design that includes caches and speculative processors.
That paper doesn't have anything to do with what the original poster said, which is that extremely large scale VLSI-style designs don't use FPGAs, but instead use large-scale, extremely expensive and fast simulators instead. That's basically correct, regardless of the kinds of formal methods that are in play: simulation is vastly cheaper and more feasible at large scales, so much so that people instead buy hardware accelerators for simulation to improve development time and get reasonable turnaround on complex components.
Smaller-scale embedded systems or bespoke designs can get away with full FPGA flows for low-scale deployment/development (e.g. SiFive's CPU cores all run this way, and you can run them) -- and pretty much all hobbyist CPUs do so. But to say all modern CPUs use FPGAs in their development cycle is overstating it a bit. There simply are no devices with the kind of logic density needed to realistically model many of the large-scale, high-volume designs you see today, not to mention the unbelievable synthesis times you'd have to endure.
> The scope of verifiable components is fundamentally limited by the state-space-explosion problem. For example, in order to verify cache-coherence protocols with Murphi [Dill et al.1992], a widely used model checker designed for that exact task, at one point the biggest system that could be verified had only 3 cores interacting with a single-address memory having two potential values. While improvements to algorithms and hardware (for running the analysis) continually increase the feasible system size, there are fundamental limitations of such tools that explore finite state spaces, in the face of exponential growth in state spaces as we add processor cores or memory addresses.
> In this paper, we introduce the Kami framework for the Coq proof assistant, which brings this style of development and verification to the world of computer architecture, simultaneously reversing all of the weaknesses enumerated above for most hardware-verification tools. Kami makes it possible to code and verify a fairly realistic processor within Coq and then extract it to run on FPGAs (and almost certainly in silicon, though we have not yet had the opportunity to fabricate a “real” chip). We also emphasize that our use of “verified” is more in the tradition of CompCert [Leroy 2006] or seL4 [Klein et al. 2009] than of the day-to-day practices of verification engineers in the hardware industry: we prove that a system with any number of shared-memory cores with associated coherent caches correctly implements the functional-correctness specification of our instruction set, with respect to sequential consistency.