I'm skeptical as well. The primary reason IMO is the software. How do you easily ...

nomercy400 · on Oct 27, 2020

It is doable. I've seen it during my Computer Engineering courses 14 years ago.

Basically you analyze the code for candidates, select a candidate, upload your custom hardware design, run your operation on the hardware, and repeat.

The difficult part is that uploading your hardware to the FPGA takes on the order of tenths of a second, which is ages compared to the nanoseconds and microseconds your CPU works in. So your specific operation must save enough time to be worth the upload.
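To put a number on "worthwhile", here is a minimal sketch of the break-even arithmetic. The function name and all timing figures are made up for illustration; real upload and per-call times depend entirely on the device and the operation.

```python
def break_even_calls(reconfig_s: float, cpu_call_s: float, fpga_call_s: float) -> float:
    """Number of calls after which offloading has paid for the upload."""
    saved_per_call = cpu_call_s - fpga_call_s
    if saved_per_call <= 0:
        # The hardware version is no faster, so the upload never pays off.
        return float("inf")
    return reconfig_s / saved_per_call


# Hypothetical figures: 0.2 s upload, 10 us per call on the CPU, 1 us on the FPGA.
calls = break_even_calls(reconfig_s=0.2, cpu_call_s=10e-6, fpga_call_s=1e-6)
print(f"break-even after about {calls:,.0f} calls")
```

With these assumed numbers the operation has to run a little over 22,000 times before the upload breaks even, which is why only hot candidates are worth selecting.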

A bit of FPGA on your CPU makes it more flexible. For example, you could set a profile such as 'crypto' or 'video' to add some specific hardware acceleration to your general-purpose CPU.

Imagine your CPU being able to switch your embedded GPU into another CPU core.

hajile · on Oct 27, 2020

Codecs are a great example.

Let's say the current Zen 2 had an FPGA onboard. AMD could sell you an upgraded design with AV1 support for a few dollars. Most people aren't going to buy a new CPU on the basis of a video decoder, but they'll buy an upgrade to the chip that auto "installs" itself. That's a sale AMD otherwise wouldn't have made.

dboreham · on Oct 27, 2020

Except the new codec won't fit into the FPGA they put on that chip that's in the field.

eqvinox · on Oct 27, 2020

The codec is gonna get nowhere near to filling a "CPU-class" FPGA, so if anything you get fewer parallel instances of it.

Someone · on Oct 27, 2020

Also, for the way most modern CPUs are used: how do you task switch? If the hardware is large enough, you can deploy multiple configurations at a time, but does software support that? Is it possible to have relocatable configurations?

In theory, you could even page out code, but I guess that would be slow. Also, paging in would probably be challenging because the logical units aren’t uniform (if only because not all of them will be connected to external wires).

varispeed · on Oct 27, 2020

This could work with a client-server model: if there are enough free cells and I/O available on the FPGA, it could install the configuration, and then any application could communicate with it concurrently, maybe with some basic auth.

Someone · on Oct 27, 2020

But from what I understand of FPGAs, fragmentation would be a serious issue. You may have the free cells and I/O you need to implement some circuit, but if they’re dispersed over your FPGA or even connected, but in the wrong shape for the circuit you’re building, that’s useless.

An enormous crossbar could solve that, but I would think that would be way too costly, if practically possible at all.
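A toy model of the fragmentation problem, assuming the fabric is a uniform grid and a circuit needs a solid rectangle of free cells. Real placement has to deal with heterogeneous resources and routing, so this is only a sketch of the "enough cells, wrong shape" case; `find_placement` is a hypothetical helper.

```python
from typing import List, Optional, Tuple

Grid = List[List[bool]]  # True means the cell is free


def find_placement(free: Grid, height: int, width: int) -> Optional[Tuple[int, int]]:
    """Return the top-left corner of the first fully free height x width block."""
    rows, cols = len(free), len(free[0])
    for r in range(rows - height + 1):
        for c in range(cols - width + 1):
            if all(free[r + dr][c + dc]
                   for dr in range(height) for dc in range(width)):
                return (r, c)
    return None


# Checkerboard: half of the 16 cells are free, but no two free cells are adjacent.
fragmented = [[(r + c) % 2 == 0 for c in range(4)] for r in range(4)]
free_cells = sum(cell for row in fragmented for cell in row)
print(free_cells, find_placement(fragmented, 2, 2))

# Same number of free cells, packed into the left two columns.
compact = [[c < 2 for c in range(4)] for _ in range(4)]
print(find_placement(compact, 2, 2))
```

Both grids have eight free cells, yet only the compact one can host a 2x2 circuit.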

rjsw · on Oct 27, 2020

You can reconfigure just part of the FPGA, though that isn't used all that often.

threatripper · on Oct 27, 2020

I would see it being used more like a GPU than a CPU.

gmueckl · on Oct 27, 2020

Even GPUs multitask all the time, even though it's less obvious. Cooperative multitasking in this context means setting up and executing different shaders/kernels. The overhead involved in this is quite manageable.

Repurposing FPGAs for different tasks means loading a new bitstream into the device every time. So it is much more efficient to grant exclusive access to each user of the device for long stretches of time. The proper pattern for that is more like a job queue.
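A rough sketch of why a job queue helps, assuming each job names the bitstream it needs and that the device holds one bitstream at a time. The job tuples and function names are invented for illustration, and a real scheduler would also have to weigh fairness and latency.

```python
from typing import Dict, Iterable, List, Tuple

Job = Tuple[str, str]  # (bitstream name, job id), both hypothetical


def count_reloads(jobs: Iterable[Job]) -> int:
    """Count bitstream loads when jobs run strictly in the given order."""
    loaded = None
    reloads = 0
    for bitstream, _job_id in jobs:
        if bitstream != loaded:
            reloads += 1
            loaded = bitstream
    return reloads


def batch_by_bitstream(jobs: Iterable[Job]) -> List[Job]:
    """Reorder jobs so those sharing a bitstream run back to back."""
    groups: Dict[str, List[Job]] = {}  # dicts keep first-seen order
    for job in jobs:
        groups.setdefault(job[0], []).append(job)
    return [job for group in groups.values() for job in group]


arrivals = [("crypto", "a"), ("video", "b"), ("crypto", "c"),
            ("video", "d"), ("crypto", "e")]
batched = batch_by_bitstream(arrivals)
print(count_reloads(arrivals), count_reloads(batched))
```

Run in arrival order, these five jobs trigger five loads; batched, they trigger two.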

dragontamer · on Oct 27, 2020

An actual GPU or CPU will always run circles around an FPGA CPU or FPGA GPU.

Where FPGAs win is in new architectures, like systolic engines. Entirely different computer designs from the ground up.

wtetzner · on Oct 27, 2020

I believe there is some amount of support in OpenCL for FPGAs. If only we could get companies to properly support OpenCL, we'd have a nice software interface to pretty much any kind of compute resource on a machine.

SSLy · on Oct 27, 2020

My armchair amateur brain immediately thought about something CUDA-like.

numpad0 · on Oct 27, 2020

FPGA code takes hours to compile, yet the result is product/model specific.

simias · on Oct 27, 2020

You're not wrong but I expect they'd make it so that the various models would be similar enough (at least within a given CPU generation) so that you could use mostly precompiled artifacts instead of rerouting everything from scratch.

I've always been pretty skeptical of their approach though. To be usable, they'd need excellent tooling to support the feature, and if there's one thing that existing FPGA software isn't, it's "excellent".

Getting FPGAs to perform well is often an art more than a science ("hey guys, let's try a different seed to see if we get better timings") so the idea that non-hardware people would start to routinely generate FPGA bitstreams for their projects is so implausible that it's almost comical to me.

Maybe one day we'll have a GCC/LLVM for FPGAs and it'll be a different story.

pclmulqdq · on Oct 27, 2020

Beyond the GCC/LLVM, you also really need a standard library. Nobody is talking about that. Today, if you want a std::map on an FPGA, you have to either pay $100k or build it yourself. That's untenable.

FPGAhacker · on Oct 27, 2020

You would use precompiled modules or compositions of these modules (pipeline or parallel).

This can be a relatively fast operation. Seconds or less depending on complexity.
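As a software analogy only, composing precompiled modules might look like the sketch below. Plain functions stand in for prebuilt blocks, and `pipeline` and `parallel` are hypothetical helper names, not any vendor's API.

```python
from typing import Callable, List, Sequence

Module = Callable[[int], int]  # stand-in for a precompiled hardware block


def pipeline(stages: Sequence[Module]) -> Module:
    """Chain modules so each stage feeds the next."""
    def run(x: int) -> int:
        for stage in stages:
            x = stage(x)
        return x
    return run


def parallel(lanes: Sequence[Module]) -> Callable[[int], List[int]]:
    """Feed the same input to several modules side by side."""
    def run(x: int) -> List[int]:
        return [lane(x) for lane in lanes]
    return run


def double(x: int) -> int:
    return x * 2


def add_one(x: int) -> int:
    return x + 1


print(pipeline([double, add_one])(5))
print(parallel([double, add_one])(5))
```

The point is that the expensive step (building each module) happens once ahead of time, and composition only wires existing pieces together.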