Turning .NET assemblies into FPGA hardware

Traster · on Dec 6, 2019

FPGAs are fundamentally a dataflow system. You have geographically distributed compute units, with reconfigurable routing. CPUs (obviously) work by bringing the data to a single fixed central processing unit. So the key question for all these software->FPGA tools, is how they handle this transformation. The problem is that often the way you layout your program is introducing implicit constraints that prevent optimization that aren't real but simply an artefact of how you write software.

The thing is, I don't believe that transformation is a solvable problem (not least because really clever people keep trying and keep releasing terrible attempts). In real life, people program GPUs using special constructs and instructions. Hell, they use special instructions for the vector operations on CPUs. So why are we pretending you can write generic code in languages designed for CPUs and have them run performantly on FPGA? What you end up with a tiny subset of the language that maps to already existing hardware languages with a little bit of syntactic sugar and a boat load of misunderstanding of the limitations.

SlowRobotAhead · on Dec 7, 2019

As to the optimization and the user not being able to and/or not having the tools to do it.

This was one of my favorite uses for ML. However the specific result i remember wasn’t that useable. The short version was they asked a model to program this FPGA to make a 1khz sin wave on a pin hooked to a speaker by listening to the result. It eventually worked, and the FPGA did indeed output a 1KHz wave. But... when the logic gates set in the FPGA were examined it was nonsense. And it wouldn’t work on any other chip. It was using defects in the silicon and was truly a one off success.

Obviously this optimization is only possible for a machine but yes, the tools from high level to bitstream could be a ton better.

zoltanlehoczky · on Dec 7, 2019

The transformation is surely a solvable problem given that various HLS tools, OpenCL for FPGAs and Hastlayer exist :). Whether it makes sense is another question and I don't pretend to have the definitive answer, though I believe it does: abstractions can increase productivity with trade-offs like the runtime efficiency of the implementation.

In this case I think in certain use-cases it has value to use the standard .NET parallel programming model and development workflow to get not an optimal but a good enough hardware implementation. The point is to get a performance and power efficiency increase, and with Hastlayer this is possible for certain algorithms with the fraction of the knowledge and time (i.e. cost) required to even begin being productive with HDL languages. Whether this is the best approach for a given task depends on its context.

We'll see, and in the meantime we keep working on experimental approaches like Hastlayer. Feedback like yours really helps, so thank you!

bsder · on Dec 6, 2019

> So why are we pretending you can write generic code in languages designed for CPUs and have them run performantly on FPGA?

Because software people have never paid a cost for state in their programming.

Software folks are so poor at managing state that some clever people actually built entire languages around that fact (garbage collection).

I'm optimistic though--Rust is really the first example of "Managing state is pain in the ass--maybe we need to rethink this" that exists in programming languages. I suspect more are to follow.

The_rationalist · on Dec 6, 2019

Rust is really the first example of "Managing state is pain in the ass--maybe we need to rethink this" There was Ada decades ago. I could also say react... The word that you use: state is too vague to my taste.

bsder · on Dec 7, 2019

> There was Ada decades ago.

Which then morphed into VHDL on hardware. So, fair point.

I really don't understand why Ada never caught on and I don't understand why it isn't getting a renaissance.

> The word that you use: state is too vague to my taste.

I chose "state" explicitly because it encompasses volatile memory, data persistence (file systems and the like), connections between things, etc.

All of these things have a significant cost when you are designing hardware, memory is often flip flops and highly constrained, data persistence requires a LOT of abstraction to use, connections between things cost time, possibly pins (a limited resource) and possibly an interface specification.

Hardware designers suffer real, genuine pain when a new feature needs to appear--can that fit on the chip, will that affect maximum frequency, did the cost just go up, and only then how long will that take to design and debug.

pjmlp · on Dec 7, 2019

Ada never caught on for several reasons:

The compilers were priced in values totally out of reach for most mortals, which ended up buying Modula-2 or Pascal compilers instead.

On UNIX systems, it meant buying an additional compiler beyond the C one that was already on the box. When UNIX SDKs became commercial, you had to buy two (UNIX SDK + Ada), which was even less appealing.

C++ was still coming into the world, and most compiler vendors though that Ada was impossible to be fully implementable and thus never bothered. Ironically, Ada compilers are much simpler than C++ ones.

Finally nowadays from around 6 surviving compiler vendors, only GNAT is community friendly (in spite of the usual license discussions), the remaining ones keep selling their compilers at enterprise prices (talk first with the sales team way).

So it is hard to get mass adoption this way.

caspper69 · on Dec 7, 2019

Ada would do better if they adopted a cleaner and more c-like syntax. It's just ugly, and people don't want to write it. But it is awesome!

jacobush · on Dec 7, 2019

Ehm, functional languages do too by avoiding state as much as possible. See https://clash-lang.org

bsder · on Dec 7, 2019

Functional languages somewhat avoid shared or implicit state by making you pass everything in and then create something new on the way out.

That's a LOT of state even if it isn't shared. At some point, what went in and what went out needed to get stored by somebody.

Thus leading to the old Perlis adage "LISP programmers know the value of everything and the cost of nothing."

msla · on Dec 7, 2019

> Functional languages somewhat avoid shared or implicit state by making you pass everything in and then create something new on the way out.

Dataflow, in other words. You pass data through functions, and you know the whole thing works because the functions are composable because they aren't doing anything off to the side. No side-effects means you can think in a pipeline data flows through.

Spreadsheets achieve this quite nicely, and are the only successful visual programming language (pacem Microsoft's use of "Visual" as a marketing term for their decidedly text-based Visual Basic and Visual C++) in addition to being the forgotten programming paradigm, or at least the one most likely to be pointedly ignored.

Some of this might even have relevance for creating FPGAs based on dataflow models.

nikofeyn · on Dec 6, 2019

i wouldn't really call this programming FPGAs with .NET languages. it's more just converting .NET code to run on FPGAs. those are two different things.

also, i don't really understand how moving something that is compute bound down to an FPGA is what you do. FPGAs are slow compared to CPUs. where they help is if you have something that can be parallelized and/or you want it to happen in a real-time manner.

it would be a big win if there was a more modern HDL language. VHDL actually has a lot of nice features. however, i think a cross with it and an ML language (like F# or OCaml) would be a wonderful fit for programming FPGAs, that is designing (or at least dictating) hardware with software.

Traster · on Dec 6, 2019

I don't think the languages are the real problem. It's the tooling that sucks. The fact that simulation generally comes from a different vendor to the synthesis means that the simulation environment will implement the behaviour of the language, not the behaviour of the target device. This means you can never verify your design without a full hardware test - at which point you can't probe signals. That could be fixed if someone simulated devices accurately rather than simulating the specification of the language. This is exacerbated by synthesis warnings that range from "I one hot encoded your state machine as has been common since 1984" to "Yo dawg you forgot to reset this signal, so the set and reset priority are unclear, I'm gunna just fuck the whole of your design for ya". We all enjoy those post-synthesis log parsing scripts! By the way - they can't fix the logging because Cisco/Nokia/DoD have spent decades parsing those logs and if you change them you'll break their workaround.

Secondly, because the people who understand FPGAs are hardware engineers the tool vendors hire hardware engineers (and software engineers who are willing to tolerate hardware engineers) to write their software, the result is tools written using the development practices of the dark ages.

End of rant.

willis936 · on Dec 7, 2019

In defense of the poor hardware engineer: HDLs are not like other programming languages. You are literally writing a netlist of a circuit. Hardware engineers are hired to write FPGA code because it is a hardware design problem, not a software one.

rowanG077 · on Dec 7, 2019

I really disagree with this. HDLs are exactly like other programming languages. What in your mind makes HDLs fundamentally different to software languages? Your point about netlists doesn't convince me at all since you aren't creating netlists with a HDL. A description written in a HDL can be turned into a netlist.

willis936 · on Dec 7, 2019

Have you ever worked with HDLs before? The thing you are working on Is a netlist, not abstract code. As the name implies, you are describing hardware rather than abstract software. The layers of abstraction are ones that are built by the programmer (or defined in hardware block libraries) rather than ones from a language. HDLs do not have a single entry point. Every line of code runs concurrently with every other line of code. It simply is not like typical programming.

rowanG077 · on Dec 7, 2019

Yes I have used HDLs. You are working on a creating a hardware configuration. But you are not writing a netlist. That is like saying when you are writing a software program that you are "working on the machine code". The only difference between a HDL and "software" languages is a different set of restrictions. Every software language/framework or whatever also has unique restrictions. The restrictions HDLs have are not in any significant way tangibly different then those that exist in the software land.

You have software languages as well where "every line of code runs concurrently". Any dataflow framework for instance. That's not even to name functional languages where evaluation of statements can be done completely in parallel based on data dependencies.

Software is a lot more then the old c style run a program line by line until it hits the end of main().

zoltanlehoczky · on Dec 7, 2019

I agree with the tooling issue. One of the reason we started Hastlayer was the viewpoint that "I'm a .NET developer with Visual Studio and everything else. Why would I want to use this thing created by Xilinx/Altera (Intel)?"

vmchale · on Dec 6, 2019

> i think a cross with it and an ML language (like F# or OCaml) would be a wonderful fit for programming FPGAs

Sounds like Clash! It's pseudo-Haskell that compiles to Verilog or VHDL.

dhash · on Dec 7, 2019

Or Chisel (Scala), which has real, industrial users and a thriving commmunity. RISC-V is in Chisel.

If you want to go really bonkers, there’s spatial-lang which allows easy interop with the Java memory model and heap

vmchale · on Dec 9, 2019

I believe Clash has industrial users as well.

There are a few businesses that use Haskell to target FPGAs.

zoltanlehoczky · on Dec 7, 2019

Yes, it's converting .NET code to run on FPGAs by creating an equivalent hardware design to the logic contained in the .NET code.

And yes, it's all about parallelization. As the docs also say, this only makes sense if your algorithm is embarrassingly parallel. If you check out the examples are like this too (within the limitations of a small FPGA).

BTW "programming" an FPGA in my understanding is the process of flashing the bitstream onto the board.

rowanG077 · on Dec 6, 2019

Please take a look at clash[1]. You can design digital circuits using Haskell and all the fancy language features Haskell has. It doesn't have a story for interop between the software world though. I love using it!

[1] https://clash-lang.org/

nikofeyn · on Dec 6, 2019

i've heard of it before. i'll be honest: i don't much like haskell. for example, this makes me crazy (haha):

  mmult mA mB = result
    where
      mBT      = transpose mB
      dot a b  = sum $ zipWith (*) a b
      result   = map (\ar -> dot ar <$> mBT) mA

either way, this is along the right lines. however, what FPGAs can you target? any xilinx chips?

it's really gonna take xilinx or altera to innovate in this space, but that is unlikely any time soon.

jdsully · on Dec 6, 2019

Xilinx is all in on high level synthesis. Although they are focusing on C++ and python instead of functional languages.

vmchale · on Dec 9, 2019

I'm kind of surprised. I get that Haskell/ML may not be the best-known, but even an original functional language is better than spamming pragmas on for loops

vmchale · on Dec 9, 2019

> it's really gonna take xilinx or altera to innovate in this space, but that is unlikely any time soon.

I think people already use Clash/Haskell to write HDL.

rowanG077 · on Dec 6, 2019

You can target any FPGA. Clash compiles to VHDL or Verilog.

SlowRobotAhead · on Dec 6, 2019

From a correctness point of view you don’t program for FPGAs at all. You configure them.

So, other than a headline I’m not seeing any advantage over Verilog or VHDL to consider this at all. I think there is a much larger issue in people not understanding FPGAs than the code used to configure them.

tverbeure · on Dec 6, 2019

"From a correctness point of view you don’t program for FPGAs at all. You configure them."

You program what needs to go into the FPGA by writing RTL, then you synthesizes the RTL, then you configure them at powerup time.

This subject of this topic is about using .NET as an alternative to traditional RTL languages. That's programming. If you're going to call that configuration, you're overloading a term "configuration" in a way that nobody in the FPGA industry ever does.

SlowRobotAhead · on Dec 7, 2019

I disagree.

Programming is setting the knobs on the machine, working with an existing structure to get a job done.

Configuring an FPGA is laying out the machine. You are designing the hardware gates and building the structure.

You could argue it’s pedantic, sure. But I think that’s 90% of the reason people struggle with FPGA is they look at it like programming and not hardware design. I was explained this difference by people in the FPGA reverse engineering industry.

So, no to the topic, I still see zero reason to use a .NET programming language to shoehorn how you want mostly non-procedural hardware logic to function. Other than to say you did it and write an article about it.

tverbeure · on Dec 8, 2019

When you program in C, the knobs that you’re setting are the bits of a file or flash memory.

When your program in Verilog, the knobs are the bits of a bunch of LUTs.

Insisting to call it configuration is not only pedantic. It is also not used this way by anybody else. And it’s confusing because it overloads a term that is used universally by all FPGA tools.

It’s hard to see how that makes it any different than just wrong.

Imagine the following conversation in isolation:

“Hey John, what are doing today?” - “I’m configuring my FPGA!”

Ask anyone in the field what is being described above. I guarantee you that nobody would answer that John is writing RTL.

eyegor · on Dec 6, 2019

I love playing with fpgas, but the compile time and size limitations are always horrendous or simply not tenable for most applications. This library narrows that down to be even worse [0]. On most of alteras chips you'd be hard pressed to fit a few of their own filter libraries, let alone your own code. Honestly I'm not sure what you would use a tool like this for. If you want asic development, you will likely need to use an hdl for better synthesis. If you want dot net languages, you are going to want a regular cpu.

Can someone tell me what this would actually be used for? I'm sitting here scratching my head.

[0] > Currently only the Nexys 4 DDR board (which is NOT the same as the non-DDR Nexys 4, be sure to purchase the linked board!) is supported, so you'll need to purchase one. Note that this is a relatively low-end development board that can't fit huge algorithms and it only supports slow communication channels. So with this board Hastlayer is only suitable for simpler algorithms that only need to exchange small amount of data.

zoltanlehoczky · on Dec 7, 2019

The approach here is to use FPGAs as compute accelerators, much like GPGPUs, and in somewhat similar use-cases. I.e. if you have an embarrassingly parallel compute-bound algorithm then it might make sense to offload it to FPGAs to gain performance and power efficiency.

Keep in mind that the target audience is not hardware engineers but .NET software developers. If you know what Altera libraries are then you're not the target audience :).

zoltanlehoczky · on Dec 7, 2019

And regarding Nexys: now we actually support Microsoft Catapult FPGAs as well and working on others too. It's quite challenging to add support for each of them, since as I'm sure you're aware, HLD is nowhere near as portable, especially between manufacturers, than software. Also, while if you know an HDL like VHDL then you can write code for all FPGAs, but keep in mind that you also have to write code compatible with each FPGA specifically (with major differences being at least between product families). With a high-level approach like Hastlayer, you get multi-FPGA support for free, i.e. your code won't change. It narrows your options hardware-wise but it's much simpler. It's a question of trade-offs and depends on what is most suitable for you.

Regarding size limitations: check out the samples on what you can fit on a low-end FPGA with Hastlayer, it's not that small. FPGAs that are more suitable as actual compute accelerators have 5-10x the logic resources (and much more) than the Nexys in question.

vmchale · on Dec 9, 2019

> in somewhat similar use-cases. I.e. if you have an embarrassingly parallel compute-bound algorithm then it might make sense to offload it to FPGAs to gain performance and power efficiency.

Also concurrency! GPU is SIMD; FPGA need not be.

zoltanlehoczky · on Dec 10, 2019

Yes, FPGA "threads" much like CPU threads can contain complex logic, and different from each other.

aswanson · on Dec 7, 2019

Hardware like fpgas peaked in the 90s as a value proposition. The future is algorithmic and abstract from hardware.

aswanson · on Dec 7, 2019

Overpriced defense contracts. Other than that, nothing.

dataflow · on Dec 6, 2019

There's also something called ARTIQ which I'm told allows high-level programming of FPGAs as well. I'm not familiar with it but it may be interesting for others: https://m-labs.hk/artiq/manual/introduction.html

fsh · on Dec 7, 2019

ARTIQ is a framework for real-time control of atomic physics experiments. Probably you were thinking of migen (https://github.com/m-labs/migen) which is used to implement ARTIQ.

phendrenad2 · on Dec 7, 2019

Note that this github project is only the PC host code for your FPGA. The actual .NET-to-FPGA translation code isn't open-sourced.