1. Domain specific Rust code
2. Backend abstracting over the cust, ash and wgpu crates
3. wgpu and co. abstracting over platforms, drivers and APIs
4. Vulkan, OpenGL, DX12 and Metal abstracting over platforms and drivers
5. Drivers abstracting over vendor specific hardware (one could argue there are more layers in here)
6. Hardware
That's a lot of hidden complexity; better hope one never needs to look under the lid. It's also questionable how well performance-relevant platform specifics survive all these layers.
I think it's worth bearing in mind that all `rust-gpu` does is compile to SPIR-V, the IR Vulkan consumes. So in a sense layers 2 and 3 are optional, or at least parallel layers rather than cumulative.
And it's also worth remembering that all of Rust's tooling can be used for building its shaders: `cargo`, `cargo test`, `cargo clippy`, `rust-analyzer` (Rust's LSP server).
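As a small, hypothetical illustration of that point (the function below is made up, not taken from the article): because a rust-gpu kernel body is ordinary Rust, plain `cargo test` can exercise it on the CPU with no GPU, driver or graphics API involved.

```rust
// Hypothetical kernel helper: ordinary Rust that rust-gpu could compile for
// the GPU, but that also builds and runs as a normal CPU function.
pub fn saxpy_element(a: f32, x: f32, y: f32) -> f32 {
    a * x + y
}

#[cfg(test)]
mod tests {
    use super::*;

    // `cargo test` runs this on the host CPU; no GPU toolchain required.
    #[test]
    fn saxpy_matches_reference() {
        assert_eq!(saxpy_element(2.0, 3.0, 1.0), 7.0);
    }
}
```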
It's reasonable to argue that GPU programming isn't hard because GPU architectures are so alien; it's hard because the ecosystem is so stagnant and encumbered by archaic, proprietary, vendor-locked tooling.
Layers 2 and 3 are implementation-specific and you can do them however you wish. The point is that a Rust program is running on your GPU, whatever the GPU. That's amazing!
The demo is admittedly a Rube Goldberg machine, but that's because this is the first time it's been possible. It will get more integrated over time. And just like normal Rust code, you can make it as abstract or as concrete as you want. But at least you have the tools to do so.
That's one of the nice things about the Rust ecosystem: you can drill down and do what you want. There is `std::arch`, which is platform-specific, there is inline `asm!` support, you can do things like replace the allocator and the panic handler, etc. And with upcoming features like externally implemented items, it will be even more flexible to target whatever layer of abstraction you want.
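To make that concrete, here is a minimal sketch (not code from the project under discussion) of two of those escape hatches: a platform-specific intrinsic from `std::arch` and a dash of inline assembly, both gated to x86_64.

```rust
#![allow(dead_code)]

// Platform-specific SIMD via std::arch; cfg-gated so other targets still build.
#[cfg(target_arch = "x86_64")]
fn add4(a: std::arch::x86_64::__m128, b: std::arch::x86_64::__m128) -> std::arch::x86_64::__m128 {
    // SAFETY: SSE is part of the x86_64 baseline, so the intrinsic is always available.
    unsafe { std::arch::x86_64::_mm_add_ps(a, b) }
}

// Inline assembly for when even intrinsics aren't enough.
#[cfg(target_arch = "x86_64")]
fn cycle_counter() -> u64 {
    let lo: u32;
    let hi: u32;
    // RDTSC returns the timestamp counter split across EDX:EAX.
    unsafe { std::arch::asm!("rdtsc", out("eax") lo, out("edx") hi) };
    ((hi as u64) << 32) | lo as u64
}

fn main() {}
```

(Replacing the global allocator or the panic handler works the same way: attribute-marked items, no special toolchain.)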
> but that's because this is the first time it's been possible
Using SPIR-V as an abstraction layer for GPU code across all 3D APIs is hardly a new thing (via SPIRV-Cross, Naga or Tint), and the LLVM SPIR-V backend is also well established by now.
Those don't include CUDA and don't include the CPU host side AFAIK.
SPIR-V isn't the main abstraction layer here, Rust is. This is the first time it's been possible to have Rust on both the host and the device across all these platforms, OSes and device APIs.
You could make an argument that CubeCL enabled something similar first, but it is more a DSL that looks like Rust rather than the Rust language proper (but still cool).
Correct. The new thing is that those shaders/kernels also run via CUDA and on the CPU unchanged. You could not do that with only wgpu: there is no Rust shader input (the thing that enables it, rust-gpu, is what's used here), and if you wrote your code in a shading language it wouldn't run on the CPU (they're made for GPUs only) or via CUDA.
As far as I understand, there was a similar mess with CPUs some 50 years ago: All computers were different and there was no such thing as portable code. Then problem solvers came up with abstractions like the C programming language, allowing developers to write more or less the same code for different platforms. I suppose GPUs are slowly going through a similar process now that they're useful in many more domains than just graphics. I'm just spitballing.
The first portable programming language was, uh, Fortran. Indeed, by the time the Unix developers are thinking about porting to different platforms, there are already open source Fortran libraries for math routines (the antecedents of LAPACK). And not long afterwards, the developers of those libraries are going to get together and work out the necessary low-level kernel routines to get good performance on the most powerful hardware of the day--i.e., the BLAS interface that is still the foundation of modern HPC software almost 50 years later.
(One of the problems of C is that people have effectively erased pre-C programming languages from history.)
> I suppose GPUs are slowly going through a similar process now that they're useful in many more domains than just graphics.
I've been waiting for the G in GPU to be replaced with something else since the first CUDA releases. I honestly think that once we rename this tech, more people will learn to use it.
And yet, we are still using handwritten assembly for hot code paths. All these abstraction layers would need to be porous enough to allow per-device specific code.
> And yet, we are still using handwritten assembly for hot code paths
This is actually a win. It implies that abstractions have a negligible cost (that is, a cost that exists but is so small it can be ignored) for anything other than small parts of the codebase.
Complexity is not inherently bad. Browsers are more or less exactly as complex as they need to be in order to allow users to browse the web with modern features while remaining competitive with other browsers.
This is Tesler's Law [0] at work. If you want to fully abstract away GPU compilation, it probably won't get dramatically simpler than this project.
> Complexity is not inherently bad. Browsers are more or less exactly as complex as they need to be in order to allow users to browse the web with modern features while remaining competitive with other browsers.
What a sad world we live in.
Your statement is technically true, the best kind of true…
If work went into standardising a better API than the DOM we might live in a world without hunger, where all our dreams could become reality. But this is what we have, a steaming pile of crap. But hey, at least it’s a standard steaming pile of crap that we can all rally around.
I hate it, but I hate it the least of all the options presented.
I swear, if I had 2c for every <insert computer guy surname> supposed "law", I would be a millionaire by now. Slogans make no law, but programmers sure love "naming things." In all fairness, the obsessive elevating of good-sounding slogans into colloquial "laws" is a uniquely American phenomenon. My pet theory is that this goes back to the old days when computer science wasn't considered a "real" science. That is, in "real" sciences there are laws, so the American computer science guys felt like inventing "laws" to be taken seriously.
Realistically though, a user can only hope to operate at (3) or maybe (4), so it's not that much of an addition. (Abstraction layers do not stop at 6, by the way; they keep going, with firmware and microarchitecture implementing what you think of as the instruction set.)
You posted this comment in a browser on an operating system running on at least one CPU using microcode. There are more layers inside those (the OS alone contains a laundry list of abstractions). Three levels of abstractions can be fine.
Debugging the Rust is the easy part. I write vanilla CUDA code that integrates with Rust and that one is the hard part. Abstracting over the GPU backend w/ more Rust isn't a big deal, most of it's SPIR-V anyway. I'm planning to stick with vanilla CUDA integrating with Rust via FFI for now but I'm eyeing this project as it could give me some options for a more maintainable and testable stack.
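For anyone curious what that "vanilla CUDA behind FFI" shape looks like, here is a rough sketch; the symbol `launch_saxpy` and its signature are hypothetical, and the real wrapper would live in a `.cu` file compiled by nvcc and linked into the crate.

```rust
// Hypothetical C-ABI wrapper around a CUDA kernel launch, e.g. in kernels.cu:
//   extern "C" void launch_saxpy(float a, const float* x, float* y, size_t n);
// (On edition 2024 this block would be written `unsafe extern "C"`.)
extern "C" {
    fn launch_saxpy(a: f32, x: *const f32, y: *mut f32, n: usize);
}

pub fn saxpy(a: f32, x: &[f32], y: &mut [f32]) {
    assert_eq!(x.len(), y.len());
    // SAFETY: assumes the wrapper copies data to/from the device and
    // synchronizes before returning, so the slices remain valid for the call.
    unsafe { launch_saxpy(a, x.as_ptr(), y.as_mut_ptr(), x.len()) };
}
```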
Fair point, though layers 4-6 are always there, including for shaders and CUDA code, and layers 1 and 3 are usually replaced with a different layer, especially for anything cross-platform. So this Rust project might be adding a layer of abstraction, but probably only one-ish.
I work on layers 4-6 and I can confirm there’s a lot of hidden complexity in there. I’d say there are more than 3 layers there too. :P
That looks like the graphics stack of a modern game engine. Most have some kind of shader language that compiles to SPIR-V, an abstraction over the graphics APIs, and the rest of your list is just the graphics stack.
It's not all that much worse than a compiler and runtime targeting multiple CPU architectures, with different calling conventions, endianness, etc., and at the hardware level different firmware and microcode.
But that's not the fault of the new abstraction layers, it's the fault of the GPU industry and its outrageous refusal to coordinate on anything, at all, ever. Every generation of GPU from every vendor has its own toolchain, its own ideas about architecture, its own entirely hidden and undocumented set of quirks, its own secret sauce interfaces available only in its own incompatible development environment...
CPUs weren't like this. People figured out a basic model for programming them back in the 60's and everyone agreed that open docs and collabora-competing toolchains and environments were a good thing. But GPUs never got the memo, and things are a huge mess and remain so.
All the folks up here in the open source community can do is add abstraction layers, which is why we have thirty seven "shading languages" now.
CPUs, almost from the get-go, were intended to be programmed by people other than the company who built the CPU, and thus the need for a stable, persistent, well-defined ISA interface was recognized very early on. But for pretty much every other computer peripheral, the responsibility for the code running on those embedded processors has been with the hardware vendor, their responsibility ending at providing a system library interface. With literal decades of experience in an environment where they're freed from the burden of maintaining stable low-level details, all of these development groups have quite jealously guarded access to that low level and actively resist any attempts to push the interface layers lower.
As frustrating as it is, GPUs are actually the most open of the accelerator classes, since they've been forced to accept a layer like PTX or SPIR-V; trying to do that with other kinds of accelerators is really pulling teeth.
The fact that the upper parts of the stack are so commoditized (i.e. CUDA and WGSL do not in fact represent particularly different modes of computation, and of course the linked article shows that you can drive everything pretty well with scalar rust code) argues strongly against that. Things aren't incompatible because of innovation, they're incompatible because of expedience and paranoia.
Even games, the epitome of performance, have 5 levels of abstraction (including your 4, 5, 6, plus an engine layer and game code). This isn't new in GPU/graphics programming IMHO.
> It's also questionable how well performance-relevant platform specifics survive all these layers.
Fair point, but one of Rust's strengths is the many zero-cost abstractions it provides. And the article talks about how the code compiles to GPU-specific machine code or IR. Ultimately the efficiency and optimization abilities of that compiler are going to determine how well your code runs, just like in any other compilation process.
This project doesn't even add that much. In "traditional" GPU code, you're still going to have:
1. Domain specific GPU code in whatever high-level language you've chosen to work in for the target you want to support. (Or more than one, if you need it, which isn't fun.)
...
3. Compiler that compiles your GPU code into whatever machine code or IR the GPU expects.
4. Vulkan, OpenGL, DX12 and Metal...
5. Drivers...
6. Hardware...
So yes, there's an extra layer here. But I think many developers will gladly take on that trade off for the ability to target so many software and hardware combinations in one codebase/binary. And hopefully as they polish the project, debugging issues will become more straightforward.