I suspect this article may even be underestimating the impact of WebGPU. I'll make two observations.
First, for AI and machine learning type workloads, the infrastructure situation is a big mess right now unless you buy into the Nvidia / CUDA ecosystem. If you're a researcher, you pretty much have to, but increasingly people will just want to run models that have already been trained. Fairly soon, WebGPU will be an alternative that more or less Just Works, although I do expect things to be rough in the early days. There's also a performance gap, but I can see it closing.
Second, for compute shaders in general (potentially accelerating a large variety of tasks), the barrier to entry falls dramatically. That's especially true on web deployments, where running your own compute shader costs somewhere around 100 lines of code. But it becomes practical on native too, especially in Rust, where you can just pull in a wgpu dependency.
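To make the "around 100 lines" figure concrete, here's a minimal sketch of the web path in TypeScript (assuming a browser with WebGPU enabled and the WebGPU type definitions available); the doubling shader and buffer setup are made up purely for illustration:

    const shaderSource = /* wgsl */ `
      @group(0) @binding(0) var<storage, read_write> data: array<f32>;

      @compute @workgroup_size(64)
      fn main(@builtin(global_invocation_id) gid: vec3<u32>) {
        if (gid.x < arrayLength(&data)) {
          data[gid.x] = data[gid.x] * 2.0;
        }
      }
    `;

    async function runDouble(input: Float32Array): Promise<Float32Array> {
      const adapter = await navigator.gpu.requestAdapter();
      if (!adapter) throw new Error("WebGPU not available");
      const device = await adapter.requestDevice();

      // Storage buffer the shader reads and writes.
      const storage = device.createBuffer({
        size: input.byteLength,
        usage: GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_SRC | GPUBufferUsage.COPY_DST,
      });
      device.queue.writeBuffer(storage, 0, input);

      // Staging buffer for reading the result back on the CPU.
      const readback = device.createBuffer({
        size: input.byteLength,
        usage: GPUBufferUsage.COPY_DST | GPUBufferUsage.MAP_READ,
      });

      const pipeline = device.createComputePipeline({
        layout: "auto",
        compute: { module: device.createShaderModule({ code: shaderSource }), entryPoint: "main" },
      });
      const bindGroup = device.createBindGroup({
        layout: pipeline.getBindGroupLayout(0),
        entries: [{ binding: 0, resource: { buffer: storage } }],
      });

      // Record and submit the dispatch, then copy the result out for readback.
      const encoder = device.createCommandEncoder();
      const pass = encoder.beginComputePass();
      pass.setPipeline(pipeline);
      pass.setBindGroup(0, bindGroup);
      pass.dispatchWorkgroups(Math.ceil(input.length / 64));
      pass.end();
      encoder.copyBufferToBuffer(storage, 0, readback, 0, input.byteLength);
      device.queue.submit([encoder.finish()]);

      await readback.mapAsync(GPUMapMode.READ);
      const result = new Float32Array(readback.getMappedRange().slice(0));
      readback.unmap();
      return result;
    }

The native Rust path with wgpu follows essentially the same shape, nearly call for call.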
As for text being one of the missing pieces, I'm hoping Vello and supporting infrastructure will become one of the things people routinely reach for. That'll get you not just text but nice 2D vector graphics with fills, strokes, gradients, blend modes, and so on. It's not production-ready yet, but I'm excited about the roadmap.
[Note: very lightly adapted from a comment at cohost; one interesting response was by Tom Forsyth, suggesting I look into SYCL]
WebGPU has no equivalent to tensor cores to my understanding; are there plans to add something like this? Or would this be "implementation sees matmul-like code; replaces with tensor core instruction"? For optimal performance, my understanding is that you need tight control of e.g. shared memory as well -- is that possible with WebGPU?
On NVIDIA GPUs, flops without tensor cores are roughly 1/10th of flops with tensor cores, so this is a pretty big deal for inference and definitely for training.
Shared memory, yes, with the goodies: atomics and barriers. We rely on that heavily in Vello, so we've pushed very hard on it. For example, WebGPU introduces the "workgroupUniformLoad" built-in, which lets you broadcast a value to all threads in the workgroup while not introducing potential unsafety.
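For a rough picture of what that built-in buys you (hypothetical WGSL, shown here as a TypeScript string so it could go straight into createShaderModule), the broadcast pattern looks something like this:

    const broadcastShader = /* wgsl */ `
      var<workgroup> tile_start: u32;

      @compute @workgroup_size(256)
      fn main(@builtin(local_invocation_id) lid: vec3<u32>,
              @builtin(workgroup_id) wid: vec3<u32>) {
        if (lid.x == 0u) {
          // One thread produces the value...
          tile_start = wid.x * 256u;
        }
        // ...and workgroupUniformLoad acts as a barrier plus a uniform
        // broadcast, so every thread observes the same value without
        // risking non-uniform control flow hazards.
        let start = workgroupUniformLoad(&tile_start);
        // ...use `start` uniformly across the workgroup from here on.
      }
    `;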
Tensor cores: I can't say there are plans to add it, but it's certainly something I would like to see. You need subgroups in place first, and there's been quite a bit of discussion[1] on that as a likely extension post-1.0.
I haven't tried it myself, but it looks like several projects are already looking at implementing machine learning with WebGPU, and that this is one of the goals of WebGPU. Some info I found:
This is the discussion I hoped to find when clicking on the comments.
> Fairly soon, WebGPU will be an alternative...
So while the blog focused on the graphical utility of WebGPU, the underlying implementation is currently about letting websites/apps interface with the GPU in a more direct and advantageous way to render graphics.
But what you're suggesting is that in the future new functionality will likely be added to take advantage of your GPU in other ways, such as training ML models and then using them via an inference engine all powered by your local GPU?
Is the reason you can't accomplish that today bc APIs haven't been created or opened up to allow such workloads? Are there not lower level APIs available/exposed today in WebGPU that would allow developers to begin the design of browser based ML frameworks/libraries?
Was it possible to interact with the GPU before WebGPU via Web Assembly?
Other than ML and graphics/games (and someone is probably going to mention crypto), are there any other potentially novel uses for WebGPU?
> ...in the future new functionality will likely be added to take advantage of your GPU in other ways, such as training ML models and then using them via an inference engine all powered by your local GPU?
Yes.
> Is the reason you can't accomplish that today bc APIs haven't been created or opened up to allow such workloads? Are there not lower level APIs available/exposed today in WebGPU that would allow developers to begin the design of browser based ML frameworks/libraries?
That is correct: before WebGPU there was no way to access the compute capability of GPU hardware through the Web. There have been some hacks based on WebGL, but those are seriously limited. The fragmentation of the existing API space is a major reason we haven't seen as much progress on this.
> Was it possible to interact with the GPU before WebGPU via Web Assembly?
Only in limited ways through WebGL - no access to workgroup shared memory, no ability to do random-access writes to storage buffers, etc.
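To illustrate the difference, here's a hypothetical WGSL kernel (again just a TypeScript string) using exactly the two capabilities mentioned above - workgroup shared memory with atomics, and random-access writes into a storage buffer - neither of which WebGL shaders can express:

    const histogramShader = /* wgsl */ `
      @group(0) @binding(0) var<storage, read> input: array<u32>;
      @group(0) @binding(1) var<storage, read_write> histogram: array<atomic<u32>, 256>;

      // Workgroup shared memory, zero-initialized for each workgroup.
      var<workgroup> local_hist: array<atomic<u32>, 256>;

      @compute @workgroup_size(256)
      fn main(@builtin(global_invocation_id) gid: vec3<u32>,
              @builtin(local_invocation_id) lid: vec3<u32>) {
        // Accumulate into fast workgroup-shared memory first...
        if (gid.x < arrayLength(&input)) {
          atomicAdd(&local_hist[input[gid.x] & 255u], 1u);
        }
        workgroupBarrier();
        // ...then each thread flushes one bin with a random-access
        // write into the storage buffer.
        atomicAdd(&histogram[lid.x], atomicLoad(&local_hist[lid.x]));
      }
    `;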
> Other than ML and graphics/games (and someone is probably going to mention crypto), are there any other potentially novel uses for WebGPU?
Yes! There is research on doing parallel compilers on GPU (Aaron Hsu's co-dfns as well as Voetter's work[1]). There's quite a bit of work on implementing Fourier transforms at extremely high throughput. Obviously, physics simulations and other scientific workloads are a good fit. To me, it feels like things are wide open.
> Was it possible to interact with the GPU before WebGPU via Web Assembly?
Only with tons of restrictions by going through WebGL, or, outside the browser, by writing your own WASM runtime (or extending an existing one) that connects to system-native GPU APIs like D3D12, Vulkan or Metal.
> are there any other potentially novel uses for WebGPU
In general, the only thing that WebGPU has over calling into D3D12, Vulkan or Metal directly is that it provides a common subset of those 3 APIs (so you don't need to write 2 or 3 implementations to cover all popular operating systems) and it's easier to use (at least compared to D3D12 and Vulkan).
WebGPU is an important step in the right direction, but not a 'game changer' per se (at least outside the browser).
Could someone explain what kinds of useful things will become possible with it?
I don't get it yet, but HN seems excited about it, so I'd like to understand it.
What I get so far - running models that fit on consumer-sized GPUs will become easier, because users won't need to download a desktop app to do so. This is limited for now by the lack of useful models that can run on consumer GPUs, but we'll get smaller models in the future. And it'll be easier to make visualizations, games, and VR apps in the browser. Is that right, and what other popular use cases am I missing where people currently have to resort to WebGL or to building a desktop app?
It's not so much that people want to run the models in the browser. They want to be able to write and publish one desktop app that e.g. runs a LLaMA quality model and runs decently across the hundreds of different GPUs that exist on their users' machines.
If this is something users actually care about, wouldn't Vulkan already be a big deal? It has way more features for AI than WebGPU will likely ever have, and it's widely available today.
It spends a little time on the topic, and imho doesn't argue the point very well.
It points out that MoltenVK doesn't work very well, and I agree, although I think it's questionable whether the Molten-compatible subset is actually worse than WebGPU.
It says that Vulkan has driver issues on Linuxes, but provides no citations for that, and in my experience, Vulkan drivers and developer tools have been the best on Linux overall (my experience is limited to Nvidia, so maybe that's why, I don't know).
This fixes the main problem with Vulkan, which is that there were no big tech companies pushing it. WebGPU has Apple, Google and Microsoft all committing to support it in their browser/OS.
Not quite. The main problem with Vulkan is, as the blog post goes into, one of usability. Vulkan isn't designed for end developers to use, it's designed for middleware vendors to use. Vulkan is already being pushed by many big tech companies, including Google, Nvidia, AMD, etc. It's really only Apple that's a problem here.
WebGPU is basically middleware for those who just want to use graphics without pulling in all of Unity, Unreal, etc.
The Vulkan situation on Apple is incredibly funny, especially once you look at the landscape of gaming. There are games out there (I believe Final Fantasy XIV is like this) where the official Mac client runs in Wine, using a DirectX -> Vulkan translation layer, on top of a Vulkan -> Metal translation layer.
I'm not sure this is accurate. More to the point, I'm not sure these are the right companies to do the pushing.
Many companies have tried to unseat Nvidia. Many of them big companies. There is no magic in the ones you named. Worse, I'm not convinced that Google will bring stability to the project. So I see little reason to be overly excited.
Coming at it from a graphics processing perspective, working on a lot of video editing, it's annoying that just as GPUs were starting to become affordable, with people turning their backs on cryptobro idiocy and no longer chasing the Dunning-Krugerrand, they've started to get expensive again because people want hardware-accelerated Eliza chatbots.
Anyway your choices for GPU computing are OpenCL and CUDA.
If you write your project in CUDA, you'll wish you'd used OpenCL.
If you write your project in OpenCL, you'll wish you'd used CUDA.
I've never once regretted my decision to write GPU code in CUDA... I mean, I wish there were alternatives because being locked into Nvidia isn't fun, but CUDA is a great developer experience.
The thought also came to mind, but after listening to the work of Neural Magic on Practical AI [1], and seeing how model quantization on CPU is advancing by leaps and bounds, I don't foresee our strong dependence on CUDA persisting, even in the near future.