
There is at least one definite performance hit: wgpu (and, I think, WebGPU too) only supports a single queue. That means you can't run compute tasks asynchronously alongside render tasks.

Additionally, wgpu (the library) will insert a fence between any two passes that have a read-write dependency on a binding, even when no fence is technically needed because the two passes never access the same indices.
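To make the cost concrete, here is a toy model of that kind of conservative tracking. It is only a sketch: `Access`, `Pass`, `conflicts` and `count_barriers` are hypothetical names I made up for illustration, not wgpu types or its actual tracking code. It counts how many synchronization points a tracker needs between consecutive passes when it reasons per whole buffer versus per index range.

```rust
use std::ops::Range;

/// One access a pass makes to a bound buffer.
/// Hypothetical model for illustration, not a wgpu type.
struct Access {
    buffer: u32,         // made-up buffer id
    range: Range<usize>, // element indices the pass really touches
    write: bool,         // true if the pass writes to the buffer
}

/// A pass is modelled as the list of accesses it performs.
type Pass = Vec<Access>;

/// Two accesses conflict if they hit the same buffer and at least one writes.
/// With `per_range` set, they must also overlap in the indices they touch.
fn conflicts(a: &Access, b: &Access, per_range: bool) -> bool {
    if a.buffer != b.buffer || !(a.write || b.write) {
        return false;
    }
    if !per_range {
        // Whole-binding tracking: any shared buffer with a write is a hazard.
        return true;
    }
    a.range.start < b.range.end && b.range.start < a.range.end
}

/// Counts the synchronization points needed between consecutive passes.
fn count_barriers(passes: &[Pass], per_range: bool) -> usize {
    passes
        .windows(2)
        .filter(|w| {
            w[0].iter()
                .any(|a| w[1].iter().any(|b| conflicts(a, b, per_range)))
        })
        .count()
}
```

If pass A writes elements 0..64 of a buffer and pass B writes elements 64..128 of the same buffer, whole-binding tracking counts one synchronization point while range tracking counts none. A real tracker usually can't know the ranges, since they depend on what the shader does, which is presumably why the conservative answer is used.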

Finally, I know there is an algorithm called decoupled look-back that can speed up prefix sums, but it requires a forward-progress guarantee. All recent NVIDIA cards can run it, but I don't think AMD can, so WebGPU can't rely on it in general. Raph Levien has a blog post on the subject: https://raphlinus.github.io/gpu/2021/11/17/prefix-sum-portab...
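For anyone who hasn't seen it, here is a rough CPU-side sketch of the idea, with one OS thread standing in for each workgroup. This is my own simplified illustration, not the code from the blog post or from any GPU library; `inclusive_scan`, `pack`, `unpack` and the flag names are made up for the example. Each tile publishes its local sum, then walks backwards over its predecessors, adding up their sums until it finds one that has already published a full inclusive prefix.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;

// Per-tile status flags, stored in the low two bits of one atomic word.
const NOT_READY: u64 = 0;
const AGGREGATE: u64 = 1; // value = sum of this tile only
const PREFIX: u64 = 2; // value = inclusive sum of every tile up to this one

fn pack(flag: u64, value: u32) -> u64 {
    (u64::from(value) << 2) | flag
}

fn unpack(word: u64) -> (u64, u32) {
    (word & 3, (word >> 2) as u32)
}

/// Inclusive prefix sum (wrapping on overflow), one thread per tile.
fn inclusive_scan(input: &[u32], tile: usize) -> Vec<u32> {
    assert!(tile > 0);
    let mut out = vec![0u32; input.len()];
    let n_tiles = (input.len() + tile - 1) / tile;
    let states: Vec<AtomicU64> = (0..n_tiles)
        .map(|_| AtomicU64::new(pack(NOT_READY, 0)))
        .collect();
    let states = &states;

    thread::scope(|s| {
        for (t, (src, dst)) in input.chunks(tile).zip(out.chunks_mut(tile)).enumerate() {
            s.spawn(move || {
                // Step 1: reduce the tile and publish the result right away.
                // Tile 0 has no predecessors, so its sum is already a prefix.
                let aggregate = src.iter().fold(0u32, |a, &x| a.wrapping_add(x));
                let flag = if t == 0 { PREFIX } else { AGGREGATE };
                states[t].store(pack(flag, aggregate), Ordering::Release);

                // Step 2: look back until a predecessor has a full prefix.
                let mut exclusive = 0u32;
                let mut i = t;
                while i > 0 {
                    let (flag, value) = unpack(states[i - 1].load(Ordering::Acquire));
                    match flag {
                        // Spin. This only terminates if the predecessor is
                        // guaranteed to make progress eventually.
                        NOT_READY => thread::yield_now(),
                        AGGREGATE => {
                            exclusive = exclusive.wrapping_add(value);
                            i -= 1;
                        }
                        _ => {
                            exclusive = exclusive.wrapping_add(value);
                            break;
                        }
                    }
                }

                // Step 3: upgrade our own entry so later tiles can stop here.
                if t > 0 {
                    let inclusive = exclusive.wrapping_add(aggregate);
                    states[t].store(pack(PREFIX, inclusive), Ordering::Release);
                }

                // Step 4: scan the tile locally, offset by the exclusive prefix.
                let mut running = exclusive;
                for (d, &x) in dst.iter_mut().zip(src) {
                    running = running.wrapping_add(x);
                    *d = running;
                }
            });
        }
    });
    out
}
```

Calling `inclusive_scan(&[1, 2, 3, 4, 5], 2)` should give `[1, 3, 6, 10, 15]`. The spin in step 2 is the part that needs forward progress: on a CPU the OS scheduler eventually runs the predecessor thread, but a GPU that never schedules the predecessor workgroup while a later one spins would hang.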



