
Isn't matrix multiplication the main component of AI workloads? What makes it so hard to create a good alternative API for it?



It's a lot more complicated than just writing a matrix multiplication kernel, because there are all sorts of operations you need on top of matrix multiplication (nonlinearities, various ways of manipulating the data), and this sort of effort is only really worthwhile if it's well optimized.

On top of that, AMD's compute stack is fairly immature. Their OpenCL support is buggy, and ROCm compiles device-specific code, so it supports very little hardware and makes distributing compiled binaries kind of unrealistic. Then, getting to the optimization aspect, NVIDIA has many tools that provide detailed information on the GPU's behavior, making it much easier to identify bottlenecks and optimize. AMD is still working on these.

Finally, NVIDIA went out of its way to support ML applications. They provide a lot of their own tooling to make their GPUs easier to use. AMD seems to have struggled on the "easier" part.
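
To make the "operations on top of matrix multiplication" point concrete, here is a minimal sketch in plain Python of a single dense layer. The function names (`matmul`, `add_bias`, `relu`, `transpose`, `dense_layer`) are made up for illustration and are not any library's API. Even this toy layer needs a bias add and a nonlinearity besides the matmul, and a real backend would need an optimized GPU kernel for each of them:

```python
# Illustrative only: hypothetical helper names, pure Python, no GPU involved.

def matmul(a, b):
    """Naive matrix product of two lists-of-rows."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]

def add_bias(x, bias):
    """Add a bias vector to every row (a simple data-manipulation op)."""
    return [[v + bias[j] for j, v in enumerate(row)] for row in x]

def relu(x):
    """Elementwise nonlinearity."""
    return [[max(0.0, v) for v in row] for row in x]

def transpose(x):
    """Another data-manipulation op that frameworks use constantly."""
    return [list(col) for col in zip(*x)]

def dense_layer(inputs, weights, bias):
    """One layer: matmul, then bias add, then nonlinearity."""
    return relu(add_bias(matmul(inputs, weights), bias))

inputs = [[1.0, -2.0], [0.5, 3.0]]
weights = [[1.0, 0.0], [0.0, -1.0]]
bias = [0.5, 0.5]
result = dense_layer(inputs, weights, bias)
print(result)
```

Only one of the four helpers is the matmul itself; the others stand in for the long tail of operations an alternative stack would also have to implement and tune.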


Well, I think there are two types, right? Tensor cores (which, afaik, AMD doesn't have), which are better for matrix ops, and CUDA cores, which are better for general parallel ops.

Maybe someone more clever than me can go into the specifics; I only understand the minimum of the low-level GPU details.

Nice high-level document:

[0] https://www.acecloudhosting.com/blog/cuda-cores-vs-tensor-co...


I think the API for matrix multiplication is just part of the issue. CUDA tooling has better ergonomics, is easier to set up, and is treated as a first-class citizen in tools like TensorFlow and PyTorch.

So, while I can't talk about the hardware differences in detail, developer experience is firmly on NVIDIA's side, and AMD now has a moat to cross to catch up.


There's NCCL, GPUDirect, NVLink, and so on and so forth. It is not just matmul on GPUs.



