AMD has great hardware, but its software still leaves a lot to be desired for AMD to be a major AI hardware player. It takes years of unwavering leadership focus and hundreds of millions (probably billions) of dollars to get the software that works well for AI users.
The role software played in taking NVIDIA from a run-of-the-mill video card manufacturer to the top dog in AI hardware, with a $4T market cap, is often underappreciated. My 2c.
> AMD has great hardware, but its software still leaves a lot to be desired for AMD to be a major AI hardware player. It takes years of unwavering leadership focus and hundreds of millions (probably billions) of dollars to get the software that works well for AI users.
It does, but they have a capable CEO with a vision and broad support from the board - Ryzen was a decade-long overnight success.
Zen is a success. But Zen is hardware, and AMD is (historically) a hardware company. Delivering software is hard, even if you're a software company. I wouldn't take it as given that a good (hardware) CEO, vision, and board support are sufficient to build the required software organisation, especially given their track record on this front to date. It is more likely that Modular is AMD's software savior. I won't speculate on how probable that is.
AMD being able to benefit from AI, and this OpenAI relationship, is a bit different though. This is about using AMD hardware for training and presumably inference of LLMs. The users will be people consuming OpenAI APIs and services running on AMD hardware, not people themselves writing custom ML applications using AMD libraries.
Maybe also worth noting that some of the world's largest supercomputers (e.g. Oak Ridge's "Frontier" exascale computer) are based on AMD AI processors - I've no idea what drivers/libraries are being used to program these, but presumably they are reliable. I doubt they are using CUDA compatibility libraries.