
Anyone following geohot's current tinygrad struggles has seen this proven right in front of them. AMD GPUs are practically unusable for any serious ML work, and he had to learn it the hard way, having dropped 100K into AMD GPUs on the assumption that the drivers would work. He even personally offered to fix them if they didn't.


It's not just geohot: AMD has a really amazing opportunity here to work with many other talented engineers who would be more than willing to put resources into this if they had good documentation and tools to work with. It's really puzzling to me that AMD isn't taking this opportunity more seriously.


A shortened version of what was happening: the firmware / hardware is so bad that instead of fixing it, the AMD team just added some restarts for when the AMD card locks up (which happens all the time with computations), and even those restarts don't work, so the whole computer has to be restarted.

This serious bug has been open since May, and AMD doesn't seem to be responding as seriously as it should.


Is this the same geohot that 9+ months ago declared he was "done with AMD"?

Isn't geohot infamous for stealing other people's work?

PBCAK?

That said, ROCm only officially supports a fraction of AMD's product line, and an odd smattering throughout at that. It's a joke compared to CUDA, which will run on damn near anything. And AMD has a long, long history of dogshit drivers (at least on Windows).

AMD just doesn't seem to give enough of a shit to invest money into securing top talent for this, and NVIDIA will continue to stomp them.


Yeah, the same guy that was going to single-handedly fix Twitter's search for Elon and resigned after 4 weeks saying there was nothing he could do.


> Isn't geohot infamous for stealing other people's work?

Do you mean the Sony PlayStation hacking, where Sony took legal action against him, or do you mean other stuff?


In lieu of a real answer, here's more HN conjecture:

https://news.ycombinator.com/item?id=30740509


That same bug is still open, not fixed. Azure announced access to an AMD GPU cloud under NDA, but the cards are unusable for compute work as they lock up randomly.


I saw him on Twitch today in passing, the title was about "ripping <something> out of AMD drivers" or similar, so it seems he's still at it.


AMD is a deeply unserious company. They could have made a boatload of money for shareholders like Nvidia did, but the AMD management looks very bad to me.

Shareholders of AMD should look into it and do some firings of top executives/CEO until morale improves.


I think the problem AMD has is that they just don't have enough engineers and can't hire more because Nvidia (and to a lesser extent Apple, AWS, Google and Microsoft) just gobbles up all the people who have any experience with this sort of thing.

A long time ago AMD decided to focus 100% on budget consumer graphics (including consoles), and that was the right decision at the time. However, being in a low-margin business, it seems they don't have the people (or the budget to last-minute hire) to pump out the R&D for a generic neural network platform without moving people away from their consumer graphics division.



