AI models are mostly matrix multiplications, and have been for a few years now, which is longer than a hardware cycle. Moreover, if the structure changes, the hardware changes too, whether it's general purpose or not, because it then has to be optimized for the new structure.
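To put a rough number on "mostly matrix multiplications", here is a back-of-the-envelope sketch for one feed-forward block of a transformer. The layer sizes and the one-operation-per-element cost for the activation are assumptions picked for illustration, not measurements of any particular model.

```python
def matmul_flops(m, k, n):
    """FLOPs for an (m x k) @ (k x n) product: one multiply and one add per term."""
    return 2 * m * k * n


def ffn_matmul_share(tokens, d_model, d_ff):
    """Fraction of a feed-forward block's FLOPs spent in its two matmuls.

    Assumes the block is: project up (d_model -> d_ff), elementwise
    activation, project down (d_ff -> d_model).
    """
    matmul = matmul_flops(tokens, d_model, d_ff) + matmul_flops(tokens, d_ff, d_model)
    # Assumption: the activation costs about one operation per element.
    elementwise = tokens * d_ff
    return matmul / (matmul + elementwise)


if __name__ == "__main__":
    # Hypothetical sizes, chosen only to show the order of magnitude.
    share = ffn_matmul_share(tokens=1, d_model=4096, d_ff=16384)
    print(f"matmul share of FLOPs: {share:.4%}")
```

Even if the activation cost ten times what is assumed here, the matmuls would still account for well over 99% of the arithmetic in this block, which is why hardware that is fast at matrix multiplication matters so much.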
Everybody cares about VRAM right now, yet you can get a P40 with 24GB for 10% of the price of a 24GB RTX 4090. Why? The P40 has no tensor cores, the units that accelerate matrix multiplication.