SRAM density has essentially stopped scaling, judging by TSMC's published N3E specs and their planned N2 node. So if models are tens of GB in size, I don't see how their proposed chips can be built economically.
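As a rough sanity check, here's a back-of-the-envelope sketch of how much die area it takes to hold a model entirely in SRAM. The 0.021 µm²/bit figure is the widely reported N5/N3E bitcell size (the point of the scaling complaint), and the ~858 mm² reticle limit is an assumption; peripheral circuitry is ignored, so real area would be even larger.

```python
# Back-of-the-envelope: die area to hold a whole model in on-chip SRAM.
# Assumes ~0.021 um^2 per SRAM bitcell (reported N5/N3E figure, since the
# bitcell stopped shrinking) and ignores sense amps, decoders, routing, etc.
BITCELL_UM2 = 0.021   # um^2 per bit (assumed)
RETICLE_MM2 = 858     # approx. max die size per reticle (assumed)

def sram_area_mm2(model_gb: float) -> float:
    bits = model_gb * 8 * 1e9         # GB -> bits
    return bits * BITCELL_UM2 / 1e6   # um^2 -> mm^2

area = sram_area_mm2(20)              # a modest 20 GB model
print(f"{area:.0f} mm^2, ~{area / RETICLE_MM2:.0f} reticle-sized dies")
```

Even a 20 GB model lands in the thousands of mm² of raw SRAM, i.e. multiple full-reticle dies just for weights, which is why the economics look hard without DRAM.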
Also, a GPU is already an ASIC but with a fancy name.
Nowadays GPUs sacrifice some performance for better programmability, while ASICs trade programmability for better performance and energy efficiency; it's really a question of how 'specific' you want the chip to be. For applications as important and popular as LLMs, we probably do want a very 'specific' chip.