SRAM density has essentially stopped scaling, judging by TSMC's published N3E specs and their planned N2 node. So if models are tens of GB in size, I don't see how their proposed chips can be built economically.
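As a rough sanity check, here's a back-of-the-envelope sketch of how much die area it takes to hold a model entirely in SRAM. The 0.021 µm²/bit figure is the widely reported N5/N3E bitcell size (the point of the scaling complaint), and the ~858 mm² reticle limit is an assumption; peripheral circuitry is ignored, so real area would be even larger.

```python
# Back-of-the-envelope: die area to hold a whole model in on-chip SRAM.
# Assumes ~0.021 um^2 per SRAM bitcell (reported N5/N3E figure, since the
# bitcell stopped shrinking) and ignores sense amps, decoders, routing, etc.
BITCELL_UM2 = 0.021   # um^2 per bit (assumed)
RETICLE_MM2 = 858     # approx. max die size per reticle (assumed)

def sram_area_mm2(model_gb: float) -> float:
    bits = model_gb * 8 * 1e9         # GB -> bits
    return bits * BITCELL_UM2 / 1e6   # um^2 -> mm^2

area = sram_area_mm2(20)              # a modest 20 GB model
print(f"{area:.0f} mm^2, ~{area / RETICLE_MM2:.0f} reticle-sized dies")
```

Even a 20 GB model lands in the thousands of mm² of raw SRAM, i.e. multiple full-reticle dies just for weights, which is why the economics look hard without DRAM.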
Also, a GPU is already an ASIC but with a fancy name.
Nowadays GPUs sacrifice some performance for better programmability, while ASICs trade programmability for better performance and energy efficiency; it's really a question of how 'specific' you want the chip to be. For applications as important and popular as LLMs, we probably do want a very 'specific' chip.