I strongly suspect this is Nvidia's next big move: a giant CPU/GPU hybrid with unified memory. Who doesn't want Threadripper-scale Nvidia/ARM cores?
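For anyone who hasn't played with it: CUDA already exposes a software version of this via managed memory (cudaMallocManaged). On discrete cards it's backed by page migration over PCIe rather than a truly shared physical pool, but the programming model is what a real hardware-unified part would keep. A minimal sketch:

    #include <cstdio>
    #include <cuda_runtime.h>

    __global__ void scale(float *x, int n, float a) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) x[i] *= a;   // GPU writes the same allocation the CPU reads
    }

    int main() {
        const int n = 1 << 20;
        float *x;
        cudaMallocManaged(&x, n * sizeof(float)); // one pointer, visible to CPU and GPU
        for (int i = 0; i < n; ++i) x[i] = 1.0f;  // CPU initializes in place, no cudaMemcpy
        scale<<<(n + 255) / 256, 256>>>(x, n, 2.0f);
        cudaDeviceSynchronize();                  // wait before the CPU touches the data again
        printf("x[0] = %f\n", x[0]);              // prints 2.0
        cudaFree(x);
        return 0;
    }

On a part with hardware-coherent CPU/GPU memory the same code runs without the migration overhead, which is the whole appeal.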
Yes, unified memory being the key here. AFAIK GPUs still use a hub-and-spoke model, where all processing is done in the hub but the data is stored on the rim. They achieve their speedup by increasing the number of spokes (memory channels). You could do better by moving the data processing to the spokes as well (essentially processing-in-memory), and using the hub only for orchestration and synchronization.
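To make the hub-and-spoke point concrete: in a bandwidth-bound kernel every byte travels spoke-to-hub and back, so the memory channels, not the ALUs, set the speed. A rough sketch that measures this (the numbers it prints are whatever your card's DRAM can sustain, not a careful benchmark):

    #include <cstdio>
    #include <cuda_runtime.h>

    // Classic streaming kernel: 2 flops per element vs 12 bytes of DRAM
    // traffic (read x, read y, write y). Every byte crosses rim -> hub -> rim.
    __global__ void axpy(const float *x, float *y, int n, float a) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) y[i] = a * x[i] + y[i];
    }

    int main() {
        const int n = 1 << 26;                  // 64M floats, ~256 MB per array
        float *x, *y;
        cudaMalloc(&x, n * sizeof(float));
        cudaMalloc(&y, n * sizeof(float));
        cudaMemset(x, 0, n * sizeof(float));
        cudaMemset(y, 0, n * sizeof(float));

        cudaEvent_t t0, t1;
        cudaEventCreate(&t0); cudaEventCreate(&t1);
        cudaEventRecord(t0);
        axpy<<<(n + 255) / 256, 256>>>(x, y, n, 2.0f);
        cudaEventRecord(t1);
        cudaEventSynchronize(t1);

        float ms;
        cudaEventElapsedTime(&ms, t0, t1);
        double gbps = 12.0 * n / (ms * 1e6);    // bytes moved / elapsed time
        // This lands near the DRAM bandwidth spec while the ALUs mostly idle:
        // the spokes, not the hub, are the limit.
        printf("%.1f GB/s effective\n", gbps);
        cudaFree(x); cudaFree(y);
        return 0;
    }

At 2 flops per 12 bytes, even ~3 TB/s of HBM only feeds on the order of half a TFLOP, a few percent of what the hub can compute, which is exactly the gap that processing at the spokes would close.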
At least, it would be computationally better. But it wouldn't let you scale processing power and memory size independently, so it might require a different commercial model.