I strongly suspect this is Nvidia's next big move: a giant CPU/GPU hybrid with unified memory. Who doesn't want Threadripper-scale Nvidia/ARM cores?
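For anyone who hasn't played with it: CUDA already exposes a software version of this via managed memory (cudaMallocManaged). On discrete cards it's backed by page migration over PCIe rather than a truly shared physical pool, but the programming model is what a real hardware-unified part would keep. A minimal sketch:

    #include <cstdio>
    #include <cuda_runtime.h>

    __global__ void scale(float *x, int n, float a) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) x[i] *= a;   // GPU writes the same allocation the CPU reads
    }

    int main() {
        const int n = 1 << 20;
        float *x;
        cudaMallocManaged(&x, n * sizeof(float)); // one pointer, visible to CPU and GPU
        for (int i = 0; i < n; ++i) x[i] = 1.0f;  // CPU initializes in place, no cudaMemcpy
        scale<<<(n + 255) / 256, 256>>>(x, n, 2.0f);
        cudaDeviceSynchronize();                  // wait before the CPU touches the data again
        printf("x[0] = %f\n", x[0]);              // prints 2.0
        cudaFree(x);
        return 0;
    }

On a part with hardware-coherent CPU/GPU memory the same code runs without the migration overhead, which is the whole appeal.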
Yes, unified memory being the key here. AFAIK GPUs still use a hub-and-spoke model, where all processing is done in the hub but the data is stored on the rim. They achieve their speedup by increasing the number of spokes (memory channels). You could do better by moving the data processing to the spokes as well (essentially processing-in-memory), and using the hub only for orchestration and synchronization.
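To make the hub-and-spoke point concrete: in a bandwidth-bound kernel every byte travels spoke-to-hub and back, so the memory channels, not the ALUs, set the speed. A rough sketch that measures this (the numbers it prints are whatever your card's DRAM can sustain, not a careful benchmark):

    #include <cstdio>
    #include <cuda_runtime.h>

    // Classic streaming kernel: 2 flops per element vs 12 bytes of DRAM
    // traffic (read x, read y, write y). Every byte crosses rim -> hub -> rim.
    __global__ void axpy(const float *x, float *y, int n, float a) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) y[i] = a * x[i] + y[i];
    }

    int main() {
        const int n = 1 << 26;                  // 64M floats, ~256 MB per array
        float *x, *y;
        cudaMalloc(&x, n * sizeof(float));
        cudaMalloc(&y, n * sizeof(float));
        cudaMemset(x, 0, n * sizeof(float));
        cudaMemset(y, 0, n * sizeof(float));

        cudaEvent_t t0, t1;
        cudaEventCreate(&t0); cudaEventCreate(&t1);
        cudaEventRecord(t0);
        axpy<<<(n + 255) / 256, 256>>>(x, y, n, 2.0f);
        cudaEventRecord(t1);
        cudaEventSynchronize(t1);

        float ms;
        cudaEventElapsedTime(&ms, t0, t1);
        double gbps = 12.0 * n / (ms * 1e6);    // bytes moved / elapsed time
        // This lands near the DRAM bandwidth spec while the ALUs mostly idle:
        // the spokes, not the hub, are the limit.
        printf("%.1f GB/s effective\n", gbps);
        cudaFree(x); cudaFree(y);
        return 0;
    }

At 2 flops per 12 bytes, even ~3 TB/s of HBM only feeds on the order of half a TFLOP, a few percent of what the hub can compute, which is exactly the gap that processing at the spokes would close.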
At least, it would be computationally better. But it wouldn't let you scale processing power and memory size independently, so it might require a different commercial model.