GPUs aren't going to _replace_ CPUs any time soon, because GPUs are neither capable of nor designed for general-purpose processing. It's a fundamentally REALLY HARD problem to use GPUs to speed up arbitrary computations (solve it and you'll be a shoo-in for the Turing Award).
Businesses aren't going to be "moving" to GPUs. The "age of NVIDIA" is primarily predicated on its role in accelerating the training of the machine learning algorithms hyped up as "deep learning."
And AMD still very much has an uphill battle on both the CPU and the GPU fronts.
But yes, the excitement around Intel that used to be there is now gone. You can probably blame the "death of the PC" -- we teach kids coding in the Bay Area, and every single kid has an iPad, but not a laptop. Sure, professional engineers like us still have x86 laptops, but the average person does not.
It's not just about accelerating ML, specifically deep learning. There are many other enterprise technologies that can benefit from GPUs. One example: OLAP-focused databases (such as MapD - https://www.mapd.com/). For some benchmarks, check out this blog: http://tech.marksblogg.com/benchmarks.html.
The DL "training" use case is well known at this point, but many others are emerging.
A GPU database isn't that useful, because the arithmetic intensity (ops/byte) is relatively low. Cross-sectional memory bandwidth is what really matters; you can get similar effects with an appropriately provisioned cluster of CPU machines, with a shard or a replica of the database on each one. I say this as someone who has written a GPU in-memory database of sorts that is used at Facebook (Faiss). What gets interesting is when you can tie the lookup to something with higher arithmetic intensity before or after it on the GPU.
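To make the ops/byte argument concrete, here is a back-of-the-envelope sketch; the op and byte counts are simplified assumptions, not measurements from any particular system:

```python
# Back-of-the-envelope arithmetic intensity (ops per byte moved).
# All op/byte counts here are simplified assumptions, not measurements.

def scan_intensity(rows, bytes_per_value=4, ops_per_value=2):
    """Filtered SUM over one 32-bit column: roughly one compare and one
    add per value, while every byte of the column is streamed in."""
    ops = ops_per_value * rows
    bytes_moved = bytes_per_value * rows
    return ops / bytes_moved

def gemm_intensity(m, n, k, bytes_per_value=4):
    """C = A @ B in fp32: 2*m*n*k flops over (m*k + k*n + m*n) values
    touched, assuming ideal on-chip reuse."""
    flops = 2 * m * n * k
    bytes_moved = bytes_per_value * (m * k + k * n + m * n)
    return flops / bytes_moved

print(f"column scan: {scan_intensity(10**8):.2f} ops/byte")              # ~0.5
print(f"4096^3 GEMM: {gemm_intensity(4096, 4096, 4096):.0f} flops/byte")  # ~683
```

A bare column scan sits at well under one op per byte, orders of magnitude below what a GPU needs to keep its arithmetic units busy, which is why raw memory bandwidth rather than flops decides the outcome for that workload.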
GPUs are only really being used for machine learning due to the sequential dependence of SGD and the relatively high arithmetic intensity (flops/byte) of convolutions and certain GEMMs. The faster you can take a gradient descent step, the faster the wall-clock time to convergence, and you would lose either by limiting memory reuse (for conv/GEMM) or to communication overhead and latency if you attempted to split a single computation across multiple nodes. The Volta "tensor cores" (fp16 units) make the GPU less arithmetic-bound for operations such as convolution that require a GEMM-like operation, but because memory bandwidth did not increase by a similar factor, Volta is fairly unbalanced.
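As a rough illustration of that imbalance, the "machine balance" (peak flops divided by peak memory bandwidth) tells you how much arithmetic a kernel must do per byte fetched to stay compute-bound; the spec-sheet peaks below are approximate and should be treated as assumptions:

```python
# Machine balance: peak flops per byte of memory bandwidth.
# Spec-sheet peaks below are approximate and should be treated as assumptions.

def balance(peak_tflops, bandwidth_gb_s):
    """Flops a kernel must do per byte fetched from HBM to stay compute-bound."""
    return (peak_tflops * 1e12) / (bandwidth_gb_s * 1e9)

print(f"P100 fp32:        ~{balance(10.6, 732):.0f} flops/byte")
print(f"V100 fp32:        ~{balance(15.7, 900):.0f} flops/byte")
print(f"V100 tensor fp16: ~{balance(125.0, 900):.0f} flops/byte")
```

By this measure the fp16 tensor-core path on Volta needs roughly an order of magnitude more reuse per byte than the fp32 path to keep its units busy, which is the sense in which the flops grew faster than the memory system.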
The point about Intel not increasing its headline performance as much as GPUs have is also misleading. Intel CPUs are very good at branchy code and are latency-optimized, not throughput-optimized (as far as a general-purpose computer can be). Not everything we want to do, even in deep learning, will necessarily run well on a throughput-optimized machine.
Actually, in columnar databases the ops/byte intensity is significantly greater, and the GPU helps here (rough sketch at the end of this comment).
If you think about how a database CAN be built, instead of how databases have been built until now, you will find very interesting ideas that can and do make use of the GPU.
Research into this has been going on since 2006, with a lot of interesting papers published around 2008-2010.
There are also at least five different GPU databases around, each with its own characteristics and suitable use cases [1]...
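Here is the rough sketch mentioned above: a query touching 2 of 20 columns, comparing bytes streamed under row-oriented vs. columnar storage. All sizes and the compression ratio are assumptions for illustration, not benchmarks.

```python
# Illustrative only: bytes streamed for a query touching 2 of 20 columns,
# row-oriented vs. columnar layout. Sizes and ratios are assumptions.

ROWS = 10**8
COLS = 20
BYTES_PER_VALUE = 4
COLS_TOUCHED = 2
COMPRESSION = 4   # assumed ratio from dictionary / run-length encoding

row_store_bytes = ROWS * COLS * BYTES_PER_VALUE                           # whole rows are read
column_store_bytes = ROWS * COLS_TOUCHED * BYTES_PER_VALUE / COMPRESSION  # only touched columns, compressed

print(f"row store:    {row_store_bytes / 1e9:.1f} GB scanned")     # 8.0 GB
print(f"column store: {column_store_bytes / 1e9:.1f} GB scanned")  # 0.2 GB
```

Moving fewer bytes per useful comparison or aggregate is what pushes the effective ops/byte up into the range where a GPU's bandwidth and arithmetic both get exercised.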
I think they actually mean CPU = Intel and GPU = NVIDIA, because strictly speaking Intel is also a GPU manufacturer, albeit of low-end GPUs. NVIDIA will play a major role in AI and simulation; that much is clear.