Yep, we generally care about growing a few bandwidth numbers over what we have today:
- GPU<>CPU/RAM
- GPU<>storage
- GPU<>network
(- GPU<>GPU bandwidth is already insane, as is GPU compute speed)
These matter for cases like logs, where there is ~infinite off-GPU data (S3, storage, ...), yet the current PCIe/CPU path is a tiny straw that bottlenecks everything.
It's now ~easy to do stuff like regex search on the GPU, so seeing systems redesigned to quickly shove 1TB through a Python one-liner is awesome.
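To make that concrete, here's a minimal sketch of the kind of one-liner I mean, assuming a CUDA box with RAPIDS cuDF installed (the file name, column, and pattern are made up):

    import cudf

    # GPU CSV reader: bytes land in GPU memory, parsing runs on-device
    logs = cudf.read_csv("logs.csv")

    # the "one-liner": a regex over every row, executed on the GPU
    errors = logs[logs["message"].str.contains(r"ERROR|FATAL", regex=True)]

    print(len(errors))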
To get a feel for where all this is in practice, I did a fun talk w/ the Pavilion team for this year's GTC on building Graphistry UIs & interactive dashboards on top of this: https://pavilion.io/nvidia/
Edit: A good search term here is 'GPUDirect Storage', which is explicitly about skipping the CPU indirection and its bandwidth handcuffs. Tapping directly into the network or storage is super exciting for matching what the compute tier can do!
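For the storage side specifically, RAPIDS' kvikio library exposes NVIDIA's cuFile/GDS API from Python. A rough sketch, assuming kvikio + CuPy are installed and GDS is configured (the path and buffer size are made up):

    import cupy
    import kvikio

    # destination buffer already lives in GPU memory
    buf = cupy.empty(1024 * 1024, dtype=cupy.uint8)

    # with GDS enabled, the read DMAs storage -> GPU,
    # skipping the usual CPU bounce buffer
    with kvikio.CuFile("big.log", "r") as f:
        n = f.read(buf)  # number of bytes read

    print(n)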