I mean, more would certainly be great, but with 20 you can at least run an x16 GPU and an x4 NVMe SSD, which is perfectly OK for my personal setup; I can't do that with 16 lanes.
The chipset just multiplexes and it's fine. You're unlikely to actually use more than 16 lanes simultaneously in such a setup. But yes, Intel is being overly stingy with PCI-E lanes off of the CPU and leaning heavily on the chipset to compensate.
I'm hoping Ryzen 3rd gen bumps the lane count a bit more, otherwise Threadripper's 64 lanes look mighty nice...
The new Ryzens are meant to launch with PCIe 4.0 (this summer, I'm due an upgrade!), so effectively...
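To make the "effectively" concrete, here's a back-of-envelope calculation (my own numbers, not from the thread): PCIe 4.0 doubles the per-lane transfer rate over 3.0, so an x8 4.0 link carries roughly as much as an x16 3.0 link.

```python
# Approximate one-direction PCIe bandwidth per lane.
# PCIe 3.0: 8 GT/s, PCIe 4.0: 16 GT/s, both with 128b/130b encoding.
def lane_bandwidth_gbs(gen):
    rates = {3: 8.0, 4: 16.0}  # transfer rate in GT/s
    return rates[gen] * 128 / 130 / 8  # usable GB/s per lane

x16_gen3 = 16 * lane_bandwidth_gbs(3)  # ~15.75 GB/s
x8_gen4 = 8 * lane_bandwidth_gbs(4)    # same: doubled rate, half the lanes
print(f"PCIe 3.0 x16: {x16_gen3:.2f} GB/s")
print(f"PCIe 4.0 x8:  {x8_gen4:.2f} GB/s")
```

So the same lane count buys twice the bandwidth, assuming the devices on both ends are 4.0-capable.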
I'm surprised they don't come with more, especially since many of the current boards come with 2-3 x16 slots AND 2-3 M.2 slots. Both Intel and AMD have been adding more cores recently. Plus, if you're doing any GPU-based rendering, they're going to simultaneously move data back and forth from NVMe to the GPU's memory. But core counts grab the headlines, and most synthetic benchmarks will only test RAM, graphics and storage in isolation - sneaky!
> Plus, if you're doing any GPU-based rendering, they're going to simultaneously move data back and forth from NVMe to the GPU's memory.
Games, which are the primary market for a discrete GPU on these consumer platforms, definitely don't do this at all.
When they do stream in assets they do so slowly & in a controlled, rationed amount to minimize impact on FPS. They are far from being PCI-E bandwidth limited. That's kind of why you see almost no FPS drop at all when restricting GPUs to x8 bandwidth, even.
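A rough sanity check on why streaming doesn't saturate the link (the streaming rate below is a made-up ballpark for illustration, not a measured figure):

```python
# One-direction PCIe 3.0 bandwidth: 8 GT/s with 128b/130b encoding.
PCIE3_PER_LANE_GBS = 8.0 * 128 / 130 / 8  # ~0.985 GB/s per lane

x8_budget_gbs = 8 * PCIE3_PER_LANE_GBS    # ~7.88 GB/s on a restricted x8 link
asset_stream_gbs = 0.2                    # assume a game streams ~200 MB/s of assets
utilization = asset_stream_gbs / x8_budget_gbs
print(f"x8 link: {x8_budget_gbs:.2f} GB/s, streaming uses {utilization:.1%}")
```

Even at several times that assumed rate, asset streaming sits in the low single digits of an x8 link's capacity, which lines up with the negligible FPS impact.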
If you're doing something more workstation-y or custom, that'd be when AMD & Intel would point you at the HEDT platforms which have more than 16-20 lanes.
Are there any I/O heavy GPU workloads? Mining famously works with just one lane, offline rendering (e.g. Blender Cycles) I think also just uploads the whole scene once and then bounces the rays around…
There are GPU-accelerated database engines which can stream the database from NVMe. I am not sure if they do direct device-to-device transfer, which has only become supported recently, or whether they still need to bounce through main memory.
In raytracing, complex scenes can exceed your VRAM, so if you don't want to fall back to CPU tracing you need a renderer that can swap parts of the BVH in and out on demand.
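A toy sketch of that out-of-core idea (my own illustration, not how any particular renderer works): treat VRAM as a fixed-size cache of scene chunks, page a chunk in from host memory when a ray needs it, and evict the least-recently-used chunk when the budget is full. Every page-in crosses the PCIe bus, which is where lane count starts to matter.

```python
from collections import OrderedDict

class ChunkCache:
    """Fixed "VRAM" budget of scene chunks with LRU eviction (hypothetical)."""

    def __init__(self, budget_chunks):
        self.budget = budget_chunks
        self.resident = OrderedDict()  # chunk_id -> chunk data, LRU order
        self.transfers = 0             # page-ins, i.e. uploads over PCIe

    def fetch(self, chunk_id, host_store):
        if chunk_id in self.resident:          # already in VRAM: cheap hit
            self.resident.move_to_end(chunk_id)
            return self.resident[chunk_id]
        if len(self.resident) >= self.budget:  # full: evict least recently used
            self.resident.popitem(last=False)
        self.transfers += 1                    # this is the PCIe-bound cost
        self.resident[chunk_id] = host_store[chunk_id]
        return self.resident[chunk_id]

# A scene of 8 chunks with room for only 2 resident at once; rays that
# bounce between many chunks keep forcing re-uploads.
scene = {i: f"bvh-chunk-{i}" for i in range(8)}
cache = ChunkCache(budget_chunks=2)
for chunk in [0, 1, 0, 2, 3, 0]:
    cache.fetch(chunk, scene)
print("uploads:", cache.transfers)
```

Real renderers page at a coarser granularity and overlap transfers with tracing, but the access pattern above is the reason demand-paged scenes lean harder on PCIe bandwidth than a scene that fits in VRAM.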
> offline rendering (e.g. Blender Cycles) I think also just uploads the whole scene once and then bounces the rays around…
I'd imagine a workload like that, or other HPC/compute workloads where the data set just doesn't fit in VRAM, would certainly prefer more PCI-E lanes.
I think that'd usually be considered using the wrong hardware for the job but if you're just messing around as part of a hobby you're obviously not buying the $7000 Radeon Pro SSG, either.