PCI devices have always been able to read and write the shared address space (subject to the IOMMU); this is most frequently used for DMA to system RAM, but it isn't limited to that.
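To make that concrete, here's a minimal host-side C sketch of a device DMA-ing to and from pinned system RAM, using the CUDA runtime as just one convenient way to drive a DMA engine. The buffer size and implicit device 0 are arbitrary assumptions.

    /* Sketch: a GPU DMA-ing directly to/from pinned system RAM.
       Build (assumption): nvcc dma_sketch.c -o dma_sketch */
    #include <cuda_runtime.h>
    #include <stdio.h>
    #include <string.h>

    #define N (64u << 20)   /* 64 MiB, arbitrary */

    int main(void) {
        void *host_buf, *dev_buf;

        /* Pinned (page-locked) allocation: gives the device a stable
           physical/IOVA mapping so its DMA engine can target the buffer. */
        if (cudaHostAlloc(&host_buf, N, cudaHostAllocDefault) != cudaSuccess) return 1;
        if (cudaMalloc(&dev_buf, N) != cudaSuccess) return 1;

        memset(host_buf, 0xAB, N);

        /* These copies are performed by the GPU's DMA engine; with an IOMMU
           enabled, the device sees IOVAs that the IOMMU translates and polices. */
        cudaMemcpyAsync(dev_buf, host_buf, N, cudaMemcpyHostToDevice, 0);
        cudaMemcpyAsync(host_buf, dev_buf, N, cudaMemcpyDeviceToHost, 0);
        cudaStreamSynchronize(0);

        printf("round trip done\n");
        cudaFreeHost(host_buf);
        cudaFree(dev_buf);
        return 0;
    }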

So poking around to configure the device to expose the whole VRAM in the address space is reasonable, provided it supports resizable BAR or simply has a fixed BAR that's large enough. And telling one card to read/write an address that happens to map to a different card's VRAM is also reasonable.
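A sketch of that second point, assuming two CUDA-capable cards at device indices 0 and 1 with a P2P path between them (topology-dependent): the runtime's peer-access API lets one card's copy engine read and write the other card's VRAM directly.

    /* Sketch: GPU<->GPU copy over PCIe P2P via the CUDA peer-access API.
       Device indices and buffer size are assumptions. */
    #include <cuda_runtime.h>
    #include <stdio.h>

    #define N (256u << 20)  /* 256 MiB, arbitrary */

    int main(void) {
        int can01 = 0, can10 = 0;
        cudaDeviceCanAccessPeer(&can01, 0, 1);
        cudaDeviceCanAccessPeer(&can10, 1, 0);
        if (!can01 || !can10) { fprintf(stderr, "no P2P path\n"); return 1; }

        void *buf0, *buf1;
        cudaSetDevice(0);
        cudaMalloc(&buf0, N);
        cudaDeviceEnablePeerAccess(1, 0);  /* let device 0 address device 1's memory */
        cudaSetDevice(1);
        cudaMalloc(&buf1, N);
        cudaDeviceEnablePeerAccess(0, 0);

        /* With peer access enabled, this copy moves data card-to-card through
           the PCIe fabric instead of bouncing through system RAM. */
        cudaMemcpyPeer(buf1, 1, buf0, 0, N);
        cudaDeviceSynchronize();

        cudaFree(buf1);
        cudaSetDevice(0);
        cudaFree(buf0);
        return 0;
    }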

I'd be interested to know whether PCIe switching capacity will be a bottleneck, or whether it'll just be the point-to-point links and VRAM that bottleneck. Saving a bounce through system RAM should help in either case, though.
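Rough numbers, all assumptions (~31.5 GB/s usable per direction on an x16 Gen4 link, and a naive non-pipelined bounce through a host buffer):

    /* Back-of-envelope comparison: direct P2P copy under one switch vs. a
       staged bounce through system RAM (copy in, then copy out). */
    #include <stdio.h>

    int main(void) {
        const double link_gbs = 31.5;  /* approx. usable GB/s per direction, x16 Gen4 */
        const double size_gb  = 16.0;  /* e.g. moving a 16 GB VRAM-sized transfer */

        double t_p2p    = size_gb / link_gbs;        /* one x16 hop */
        double t_bounce = 2.0 * size_gb / link_gbs;  /* two sequential hops, plus a
                                                        DRAM write and read on top */

        printf("direct P2P : %.2f s\n", t_p2p);     /* ~0.51 s */
        printf("RAM bounce : %.2f s\n", t_bounce);  /* ~1.02 s */
        return 0;
    }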




Fixed large BAR exists in some older accelerator cards, e.g. IIRC the MI50/MI60 from AMD (the data-center variants of the Radeon VII, the first GPU with PCIe 4.0, and also famous for holding the memory-bandwidth crown until the RTX 40-series took it back: 16GB of HBM2 delivering 1TB/s).

It's notably not compatible with some legacy boot processes and, IIRC, with 32-bit kernels in general, so consumer cards had to wait for resizable BAR to get the benefits of a large BAR. The main benefit is direct flat memory mapping of VRAM, so CPUs and PCIe peers can read and write all of VRAM directly instead of dancing through a command interface with doorbell registers. AFAIK it also lets a GPU talk directly to NICs and NVMe drives by running the driver in GPU code (I'm not sure how/if they let you properly interact with doorbell registers, but a polled io_uring-style ABI would be no problem, and I wouldn't be surprised if some NIC firmware already allows offloading this).
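For illustration, a sketch of what that flat mapping looks like from the CPU side on Linux: with a large enough BAR exposed, root can mmap the BAR through sysfs and issue plain loads and stores into VRAM, no command queue or doorbells involved. The PCI address and BAR index below are made up, and which BAR apertures VRAM varies by device (check lspci -v for the big prefetchable one).

    /* Sketch: mapping a GPU's VRAM BAR through sysfs and poking it directly.
       Path and BAR index are hypothetical; needs root. */
    #include <fcntl.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void) {
        const char *bar_path = "/sys/bus/pci/devices/0000:01:00.0/resource0";
        const size_t map_len = 1u << 20;  /* map the first 1 MiB, arbitrary */

        int fd = open(bar_path, O_RDWR | O_SYNC);
        if (fd < 0) { perror("open"); return 1; }

        volatile uint32_t *vram = mmap(NULL, map_len, PROT_READ | PROT_WRITE,
                                       MAP_SHARED, fd, 0);
        if (vram == MAP_FAILED) { perror("mmap"); return 1; }

        /* Plain CPU loads/stores now go straight over PCIe into VRAM
           (uncached here; the resource0_wc variant gives write-combining). */
        vram[0] = 0xdeadbeef;
        printf("vram[0] = 0x%08x\n", (unsigned)vram[0]);

        munmap((void *)vram, map_len);
        close(fd);
        return 0;
    }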



