This would actually be very cool combined with an M.2 NIC [1]: ITX boards only have one PCIe slot, and their built-in M.2 slots rarely have clearance for a heatsink that tall, so getting both a GPU and 10Gb networking is hard.
This actually seems like a very clever market-segmentation solution, since the GPU was already limited to x8 PCIe lanes (it's a laptop GPU; see https://www.notebookcheck.net/NVIDIA-GeForce-RTX-4060-Laptop...). The 'addition' of the M.2 SSD makes it a unique offering. Limiting it to only one drive is another way to keep the thermal envelope down. Kudos to the Asus design and product development folks.
Most desktop-class boards can only bifurcate an x16 slot into two x8 slots, so it's more about preventing user error: two M.2 slots would likely not function in most cases.
Even on server-class hardware, the split options for a single slot are usually (but not always) x16, 2x x8, or 4x x4; an x8 + 2x x4 option is somewhat unusual.
I've been thinking about this a lot recently: PCIe slots need to be deprecated. We should be using mini-SAS-style breakout cables and shrinking motherboards.
It's particularly a problem with huge top-of-the-line GPUs like the 7900 XTX and 4090. They are so long and so heavy that they sag. To work around it, we add kickstands and brackets at the far end (opposite the external slot) to prop them up. Vertical mounts exist, but they use a very wide ribbon cable, the same width as the slot, that gets in the way of lots of stuff.
Why aren't we innovating here? Big GPUs are so big they often block all the other slots on the board anyway. Manufacturers are shifting PCIe lanes to M.2 on platforms without many lanes to spare. The slots need to go, or remain only for legacy use.
It'll help with things like this too. The 4060 is using a full slot that it doesn't need, so those wasted lanes are now available for an M.2 card. IMHO all of this should be modular, like a less polished USB-C/Thunderbolt interface. Mini-SAS comes to mind, but I know the server market is pushing PCIe over breakout cables.
ATX feels so antiquated as a form factor right now.
Not my field, but high-speed (high-frequency) data transmission cables must overcome several factors: interference caused by the current and voltage of surrounding wires through inductance, and signal degradation due to capacitance, resistance, and other effects within the wire itself. With today's PCIe generations we would be talking about frequencies around 8 GHz, 16 GHz, and 32 GHz for the next generation.
When motherboards are designed, the interference and signal integrity between traces/lanes need to be calculated and balanced to avoid those issues, but I'm not sure that could be done effectively with flexible electrical cables at PCIe frequencies.
But let's say interference and signal integrity were under control. To reduce the number of wires in the cable carrying the PCIe lanes (to avoid something as wide as the classic IDE hard drive ribbon cables, or something rigid), I suspect you would have to serialize the parallel PCIe lanes onto fewer conductors, which means driving the data at wildly higher frequencies.
So, as an alternative, we would be talking about a hybrid of optical data cables plus electrical wires for clocking? But between the latency/synchronization introduced by the optical transceivers (which would need to run at higher frequencies than the PCIe bus), and each manufacturer using better or worse transceivers from different brands, I suspect such cables would introduce a whole new kind of issue flooding the user forums (or maybe not, but I can smell it coming).
The thing is, the above might shrink the motherboard a bit, but not the volume occupied by the target devices (say, the GPU); it would just reposition them somewhere else. So although I do believe technology in general has been stagnant for a few years, in this case, IMHO, the GPUs are the problem: GPU manufacturers aren't innovating at all, hence the bricks.
Back in the day, "InfiniBand" was going to be the successor to things like PCIe, such that you could network together discrete components and allow them to communicate and share their resources among multiple hosts.
InfiniBand never took off in that role, though it's still around as a high-speed networking interconnect (and gets active development in some industries like HPC).
Close: it was pitched as the successor to things like PCI and AGP (i.e. fast/wide parallel buses), not PCIe.
Everyone in the late '90s recognized that parallel was maxed out and that GHz SerDes with embedded clock recovery, adaptive equalization, lane-skew compensation, error detection, etc. was the future. Future I/O (IBM, HP, Compaq, 3Com, Cisco, ...) and NGIO (Sun, Dell, Intel, ...) were competing efforts that eventually merged and were rebranded as InfiniBand. But IB had the usual design-by-committee disease as it tried to shoehorn in networking, I/O, and system-interconnect roles. Intel then bailed on the effort and serialized PCI instead. Intel tried to get back into that game in the 2010s with OmniPath, without success.
Cases in the traditional "desktop" flat style get vertical GPU mounts for free. It's interesting that there's very little activity in that sector now for full-size motherboard support, especially since we've decided we no longer need to design cases around drive bays.
I do think the PCIe-to-M.2 migration is getting a bit silly. My current mainboard has six M.2 slots with four different levels of feature support (one intended for the preinstalled Wi-Fi, one that can be SATA or PCIe 3.0 x2, three PCIe 4.0 x4, one PCIe 5.0 x4). Yet if I said "I want a SCSI or SAS card", no dice, because there are only two x1 slots, and most cards seem to be x4.
The four-slot GPU problem might be more manageable if GPUs with included AIO liquid coolers became more of a norm. That would at least let you move the majority of the cooler's bulk away from the slot area. But usually this is reserved for ultra-expensive enthusiast cards.
I would like to see mini-SAS as well, simply for being a bit more flexible in case design and potentially separating heat-generating components instead of putting everything in the same box.
But the issue with GPU sag is, AFAIK, a different one. JayzTwoCents had a pretty good video a while ago demonstrating how to fix sag on the case itself without one of the little stands. From personal experience, I can say the Gainward Phantom 4090 doesn't sag in a Fractal Torrent, for example.
I'm not a pro on the subject, but does this mean the data still needs to reach the CPU first? What I'm thinking is that with DX12 there's already direct access to SSDs for fast data copies to VRAM, so why couldn't this benefit from losing some hops when reaching for data that's already on board?
Not latency but bandwidth to the GPU matters for asset loading. Can the GPU load assets directly from its own SSD, as the PS5 does, or is this just an SSD the processor can use as a "disk"?
Extra hops on the PCIe link have even less of an impact on bandwidth than on latency.
This product is exposing the SSD directly to the host CPU because that's the only way to make the SSD useful. There's approximately zero software infrastructure for directly accessing an SSD from a GPU; nobody's running an NVMe driver and filesystem code entirely on the GPU, and even if you did, that would be a non-starter in the consumer space because it would effectively require reserving the entire SSD for use by a single application.
It is possible on Linux to have GPU code issue storage requests over io_uring (if the kernel is polling that queue; the GPU cannot directly issue a syscall to make the kernel start checking for new I/O requests). But the request is still handled on the CPU as it passes through the OS filesystem/storage stack before the NVMe SSD is instructed to DMA the requested data directly to/from the GPU's VRAM.
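For concreteness, here's a minimal liburing sketch of the CPU-side half of that path. The file name is hypothetical, the destination is an ordinary host buffer standing in for DMA-mapped VRAM, and error handling is omitted, so treat it as an illustration rather than production code:

    // Build with: g++ demo.cpp -luring
    #include <liburing.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main() {
        struct io_uring ring;
        io_uring_queue_init(8, &ring, 0);           // tiny queue; enough for a demo

        int fd = open("asset.bin", O_RDONLY);       // hypothetical asset file
        void *buf;
        posix_memalign(&buf, 4096, 1 << 20);        // host buffer standing in for VRAM

        struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);
        io_uring_prep_read(sqe, fd, buf, 1 << 20, 0);  // read 1 MiB at offset 0
        io_uring_submit(&ring);                     // from here the kernel walks the
                                                    // FS/storage stack and kicks the NVMe

        struct io_uring_cqe *cqe;
        io_uring_wait_cqe(&ring, &cqe);             // completion posted by the kernel
        printf("read %d bytes\n", cqe->res);
        io_uring_cqe_seen(&ring, cqe);
        io_uring_queue_exit(&ring);
        return 0;
    }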
Microsoft's DirectStorage is (among other things) their effort to enable similar functionality for at least some use cases.
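From the application side, a DirectStorage read that lands straight in a VRAM buffer looks roughly like the sketch below. The file and buffer names are hypothetical, and the completion fence and all error handling are omitted; it's meant to show the shape of the API, not be a drop-in implementation:

    #include <dstorage.h>
    #include <wrl/client.h>
    using Microsoft::WRL::ComPtr;

    // `device` is an existing ID3D12Device*, `gpuBuffer` an ID3D12Resource* in VRAM.
    void LoadAsset(ID3D12Device *device, ID3D12Resource *gpuBuffer, UINT32 size) {
        ComPtr<IDStorageFactory> factory;
        DStorageGetFactory(IID_PPV_ARGS(&factory));

        ComPtr<IDStorageFile> file;
        factory->OpenFile(L"asset.bin", IID_PPV_ARGS(&file));  // hypothetical asset

        DSTORAGE_QUEUE_DESC queueDesc{};
        queueDesc.Capacity   = DSTORAGE_MAX_QUEUE_CAPACITY;
        queueDesc.Priority   = DSTORAGE_PRIORITY_NORMAL;
        queueDesc.SourceType = DSTORAGE_REQUEST_SOURCE_FILE;
        queueDesc.Device     = device;
        ComPtr<IDStorageQueue> queue;
        factory->CreateQueue(&queueDesc, IID_PPV_ARGS(&queue));

        DSTORAGE_REQUEST request{};
        request.Options.SourceType      = DSTORAGE_REQUEST_SOURCE_FILE;
        request.Options.DestinationType = DSTORAGE_REQUEST_DESTINATION_BUFFER;
        request.Source.File.Source      = file.Get();
        request.Source.File.Size        = size;
        request.Destination.Buffer.Resource = gpuBuffer;
        request.Destination.Buffer.Size     = size;
        queue->EnqueueRequest(&request);
        queue->Submit();  // normally you'd EnqueueSignal() a fence to learn when it lands
    }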
That’s unlikely given that there’s effectively no difference in framerates between x8 and x16 slots, even though the bandwidth doubles. Bandwidth is pretty clearly not the bottleneck.
Nvidia generally disables/does not use ReBAR, as it either shows no improvement or degrades performance. AMD may be a different story, but that isn't what this card is.
That's very incorrect. Nvidia requires that every piece of firmware along the way be updated to support it, and won't enable it unless they are. But if they are, there's definitely uplift, depending on the game.
Most motherboards I've looked at in the last 4 years have at least a couple of M.2 slots. I built a new small-form-factor PC in 2020 with AMD Zen 3 in a DAN A4 case, and I haven't filled up the 2 TB of Samsung M.2 storage in my single populated slot. I'm curious what kind of price point this feature makes sense for. An old motherboard without M.2, but new enough to support bifurcation for PCIe 5.0? Doesn't seem like a big window to me, but I'm probably missing something.
Not if you can't use them. The problem with modern PCs is the lack of PCIe lanes. Unless you go with Threadripper, you pretty much get one GPU, two M.2 drives, and one PCIe slot. After that, decisions have to be made about what gets disabled.
I'm no expert here, but isn't this the number of lanes used at one time? If my understanding is correct, then generally more slots should still be better. Sure, maybe you can't write to 30 drives at the same time, but who's actually doing that? You're probably reading from no more than 2 drives __at a time__. So 2 drives at 4 lanes each plus a graphics card at 16 lanes gives you 24. Your $600 AMD Ryzen 9 7950X has 28 lanes, of which 24 are usable. That of course doesn't leave you with extra lanes for other things, but that's not high end. Threadrippers have >80 lanes, so...
Support for PCIe bifurcation on consumer motherboards was really good for a while when PCIe 4.0 first came out, but it has since regressed.
Not many motherboards in the latest generation can do a proper x8+x4+x4 split. They may claim to support it, but then you run into weird issues and have to spend weeks waiting for a BIOS fix. Or sometimes the support is there but so poorly explained in the manual that I had to actually test it by hand to make sure I was reading it correctly.
To be fair, HEDT motherboards are not immune to this type of stupidity. However, they do manage to get around these issues with the sheer number of lanes.
AMD's mainstream desktop platform effectively has four more PCIe lanes coming out of the CPU socket than Intel's. Both then have a variable number of PCIe lanes fanned out from the chipset, sharing that chipset's uplink to the CPU. (Intel offers a range of chipsets with different numbers of lanes enabled, while AMD's more expensive chipset solution is to daisy-chain a second chipset.)
Depends on your internet connection. If you have gigabit fibre to the home, downloading a big AAA game isn't a big deal. If you're on a 30 Mbit cable connection, the delay between starting a download and playing a game is pretty long, so it's nice to be able to keep them on your local storage.
It's a US thing, but it depends on your ISP. I have Spectrum, which doesn't have data caps, but Xfinity/Comcast caps you at 1.2 TB/month, charging $10 per 50 GB after that, or an alternative, more expensive unlimited plan.
Most motherboards have a single dedicated M.2 slot from the CPU, a second one shared with the PCIe x4 slot, and some have a third one from the chipset, usually slower. If you want a 10 Gbps NIC in your PCIe x4 slot, the second M.2 is gone, and if you don't have a third one from the chipset, you're limited to a single M.2; Asus gives you the second one back. That's exactly the situation with one of my desktops. I would pay $10-20 extra for that extra M.2.
That won't work electrically, and it'd have crippling performance problems if it did. Thunderbolt is not that much bandwidth compared to gen4 x16 PCIe, and there are already latency problems to boot.
Do any Thunderbolt eGPU enclosures provide a full x16 slot? The M.2 slot on this card won't work if the card is installed in a slot that only has x4 or x8 lanes provided.
No, they do not. However, it has never been totally clear to me whether the TB-PCIe bridge used in eGPU enclosures provides a true x16 slot with the "x4-equivalent speed" bottleneck sitting in the TB link, or whether it's physically an x16 slot with only x4 wired up. If it's the former, it will work.
Looking through the specs on ark.intel.com, the Thunderbolt controllers that specify PCIe lane counts say at best they support one x4 or four x1 links, so any eGPU enclosure using those would need to connect the Thunderbolt controller to a large and expensive PCIe packet switch in order to provide more than four lanes of any speed to a GPU.
2 TB is more than enough to hold the games I'm currently playing. When I run out of space installing a new game, there are always a few hundred GB of games I don't play anymore that I can delete. I don't hoard any kind of media or personal data on the Windows machine.
This looks to be using PCIe 5.0, which has 2x the bandwidth of PCIe 4.0. Considering an RTX 4090 is still using PCIe 4.0 x16, effectively the same bandwidth as PCIe 5.0 x8, it should be fine.
This solution allows the SSD to run at PCIe gen5 speed, but it does not include a full PCIe switch, so PCIe gen4 x16 from the GPU cannot be repackaged into PCIe gen5 x8 for the trip up to the CPU. That's why this is being introduced on a GPU that only does PCIe gen4 x8.
I wish more motherboards would just have the ability to split the lanes in the first place. There's no real advantage to doing the physical split on the GPU board instead of on the motherboard.
Hm, without a consumer equivalent of GPUDirect Storage (PCIe P2P DMA), this is not as cool as it could be. Data still has to bounce to and from the CPU for no good reason.
Isn't this the one DMA feature actually supported on consumer GPUs?
I know it's not possible for NICs to write data directly into GPU memory unless you use Quadro or data-center cards, but loading data from NVMe drives into the GPU without involving host memory should be possible on all recent GPUs.
Not that I'm aware of: https://developer.nvidia.com/gpudirect-storage is only supported on Linux. Microsoft's DirectStorage isn't GPUDirect Storage, annoyingly enough. No idea what AMD's equivalent is, though their FPGAs would be able to do anything.
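For reference, the Linux-only cuFile path looks roughly like this: the NVMe drive DMAs straight into a cudaMalloc'd buffer with no host bounce. The file name and sizes are made up and error checking is dropped, so read it as a sketch of the API shape under a GDS-capable driver stack:

    // Build with: g++ gds.cpp -lcufile -lcudart (needs the GDS-enabled driver stack)
    #include <cufile.h>
    #include <cuda_runtime.h>
    #include <fcntl.h>

    int main() {
        cuFileDriverOpen();

        int fd = open("weights.bin", O_RDONLY | O_DIRECT);  // hypothetical file
        CUfileDescr_t desc{};
        desc.handle.fd = fd;
        desc.type = CU_FILE_HANDLE_TYPE_OPAQUE_FD;
        CUfileHandle_t handle;
        cuFileHandleRegister(&handle, &desc);

        void *devPtr;
        size_t size = 1 << 20;
        cudaMalloc(&devPtr, size);
        cuFileBufRegister(devPtr, size, 0);                 // pin the VRAM for DMA

        // Read `size` bytes from file offset 0 into VRAM offset 0. If the platform
        // can't do the P2P DMA, cuFile falls back to bouncing through the CPU.
        cuFileRead(handle, devPtr, size, 0, 0);

        cuFileBufDeregister(devPtr);
        cuFileHandleDeregister(handle);
        cuFileDriverClose();
        cudaFree(devPtr);
        return 0;
    }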
I feel like we need to rethink PC architecture entirely. GPUs are so insane now that we almost need a motherboard on top of a motherboard: one for the CPU and one for the GPU, where the GPU is also socketed and can be upgraded. Memory too. It would allow board makers to deliver better power solutions and have less coil whine. Think a Pi HAT bolted on top, with its own RAM slots and whatnot. Or, more realistically, it would go on the back, and the motherboard would move closer to the center of the chassis instead of sitting on one side.

Graphics cards are huge, and having a massive card hanging off a tiny little slot, with a cooler that is oftentimes bigger than the CPU cooler, is just insane to me. We are at the long tail of the perpendicularly mounted card, methinks. It can't really be taken much further. Vertical mounting will soon become a necessity, so I think we need to go back to first principles and rethink from the beginning.
Yes. Video memory runs on a substantially wider bus than system memory; the connectors that'd be required to make it replaceable would be extraordinarily expensive -- to the extent that it'd probably be cheaper to build the cards with their maximum complement of memory to begin with.
I thought it was going to use the SSD as a slower memory tier for the GPU, but this is just literally a regular M.2 SSD slot built to use the spare PCIe lanes from the GPU...
And here I thought I was going to have 1 TB of GPU memory for AI, lmao.
[1] https://www.aliexpress.us/item/3256805824691996.html?spm=a2g...