This would actually be very cool combined with an M.2 NIC [1]: ITX boards only have one PCIe slot, and their built-in M.2 slots rarely have clearance for a heatsink that tall, so getting both a GPU and 10Gb networking is hard.
This actually seems like a very clever market-segmentation solution, since the GPU was already limited to x8 PCIe lanes (it's a laptop GPU; see https://www.notebookcheck.net/NVIDIA-GeForce-RTX-4060-Laptop...). The 'addition' of the M.2 SSD makes it a unique offering. Limiting it to only one drive is another way to keep the thermal envelope down. Kudos to the Asus design and product development folks.
Most desktop-class boards can only bifurcate an x16 slot into two x8 slots, so it's more about preventing user error: two M.2 slots would likely not function in most cases.
Even on server-class hardware, the split options for a single slot are usually (but not always) x16, 2x x8, or 4x x4; an x8 + 2x x4 option is somewhat unusual.
I've been thinking about this a lot recently: PCIe slots need to be deprecated. We should be using mini-SAS-style breakout cables and shrinking motherboards.
It's particularly a problem with huge top-of-the-line GPUs like the 7900 XTX and 4090. They are so long and so heavy that they sag. To work around it, we add kickstands and brackets at the far end (opposite the external slot) to prop them up. Vertical mounts exist, but they use a very wide ribbon cable, the same width as the slot, that gets in the way of lots of stuff.
Why aren't we innovating here? Big GPUs are so big they often block all the other slots on the board anyway. Manufacturers are shifting PCIe lanes to M.2 on platforms without many lanes to spare. The slots need to go, or remain only for legacy use.
It'll help with things like this too. The 4060 is using a full slot that it doesn't need, so those wasted lanes are now available for an M.2 card. IMHO all of this should be modular, like a less polished USB-C/Thunderbolt interface. Mini-SAS comes to mind, but I know the server market is pushing PCIe over breakout cables.
ATX feels so antiquated as a form factor right now.
Not my field, but high-speed (high-frequency) data transmission cables must overcome several factors: interference caused by the current and voltage of surrounding wires through inductance, and signal degradation due to capacitance, resistance, and other effects within the wire itself. With today's PCIe generations we would be talking about frequencies around 8 GHz, 16 GHz, and 32 GHz for the next generation.
When motherboards are designed, the interference and signal integrity between traces/lanes need to be calculated and balanced to avoid those issues, but I'm not sure that could be done effectively with flexible electrical cables at PCIe frequencies.
But let's say interference and signal integrity were under control. To reduce the number of wires in the cable carrying the PCIe lanes (to avoid something as wide as the classic IDE hard drive ribbon cables, or something rigid), I suspect you would have to serialize the parallel PCIe lanes onto fewer conductors, which means driving the data at wildly higher frequencies.
So, as an alternative, we would be talking about a hybrid of optical data cables plus electrical wires for clocking? But between the latency/synchronization introduced by the optical transceivers (which would need to run at higher frequencies than the PCIe bus), and each manufacturer using better or worse transceivers from different brands, I suspect such cables would introduce a whole new kind of issue flooding the user forums (or maybe not, but I can smell it coming).
The thing is, the above might shrink the motherboard a bit, but not the volume occupied by the target devices (say, the GPU); it would just reposition them somewhere else. So although I do believe technology in general has been stagnant for a few years, in this case, IMHO, the GPUs are the problem: GPU manufacturers aren't innovating at all, hence the bricks.
Back in the day, "InfiniBand" was going to be the successor to things like PCIe, such that you could network together discrete components and allow them to communicate and share their resources among multiple hosts.
InfiniBand never took off in that role, though it's still around as a high-speed networking interconnect (and gets active development in some industries like HPC).
Close: it was pitched as the successor to things like PCI and AGP (i.e. fast/wide parallel buses), not PCIe.
Everyone in the late '90s recognized that parallel was maxed out and that GHz SerDes with embedded clock recovery, adaptive equalization, lane-skew compensation, error detection, etc. was the future. Future I/O (IBM, HP, Compaq, 3Com, Cisco, ...) and NGIO (Sun, Dell, Intel, ...) were competing efforts that eventually merged and were rebranded as InfiniBand. But IB had the usual design-by-committee disease as it tried to shoehorn in networking, I/O, and system-interconnect roles. Intel then bailed on the effort and serialized PCI instead. Intel tried to get back into that game in the 2010s with OmniPath, without success.
Cases in the traditional "desktop" flat style get vertical GPU mounts for free. It's interesting that there's very little activity in that sector now for full-size motherboard support, especially since we've decided we no longer need to design cases around drive bays.
I do think the PCIe-to-M.2 migration is getting a bit silly. My current mainboard has six M.2 slots with four different levels of feature support (one intended for the preinstalled Wi-Fi, one that can be SATA or PCIe 3.0 x2, three PCIe 4.0 x4, one PCIe 5.0 x4). Yet if I said "I want a SCSI or SAS card", no dice, because there are only two x1 slots, and most cards seem to be x4.
The four-slot GPU problem might be more manageable if GPUs with included AIO liquid coolers became more of a norm. That would at least let you move the majority of the cooler's bulk away from the slot area. But usually this is reserved for ultra-expensive enthusiast cards.
I would like to see mini-SAS as well, simply for being a bit more flexible in case design and potentially separating heat-generating components instead of putting everything in the same box.
But the issue with GPU sag is, AFAIK, a different one. JayzTwoCents had a pretty good video a while ago demonstrating how to fix sag on the case itself without one of the little stands. From personal experience, I can say the Gainward Phantom 4090 doesn't sag in a Fractal Torrent, for example.
I'm not a pro on the subject, but does this mean the data still needs to reach the CPU first? What I'm thinking is that with DX12 there's already direct access to SSDs for fast data copies to VRAM, so why couldn't this benefit from losing some hops when reaching for data that's already on board?
Not latency but bandwidth to the GPU matters for asset loading. Can the GPU load assets directly from its own SSD, as the PS5 does, or is this just an SSD the processor can use as a "disk"?
Extra hops on the PCIe link have even less of an impact on bandwidth than on latency.
This product is exposing the SSD directly to the host CPU because that's the only way to make the SSD useful. There's approximately zero software infrastructure for directly accessing an SSD from a GPU; nobody's running an NVMe driver and filesystem code entirely on the GPU, and even if you did, that would be a non-starter in the consumer space because it would effectively require reserving the entire SSD for use by a single application.
It is possible on Linux to have GPU code issue storage requests over io_uring (if the kernel is polling that queue; the GPU cannot directly issue a syscall to make the kernel start checking for new I/O requests). But the request is still handled on the CPU as it passes through the OS filesystem/storage stack before the NVMe SSD is instructed to DMA the requested data directly to/from the GPU's VRAM.
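For concreteness, here's a minimal liburing sketch of the CPU-side half of that path. The file name is hypothetical, the destination is an ordinary host buffer standing in for DMA-mapped VRAM, and error handling is omitted, so treat it as an illustration rather than production code:

    // Build with: g++ demo.cpp -luring
    #include <liburing.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main() {
        struct io_uring ring;
        io_uring_queue_init(8, &ring, 0);           // tiny queue; enough for a demo

        int fd = open("asset.bin", O_RDONLY);       // hypothetical asset file
        void *buf;
        posix_memalign(&buf, 4096, 1 << 20);        // host buffer standing in for VRAM

        struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);
        io_uring_prep_read(sqe, fd, buf, 1 << 20, 0);  // read 1 MiB at offset 0
        io_uring_submit(&ring);                     // from here the kernel walks the
                                                    // FS/storage stack and kicks the NVMe

        struct io_uring_cqe *cqe;
        io_uring_wait_cqe(&ring, &cqe);             // completion posted by the kernel
        printf("read %d bytes\n", cqe->res);
        io_uring_cqe_seen(&ring, cqe);
        io_uring_queue_exit(&ring);
        return 0;
    }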
Microsoft's DirectStorage is (among other things) their effort to enable similar functionality for at least some use cases.
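From the application side, a DirectStorage read that lands straight in a VRAM buffer looks roughly like the sketch below. The file and buffer names are hypothetical, and the completion fence and all error handling are omitted; it's meant to show the shape of the API, not be a drop-in implementation:

    #include <dstorage.h>
    #include <wrl/client.h>
    using Microsoft::WRL::ComPtr;

    // `device` is an existing ID3D12Device*, `gpuBuffer` an ID3D12Resource* in VRAM.
    void LoadAsset(ID3D12Device *device, ID3D12Resource *gpuBuffer, UINT32 size) {
        ComPtr<IDStorageFactory> factory;
        DStorageGetFactory(IID_PPV_ARGS(&factory));

        ComPtr<IDStorageFile> file;
        factory->OpenFile(L"asset.bin", IID_PPV_ARGS(&file));  // hypothetical asset

        DSTORAGE_QUEUE_DESC queueDesc{};
        queueDesc.Capacity   = DSTORAGE_MAX_QUEUE_CAPACITY;
        queueDesc.Priority   = DSTORAGE_PRIORITY_NORMAL;
        queueDesc.SourceType = DSTORAGE_REQUEST_SOURCE_FILE;
        queueDesc.Device     = device;
        ComPtr<IDStorageQueue> queue;
        factory->CreateQueue(&queueDesc, IID_PPV_ARGS(&queue));

        DSTORAGE_REQUEST request{};
        request.Options.SourceType      = DSTORAGE_REQUEST_SOURCE_FILE;
        request.Options.DestinationType = DSTORAGE_REQUEST_DESTINATION_BUFFER;
        request.Source.File.Source      = file.Get();
        request.Source.File.Size        = size;
        request.Destination.Buffer.Resource = gpuBuffer;
        request.Destination.Buffer.Size     = size;
        queue->EnqueueRequest(&request);
        queue->Submit();  // normally you'd EnqueueSignal() a fence to learn when it lands
    }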
That’s unlikely given that there’s effectively no difference in framerates between x8 and x16 slots, even though the bandwidth doubles. Bandwidth is pretty clearly not the bottleneck.
Nvidia generally disables/does not use ReBAR, as it either shows no improvement or degrades performance. AMD may be a different story, but that isn't what this card is.
That's very incorrect. Nvidia requires that every piece of firmware along the way be updated to support it, and won't enable it unless they are. But if they are, there's definitely uplift, depending on the game.
Most motherboards I've looked at in the last 4 years have at least a couple of M.2 slots. I built a new small-form-factor PC in 2020 with AMD Zen 3 in a DAN A4 case, and I haven't filled up the 2 TB of Samsung M.2 storage in my single populated slot. I'm curious what kind of price point this feature makes sense for. An old motherboard without M.2, but new enough to support bifurcation for PCIe 5.0? Doesn't seem like a big window to me, but I'm probably missing something.
Not if you can't use them. The problem with modern PCs is the lack of PCIe lanes. Unless you go with Threadripper, you pretty much get one GPU, two M.2 drives, and one PCIe slot. After that, decisions have to be made about what gets disabled.
I'm no expert here, but isn't this the number of lanes used at one time? If my understanding is correct, then generally more slots should still be better. Sure, maybe you can't write to 30 drives at the same time, but who's actually doing that? You're probably reading from no more than 2 drives __at a time__. So 2 drives at 4 lanes each plus a graphics card at 16 lanes gives you 24. Your $600 AMD Ryzen 9 7950X has 28 lanes, of which 24 are usable. That of course doesn't leave you with extra lanes for other things, but that's not high end. Threadrippers have >80 lanes, so...
Support for PCIe bifurcation on consumer motherboards was really good for a while when PCIe 4.0 first came out, but it has since regressed.
Not many motherboards in the latest generation can do a proper x8+x4+x4 split. They may claim to support it, but then you run into weird issues and have to spend weeks waiting for a BIOS fix. Or sometimes the support is there but so poorly explained in the manual that I had to actually test it by hand to make sure I was reading it correctly.
To be fair, HEDT motherboards are not immune to this type of stupidity. However, they do manage to get around these issues with the sheer number of lanes.
AMD's mainstream desktop platform effectively has four more PCIe lanes coming out of the CPU socket than Intel's. Both then have a variable number of PCIe lanes fanned out from the chipset, sharing that chipset's uplink to the CPU. (Intel offers a range of chipsets with different numbers of lanes enabled, while AMD's more expensive chipset solution is to daisy-chain a second chipset.)
Depends on your internet connection. If you have gigabit fibre to the home, downloading a big AAA game isn't a big deal. If you're on a 30 Mbit cable connection, the delay between starting a download and playing a game is pretty long, so it's nice to be able to keep them on your local storage.
It's a US thing, but it depends on your ISP. I have Spectrum, which doesn't have data caps, but Xfinity/Comcast caps you at 1.2 TB/month, charging $10 per 50 GB after that, or an alternative, more expensive unlimited plan.
Most motherboards have a single dedicated M.2 slot from the CPU, a second one shared with the PCIe x4 slot, and some have a third one from the chipset, usually slower. If you want a 10 Gbps NIC in your PCIe x4 slot, the second M.2 is gone, and if you don't have a third one from the chipset, you're limited to a single M.2; Asus gives you the second one back. That's exactly the situation with one of my desktops. I would pay $10-20 extra for that extra M.2.
That won't work electrically, and it'd have crippling performance problems if it did. Thunderbolt is not that much bandwidth compared to gen4 x16 PCIe, and there are already latency problems to boot.
Do any Thunderbolt eGPU enclosures provide a full x16 slot? The M.2 slot on this card won't work if the card is installed in a slot that only has x4 or x8 lanes provided.
No, they do not. However, it has never been totally clear to me whether the TB-PCIe bridge used in eGPU enclosures provides a true x16 slot with the "x4-equivalent speed" bottleneck sitting in the TB link, or whether it's physically an x16 slot with only x4 wired up. If it's the former, it will work.
Looking through the specs on ark.intel.com, the Thunderbolt controllers that specify PCIe lane counts say at best they support one x4 or four x1 links, so any eGPU enclosure using those would need to connect the Thunderbolt controller to a large and expensive PCIe packet switch in order to provide more than four lanes of any speed to a GPU.
2 TB is more than enough to hold the games I'm currently playing. When I run out of space installing a new game, there are always a few hundred GB of games I don't play anymore that I can delete. I don't hoard any kind of media or personal data on the Windows machine.
This looks to be using PCIe 5.0, which has 2x the bandwidth of PCIe 4.0. Considering an RTX 4090 is still using PCIe 4.0 x16, effectively the same bandwidth as PCIe 5.0 x8, it should be fine.
This solution allows the SSD to run at PCIe gen5 speed, but it does not include a full PCIe switch, so PCIe gen4 x16 from the GPU cannot be repackaged into PCIe gen5 x8 for the trip up to the CPU. That's why this is being introduced on a GPU that only does PCIe gen4 x8.
I wish more motherboards would just have the ability to split the lanes in the first place. There's no real advantage to doing the physical split on the GPU board instead of on the motherboard.
Hm, without a consumer equivalent of GPUDirect Storage (PCIe P2P DMA), this is not as cool as it could be. Data still has to bounce to and from the CPU for no good reason.
Isn't this the one DMA feature actually supported on consumer GPUs?
I know it's not possible for NICs to write data directly into GPU memory unless you use Quadro or data-center cards, but loading data from NVMe drives into the GPU without involving host memory should be possible on all recent GPUs.
Not that I'm aware of: https://developer.nvidia.com/gpudirect-storage is only supported on Linux. Microsoft's DirectStorage isn't GPUDirect Storage, annoyingly enough. No idea what AMD's equivalent is, though their FPGAs would be able to do anything.
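For reference, the Linux-only cuFile path looks roughly like this: the NVMe drive DMAs straight into a cudaMalloc'd buffer with no host bounce. The file name and sizes are made up and error checking is dropped, so read it as a sketch of the API shape under a GDS-capable driver stack:

    // Build with: g++ gds.cpp -lcufile -lcudart (needs the GDS-enabled driver stack)
    #include <cufile.h>
    #include <cuda_runtime.h>
    #include <fcntl.h>

    int main() {
        cuFileDriverOpen();

        int fd = open("weights.bin", O_RDONLY | O_DIRECT);  // hypothetical file
        CUfileDescr_t desc{};
        desc.handle.fd = fd;
        desc.type = CU_FILE_HANDLE_TYPE_OPAQUE_FD;
        CUfileHandle_t handle;
        cuFileHandleRegister(&handle, &desc);

        void *devPtr;
        size_t size = 1 << 20;
        cudaMalloc(&devPtr, size);
        cuFileBufRegister(devPtr, size, 0);                 // pin the VRAM for DMA

        // Read `size` bytes from file offset 0 into VRAM offset 0. If the platform
        // can't do the P2P DMA, cuFile falls back to bouncing through the CPU.
        cuFileRead(handle, devPtr, size, 0, 0);

        cuFileBufDeregister(devPtr);
        cuFileHandleDeregister(handle);
        cuFileDriverClose();
        cudaFree(devPtr);
        return 0;
    }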
I feel like we need to rethink PC architecture entirely. GPUs are so insane now that we almost need a motherboard on top of a motherboard: one for the CPU and one for the GPU, where the GPU is also socketed and can be upgraded. Memory too. It would allow board makers to deliver better power solutions and have less coil whine. Think a Pi HAT bolted on top, with its own RAM slots and whatnot. Or, more realistically, it would go on the back, and the motherboard would move closer to the center of the chassis instead of sitting on one side.

Graphics cards are huge, and having a massive card hanging off a tiny little slot, with a cooler that is oftentimes bigger than the CPU cooler, is just insane to me. We are at the long tail of the perpendicularly mounted card, methinks. It can't really be taken much further. Vertical mounting will soon become a necessity, so I think we need to go back to first principles and rethink from the beginning.
Yes. Video memory runs on a substantially wider bus than system memory; the connectors that'd be required to make it replaceable would be extraordinarily expensive -- to the extent that it'd probably be cheaper to build the cards with their maximum complement of memory to begin with.
I thought it was going to use the SSD as a slower memory tier for the GPU, but this is just literally a regular M.2 SSD slot built to use the spare PCIe lanes from the GPU...
And here I thought I was going to have 1 TB of GPU memory for AI, lmao.
[1] https://www.aliexpress.us/item/3256805824691996.html?spm=a2g...