Hacker News

> support for more than half a terabyte of unified memory

Soldered?



Is there a single Apple SoC where they’ve provided removable ram? Not that I can recall.


Is there even an existing replaceable memory standard that would meet the current needs of Apple's "Unified Memory" architecture? I'm not an expert but I'd suspect probably not. The bus probably looks a lot more like VRAM on GPUs, and I've never seen a GPU with replaceable RAM.


CAMM2 could kinda work, but each module is only 128-bit, so I think the furthest you could possibly push it is a 512-bit M Max equivalent with CAMM2 modules north, east, west and south of the SoC. There just isn't room to put eight modules right next to the SoC for a 1024-bit bus like the M Ultra.
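The module count above is just bus width divided by module width. A minimal sketch of that arithmetic, assuming the 128-bit-per-module figure from the comment (the function name is made up for illustration):

```python
import math

# Per-module width as stated in the comment above (an assumption of this sketch).
CAMM2_MODULE_BITS = 128

def modules_needed(bus_bits, module_bits=CAMM2_MODULE_BITS):
    """Number of modules required to populate a memory bus of the given width."""
    return math.ceil(bus_bits / module_bits)

# Bus widths mentioned in the thread: M Max class and M Ultra class.
for bus_bits in (512, 1024):
    print(f"{bus_bits}-bit bus -> {modules_needed(bus_bits)} modules")
```

Four modules fit one per side of the SoC; eight would not, which is the layout problem described above.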


Framework said that when they built a Strix Halo machine, AMD assigned an engineer to work with them on seeing if there's a way to get CAMM2 memory working with it, and after a bunch of back and forth it was decided that CAMM2 still made the traces too long to maintain proper signal integrity due to the 256 bit interface.

These machines have a 512 bit interface, so presumably even worse.


Current (individual, not counting dual socketed) AMD Epyc CPUs have 576 GB/s over a 768 bit bus using socketed DIMMs.


My understanding is that this works out due to the lower clock speeds of those RAM modules though, right?

It's getting that bandwidth by going very wide on very, very many channels, rather than trying to push a gigantic amount of bandwidth through only a few channels.


Yeah, "channels" are just a roundabout way to say "wider bus" and you can't get too much past 128 GB/s of memory bandwidth without leaning heavily into a very wide bus (i.e. more than the "standard" 128 bit we're used to on consumer x86) regardless of who's making the chip. Looking at it from the bus width perspective:

- The AI Max+ 395 is a 256 bit bus ("4 channels") of 8000 MHz instead of 128 bits ("2 channels") of 16000 MHz because you can't practically get past 9000 MHz in a consumer device, even if you solder the RAM, at the moment. Max capacity 128 GB.

- 5th Gen Epyc is a 768 bit bus ("12 channels") of 6000 MHz because that lets you use a standard socketed setup. Max capacity 6 TB.

- M3 Ultra is a 1024 bit bus ("16 channels") of "~6266 MHz" as it's 2x the M3 Max (which is 512 bits wide) and we know the final bandwidth is ~800 GB/s. Max capacity 512 GB.

Note: "Channels" is in quotes because the number of bits per channel isn't actually the same per platform (and DDR5 is actually 2x32 bit channels per DIMM instead of 1x64 per DIMM like older DDR... this kind of shit is why just looking at the actual bit width is easier :p).

So really the frequencies aren't that different even though these are completely different products across completely different segments. The overwhelming factor is bus width (channels) and the rest is more or less design choice noise from the perspective of raw performance.
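For anyone who wants to check the numbers, here's a rough sketch of the arithmetic: peak bandwidth is bus width in bytes times transfer rate. It assumes the "MHz" figures quoted above are effective transfer rates (MT/s), and it gives theoretical peaks only, not measured throughput. The function and dictionary names are made up for illustration.

```python
def peak_bandwidth_gbs(bus_bits, mega_transfers_per_s):
    """Theoretical peak bandwidth in GB/s for a bus of bus_bits width.

    Assumes one transfer moves bus_bits / 8 bytes, and that the rate
    is given in millions of transfers per second (MT/s).
    """
    bytes_per_transfer = bus_bits / 8
    return bytes_per_transfer * mega_transfers_per_s / 1000

# (bus width in bits, transfer rate) as quoted in the comment above.
platforms = {
    "AI Max+ 395": (256, 8000),
    "5th Gen Epyc": (768, 6000),
    "M3 Ultra": (1024, 6266),
}

for name, (bits, rate) in platforms.items():
    print(f"{name}: {bits}-bit @ {rate} -> {peak_bandwidth_gbs(bits, rate):.0f} GB/s")
```

The Epyc row reproduces the 576 GB/s figure mentioned upthread, and a 128-bit bus at 8000 comes out to the 128 GB/s ceiling mentioned at the top of the comment.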


Yeah, but AMD's memory controllers are really finicky. That might have been more of a Strix Halo issue than a CAMM2 issue.


Entirely possible. Obviously Apple wouldn't have been interested in letting you upgrade the RAM even if it was doable.

I'd love to have more points of comparison available, but Strix Halo is the most analogous chip to an M-series chip on the market right now from a memory point of view, so it's hard to really know anything.

I very much hope CAMM2 or something else can be made to work with a Strix-like setup in the future, but I have my doubts.


I thought so too when they launched the M1, but I soon got corrected.

The memory bus is the same as for modules, it's just very short. The higher end SoCs have more memory bandwidth because the bus is wider (i.e. more modules in parallel).

You could blame DDR5 (who thought having a speed negotiation that can go over a minute at boot is a good idea?), but I blame the obsession with thin and the ability to overcharge your customers.

> I've never seen a GPU with replaceable RAM

I still have one :) It's an ISA Trident TVGA 8900 that I personally upgraded from 512k VRAM to one full megabyte!!!


It's really unfortunate that GPUs aren't fully customizable daughterboards, isn't it?


It's not soldered, it's _on the package_ with the SoC.


It is _not_ on die. It's soldered onto the package.

There's a good reason it's soldered, i.e. the wide memory interface and huge bandwidth mean that the extra trace lengths needed for an upgradable RAM slot would screw up the memory timings too much, but there's no need to make false claims like saying it's on-die.


> RAM slot would screw up the memory timings

Existing ones possibly but why not build something that lets you snap-in a BGA package just like we snap in CPUs on full sized PC mainboards?


The longer traces are the problem. They want these modules as physically close as possible to the CPU to make the timings work out and maintain signal integrity.

It's the same reason nobody sells GPUs that have user upgradable non-soldered GDDR VRAM modules.


Probably on package at best


Right, yes, sorry for imprecise language!


Thanks for clarifying


Not even Framework has escaped from soldered RAM for this kind of thing.


As are all Apple M devices.


Soldered?

Figure out a way to make it unified without also soldering it, and you'll be a billionaire.

Or are you just grinding a tired, 20-year-old axe?


_That_, in itself, wouldn't be that difficult, and there are shared-memory setups that do use modular memory. Where you'd really run into trouble is making it _fast_; this is very, very high bandwidth memory.


Like all Intel/AMD integrated graphics that use the system's RAM as VRAM?


You know that memory can be "easily" de-soldered and soldered at home?

The issue is availability of chips and most likely you have to know which components to change so the new memory is recognised. For instance that could be changing a resistor to different value or bridging certain pads.


This viewpoint is interesting. It is not exactly inaccurate, but it does appear to be missing a point. Soldering in itself is a valuable and useful skill, but I can't say you can just get in and start de-soldering willy-nilly as opposed to opening a box and upgrading RAM by plopping stuff in a designated spot.

What if both are an issue?


Do you know that "plopping stuff in a designated spot" can also be out of reach for some people? I know plenty who would give their computer to a tech to do the upgrade for them even if they were shown in person how to do all the steps. Soldering is just one step (albeit a fairly big one) above that. But the fact that this can be done at home with fairly inexpensive tools means a tech person with reasonable skill could do it, so such an upgrade could be offered in a computer/phone repair shop if the parts were available. What I am trying to say is that soldering is not a barrier.



