
Getting 64 GB of VRAM for $3.5k is a lot cheaper than buying the equivalent Nvidia discrete GPUs.


Maybe Intel will start offering cheap, high-capacity Arc dGPUs as a power play? That would certainly be disruptive.

But yeah, AMD/Nvidia are never going to offer huge memory pools affordably on dGPUs.


It's also interesting that this opens up full saturation of Apple Silicon (minus the ANE): GGML can run on the CPU, using NEON and AMX, while another instance runs via Metal on the GPU using MLC/dawn. Though the two can't currently share the same memory.


The GPU's energy cost per ML task is so much lower that you'd probably get better performance just running everything on the GPU.

I think some repos have tried splitting things up between the NPU and GPU as well, but they didn't get good performance out of that combination? Not sure why, as the NPU is very low power.


This was a really insightful explanation, thanks.

I have been wanting to get a beefier Mac Studio/Mini M2 the more I see packages tweaked specifically for Apple Silicon.


You can get it for a lot less from https://frame.work

But 64 GB of system RAM is not the same as 64 GB of dedicated GPU memory; apples and oranges.


Where does Framework offer 64 GB of VRAM? By VRAM I am referring to GPU RAM, yes.


Technically any newish laptop with 64GB of RAM has 64GB of "VRAM," but right now the Apple M series and AMD 7000 series are the only IGPs with any significant ML power.


I’m not sure what you mean. Typically, an iGPU slices off part of RAM for the GPU at boot time, which means it’s fixed and not shared. When did this change?


For Intel: per the chart under "What is the maximum amount of graphics memory or video memory my computer can use?" and the discussion under "Will adding more physical memory increase my graphics memory amount?" at https://www.intel.com/content/www/us/en/support/articles/000..., the iGPUs in 5th-gen (Broadwell) processors were their first to do so, back in 2014.


It's fixed at boot, but (on newer IGPs) it can grow beyond the initial capacity.


Full unified memory came 10-ish years ago (also powering the PS4), but I think the hardware ability to adjust iGPU memory without rebooting predates that; Intel seems to have called it DVMT.


I was wondering about the integrated GPU in my desktop Ryzen 7900X.

I can find very little about it (or about other 7000-series integrated GPUs). Is it usable at all for running LLaMA in some way?


Doesn't it all boil down to bandwidth?

AMD's IGPs are way less attractive because they use relatively slow DDR4/5 memory, while the M2 has blazing-fast memory integrated into the package.

We're talking about 50 GB/s vs 400 GB/s. Nvidia's A100 has roughly 1500-2000 GB/s.

Memory bandwidth is usually the bottleneck in GPU performance as many kernels are memory-bound (look up the roofline performance model).
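A back-of-the-envelope sketch of why bandwidth dominates LLM inference (model size and bandwidth figures are illustrative assumptions, not measurements): during autoregressive decoding, each generated token has to stream roughly all the model weights through the memory bus, so peak bandwidth sets a hard ceiling on tokens per second.

```python
def tokens_per_second(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Bandwidth-bound ceiling on decode speed, assuming every token
    requires reading all model weights once."""
    return bandwidth_gb_s / model_size_gb

# A 7B-parameter model at 4-bit quantization is ~4 GB of weights (assumption).
model_gb = 4.0

for name, bw in [("dual-channel DDR5", 50), ("wide unified memory", 400)]:
    print(f"{name}: ~{tokens_per_second(bw, model_gb):.0f} tokens/s ceiling")
```

Real throughput lands below this ceiling (compute, cache, and KV-cache reads all cost something), but the ratio between systems tracks the bandwidth ratio closely.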


The AMD 6000 series has 128-bit LPDDR5 as an option; the 7000 series has LPDDR5X. This is similar to the M1/M2.

The Pro/Max have double/quadruple that bus width. But they are much bigger/more expensive chips.
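The bus-width arithmetic behind those bandwidth numbers is simple; a sketch assuming LPDDR5-6400 (6400 MT/s) across all parts, which is an illustrative simplification:

```python
def bandwidth_gb_s(bus_width_bits: int, mega_transfers_s: int) -> float:
    """Peak bandwidth = (bus width in bytes) * (transfer rate)."""
    return bus_width_bits / 8 * mega_transfers_s / 1000

print(bandwidth_gb_s(128, 6400))  # 128-bit bus (base chip / AMD IGP class): 102.4 GB/s
print(bandwidth_gb_s(256, 6400))  # double-width "Pro"-class bus: 204.8 GB/s
print(bandwidth_gb_s(512, 6400))  # quadruple-width "Max"-class bus: 409.6 GB/s
```

Doubling the bus width doubles peak bandwidth at the same transfer rate, which is exactly why the Pro/Max parts are so much bigger and more expensive: wide memory interfaces cost die area and package pins.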



