
One 3090 seems to be equivalent to one M3 Max at inference: https://www.reddit.com/r/LocalLLaMA/s/BaoKxHj8ww

There are many such threads on Reddit. The M4 Max is incrementally faster, maybe 20%. Even factoring in electricity costs, a 2x 3090 setup is IMO the sweet spot cost/benefit-wise.

And it’s maybe a zany line of argumentation, but 2x 3090s draw roughly 10x the power of an M4 Max (about 2 × 350 W vs. something on the order of 70 W under GPU load). The M4 Max may be the most efficient setup out there, but it’s not nearly 10x as efficient per watt, so that power gap is IMO where its compute deficit comes from.



What is the GPU memory on that 3090?


24GB of VRAM each. Using multiple cards scales well because the model can be split by layers and run in a pipelined fashion (sketch below).
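
A minimal sketch of the idea in PyTorch, using a toy stack of transformer blocks rather than any real model (the dimensions and split point are illustrative). Each half of the layer stack lives on one GPU, and only the activations cross the bus between them:

    import torch
    import torch.nn as nn

    # Toy stand-in for an LLM: a stack of transformer blocks.
    d_model, n_layers = 512, 8
    layers = [nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
              for _ in range(n_layers)]

    # Split by layers: first half on GPU 0, second half on GPU 1.
    first = nn.Sequential(*layers[:n_layers // 2]).to("cuda:0")
    second = nn.Sequential(*layers[n_layers // 2:]).to("cuda:1")

    @torch.no_grad()
    def forward(x):
        x = first(x.to("cuda:0"))
        # Only the activations hop across PCIe, once per batch/token step,
        # which is why layer-wise splitting scales well for inference.
        x = second(x.to("cuda:1"))
        return x

    out = forward(torch.randn(1, 16, d_model))  # (batch, seq, d_model)
    print(out.shape)

In practice you wouldn't hand-roll this: llama.cpp's --tensor-split and Hugging Face's device_map="auto" handle the placement across cards for you.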



