Yeah, I’m still debating whether to go with a Mac Studio with the RAM maxed out (approx $7500 for 192 GB) or a PC with a 4090. Is there a better value path with the Nvidia A series or something else? (I’m not sure about tinygrad)
I have an M1 Max with 64GB and a 3090 Ti. The M1 Max is ~4x slower at inference than the 3090 for the same models (i.e. 7 t/s vs 30 t/s), which depending on the task can be very annoying. As a plus you get to run really large models, albeit very slowly. Think about whether that will bother you or not. I will not give up my 3090 Ti and am instead waiting for the 5090 to see what it can do, because when programming the Mac is too slow to shoot off quick questions. I mostly use the Mac to better understand book topics now, and the 3090 Ti for fast chat sessions.
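If you want to reproduce this kind of tokens/sec comparison on your own hardware, here is a minimal sketch using the llama-cpp-python bindings, which run on both Metal (Mac) and CUDA (3090/4090) builds. The model path, prompt, and token counts are placeholders, not anything from the parent comment:

    # Rough tokens/sec benchmark with llama-cpp-python.
    # Works on Metal or CUDA builds; model path and prompt are placeholders.
    import time
    from llama_cpp import Llama

    llm = Llama(
        model_path="models/llama-2-13b.Q4_K_M.gguf",  # placeholder GGUF file
        n_gpu_layers=-1,   # offload all layers to the GPU (Metal or CUDA)
        n_ctx=2048,
        verbose=False,
    )

    prompt = "Explain the difference between a mutex and a semaphore."
    start = time.time()
    out = llm(prompt, max_tokens=256)
    elapsed = time.time() - start

    n_generated = out["usage"]["completion_tokens"]
    print(f"{n_generated} tokens in {elapsed:.1f}s -> {n_generated / elapsed:.1f} t/s")

Running the same quantized model and prompt on both machines gives a fair apples-to-apples throughput number.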
You can get a previous-gen RTX A6000 with 48GB of GDDR6 for about $5000 (1). Disclosure: I run that website. Is anyone using the pro cards for inference?