
You can't run DeepSeek-V3/R1 on the RTX Pro 6000, not to mention the upcoming 1-million-context Qwen models, or the current Qwen3-235B.


I can run full DeepSeek R1 on an M1 Max with 64 GB of RAM. Around 0.5 t/s with a small quant. A Q4 quant of Maverick (253 GB) runs at 2.3 t/s on it (no GPU offload).

Practically, a last-gen or even ES/QS EPYC or Xeon (with AMX), enough RAM to fill all 8 or 12 channels, plus fast storage (4 Gen5 NVMe drives are almost 60 GB/s) looks, on paper at least, like the cheapest way to run these huge MoE models at hobbyist speeds.
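To see why memory channels and NVMe bandwidth are the numbers that matter here, a back-of-envelope sketch. The figures are assumptions, not from the thread: DeepSeek-R1 activates roughly 37B parameters per token, and a ~Q4 quant averages around 0.56 bytes per parameter, so each generated token reads on the order of 20 GB of weights.

```python
# Back-of-envelope: token generation on a bandwidth-bound system.
# Assumed numbers (not from the thread): ~37B active params per token
# for DeepSeek-R1, ~0.56 bytes/param at a ~Q4 quant.

ACTIVE_PARAMS = 37e9        # active (routed) params per token, assumption
BYTES_PER_PARAM = 0.56      # ~Q4 quantization, assumption

def tokens_per_sec(bandwidth_gbs: float) -> float:
    """Upper bound: every active weight is read once per token."""
    bytes_per_token = ACTIVE_PARAMS * BYTES_PER_PARAM  # ~20.7 GB
    return bandwidth_gbs * 1e9 / bytes_per_token

# 12 channels of DDR5-4800: 12 * 4800 MT/s * 8 B ~= 460.8 GB/s
print(f"12ch DDR5-4800: {tokens_per_sec(460.8):.1f} t/s ceiling")
# 4 Gen5 NVMe drives at ~15 GB/s each ~= 60 GB/s
print(f"4x Gen5 NVMe:   {tokens_per_sec(60.0):.1f} t/s ceiling")
```

That gives roughly a 22 t/s ceiling from 12-channel RAM and roughly 3 t/s from NVMe, which is why "hobbyist speeds" is about right: the model is so sparse per token that bandwidth, not compute, sets the limit.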


If you're talking about DeepSeek R1 with llama.cpp and mmap, then at this point you can run DeepSeek R1 on a Raspberry Pi Zero with a 256 GB microSD card and a phone charger. The only metric left to know is one's patience.
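The patience can be put in rough numbers. A sketch under assumed figures (neither is from the thread): ~20.7 GB of active weights read per token (37B active params at ~Q4), and ~40 MB/s sequential read from a microSD card, since with mmap the weights stream from storage once the working set exceeds RAM.

```python
# How patient? With llama.cpp's mmap'd weights, each token re-reads the
# active experts from storage when they don't fit in RAM.
# Assumed numbers: ~20.7 GB read per token, ~40 MB/s microSD throughput.

BYTES_PER_TOKEN = 37e9 * 0.56   # ~20.7 GB per token, assumption
SD_READ_BPS = 40e6              # ~40 MB/s microSD sequential read, assumption

seconds_per_token = BYTES_PER_TOKEN / SD_READ_BPS
print(f"~{seconds_per_token / 60:.0f} minutes per token")
```

So very roughly on the order of minutes per token: it runs, in the same sense that a glacier moves.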



