
The smallest quantized version of the large MoE model on ollama is 143GB:

https://ollama.com/library/qwen3:235b-a22b-q4_K_M

Is there a smaller one?



Running the 3-bit quant of https://huggingface.co/unsloth/Qwen3-235B-A22B-GGUF now on a 128 GB MacBook.


Smaller quantizations are possible [1], but I think you're right in that you wouldn't want to run anything substantially smaller than 128 GB. Single-GPU on 1x H200 (141 GB) might be feasible though (if you have some of those lying around...)
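The sizes roughly follow from simple arithmetic: the k-quant schemes average a bit more than their nominal width per weight, so Q4_K_M at roughly 4.85 bits/weight puts 235B parameters near 142 GB, while a ~3.44-bit Q3_K_M lands around 101 GB, leaving headroom for KV cache on a 128 GB machine. A quick sketch (the bits-per-weight figures are approximate averages, not exact GGUF file sizes, which also include metadata and mixed-precision layers):

```python
# Back-of-envelope estimate of quantized weight storage.
# Bits-per-weight values are approximate averages for llama.cpp
# k-quants, not exact file sizes.

def weight_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB at a given quantization width."""
    bytes_total = params_billions * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9

q4 = weight_size_gb(235, 4.85)  # Q4_K_M: ~4.85 bits/weight on average
q3 = weight_size_gb(235, 3.44)  # Q3_K_M: ~3.44 bits/weight on average

print(f"~{q4:.0f} GB at Q4_K_M, ~{q3:.0f} GB at Q3_K_M")
```

The ~142 GB figure matches the 143 GB ollama listing above, and the ~101 GB Q3 estimate explains why the 3-bit quant fits on a 128 GB Mac (or a single 141 GB H200) with room left for context.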

[1] - https://huggingface.co/unsloth/Qwen3-235B-A22B-GGUF/tree/mai...



