8 GB of RAM with local LLMs in general is iffy: an 8-bit quantized Qwen3-4B is 4.2 GB on disk and takes even more once loaded, since the KV cache and runtime overhead come on top of the weights. 16 GB is usually the minimum for running decent models without resorting to heavy quantization.
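For a rough sense of where those numbers come from, here's a back-of-envelope estimate. The bits-per-weight and the model shape below are approximations I'm assuming for a Qwen3-4B-class model, not exact specs:

    # Rough memory math for a quantized 4B-parameter model.
    # Bits/weight and model shape are assumed, not exact Qwen3-4B specs.
    def weight_gb(n_params, bits_per_weight):
        return n_params * bits_per_weight / 8 / 1e9

    def kv_cache_gb(n_layers, n_kv_heads, head_dim, ctx_len, bytes_per_elem=2):
        # K and V tensors per layer, fp16 elements
        return 2 * n_layers * ctx_len * n_kv_heads * head_dim * bytes_per_elem / 1e9

    print(f"~8-bit weights : {weight_gb(4.0e9, 8.5):.1f} GB")          # ~4.2 GB
    print(f"Q4_K_M weights : {weight_gb(4.0e9, 4.85):.1f} GB")         # ~2.4 GB
    print(f"KV cache (8k)  : {kv_cache_gb(36, 8, 128, 8192):.1f} GB")  # ~1.2 GB

Add the OS, a browser, and whatever else is open, and an 8 GB machine doesn't leave much headroom even at 4-bit.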
I've heard good things about how macOS handles memory relative to other operating systems, but Linux and Windows both have memory compression nowadays. So the claim isn't that memory compression makes your RAM twice as effective; it's that macOS's memory compression is twice as good as the real, existing memory compression available on other operating systems.
It's 4-bit quantized (Q4_K_M, 2.5 GB) and still works well for this task. It's amazing. I've been running various small models on this 8 GB Air since the first Llama and GPT-J, and they've improved so much!
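In case anyone wants to try the same thing, a minimal sketch with llama-cpp-python looks roughly like this. The file name and settings are placeholders, not my exact setup:

    # Load a 4-bit (Q4_K_M) GGUF with llama-cpp-python on Apple Silicon.
    from llama_cpp import Llama

    llm = Llama(
        model_path="models/qwen3-4b-q4_k_m.gguf",  # hypothetical filename
        n_ctx=4096,       # keep context modest so the KV cache stays small
        n_gpu_layers=-1,  # offload all layers to Metal
    )

    out = llm.create_completion("Summarize this paragraph:\n...", max_tokens=128)
    print(out["choices"][0]["text"])

Keeping n_ctx small is the main knob on a machine this size, since the KV cache grows linearly with context length.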
macOS virtual memory does a good job of swapping things in and out to the SSD.