What's amazing to see is the effort to run these models on consumer-grade hardware, going as far as running 4-bit quantized models on phones or a Raspberry Pi. The whole debacle around the mmap optimizations to llama.cpp [1], and the way they were committed to the project, is a great testament to open source, both in its positive aspects (rapid progress) and its negative ones (visibility affecting human judgement and collaboration). The sheer amount of experimentation is also producing a de facto standard interface through which different models can easily be integrated and tried out.
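For anyone wondering why mmap mattered so much for load times, here is a minimal C sketch of the general technique, not llama.cpp's actual code, with "model.bin" as a placeholder path: mapping the weights file into the address space means pages are faulted in on demand and shared across processes, so startup no longer copies gigabytes through read().

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void) {
        int fd = open("model.bin", O_RDONLY);  /* placeholder weights file */
        if (fd < 0) { perror("open"); return 1; }

        struct stat st;
        if (fstat(fd, &st) != 0) { perror("fstat"); return 1; }

        /* Read-only, copy-on-write view of the file: no upfront copy,
           pages are loaded lazily by the kernel as tensors are touched. */
        void *weights = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (weights == MAP_FAILED) { perror("mmap"); return 1; }
        close(fd);  /* the mapping stays valid after the fd is closed */

        printf("mapped %lld bytes at %p\n", (long long)st.st_size, weights);
        /* ... tensor pointers would reference this mapping directly ... */

        munmap(weights, st.st_size);
        return 0;
    }

A side effect worth noting: because the mapping is backed by the page cache, a second process (or a restart) that maps the same file "loads" almost instantly.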
[1] https://github.com/ggerganov/llama.cpp