
This is pretty exciting. Now an organization could produce an open-weights mixture-of-experts model with 8-15B active parameters but 500B+ total parameters, and with INT4 quantization it could be run locally at very fast speeds (rough sizing sketch below). DeepSeek R1 is a similar model, but with over 30B active parameters it is a little slow.

I do not have a good sense of how well quality scales with narrow MoEs, but even if we only get something like Llama 3.3 70B quality at 8B active parameters, people could do a ton locally.
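
To give a rough sense of why the active parameter count matters so much for local inference, here is a minimal back-of-the-envelope sketch. It assumes ~4.5 bits per weight for INT4 (to account for quantization scales), a 400 GB/s memory bandwidth figure, and that single-stream decode is memory-bandwidth bound; these are illustrative assumptions, not benchmarks of any particular model or machine.

  # Rough sizing sketch for a sparse MoE run locally at INT4.
  # Parameter counts, bandwidth, and bits/param below are illustrative
  # assumptions, not measurements of any specific model or hardware.

  def weight_memory_gb(total_params_b: float, bits_per_param: float = 4.5) -> float:
      """Approximate weight memory in GB for a given total parameter count.

      ~4.5 bits/param is an assumption for typical INT4 formats once
      quantization scales/zero-points are included.
      """
      return total_params_b * 1e9 * bits_per_param / 8 / 1e9


  def decode_tokens_per_sec(active_params_b: float,
                            mem_bandwidth_gb_s: float,
                            bits_per_param: float = 4.5) -> float:
      """Bandwidth-bound estimate of single-stream decode speed.

      Each generated token must read roughly the active parameters once,
      so tokens/sec ~= memory bandwidth / bytes of active weights.
      Ignores KV cache, routing overhead, and compute limits.
      """
      active_bytes_gb = active_params_b * 1e9 * bits_per_param / 8 / 1e9
      return mem_bandwidth_gb_s / active_bytes_gb


  if __name__ == "__main__":
      bandwidth = 400  # GB/s, e.g. a high-end local setup (assumption)

      for name, total_b, active_b in [
          ("hypothetical 500B MoE, 8B active", 500, 8),
          ("DeepSeek-R1-class, 671B total, 37B active", 671, 37),
      ]:
          mem = weight_memory_gb(total_b)
          tps = decode_tokens_per_sec(active_b, bandwidth)
          print(f"{name}: ~{mem:.0f} GB weights at INT4, ~{tps:.0f} tok/s")

On those assumptions the 500B-total / 8B-active model needs roughly 280 GB for weights but could decode on the order of 90 tok/s, while a ~37B-active model is bandwidth-limited to around 20 tok/s, which is the "a little slow" effect described above.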


