
You might want to try caching to a file with mlx.

https://github.com/ml-explore/mlx-examples/pull/956

Edit: here's a quick example for Qwen2.5-1M from an MLX dev:

https://x.com/awnihannun/status/1883611098081099914



That's cool, thank you, but does MLX support the Qwen 1M context yet?


According to the tweet, not the full context yet.



