
Deepseek Coder v2 and Qwen2 are both great at 32k context. I can't tell the difference between those models at 8k and at a fully utilised 32k. The difference in quality between them and 8k models when doing codegen is night and day. Not to mention that many of the little 8k models also use sliding-window attention with a 4k window, which essentially makes them 4k models.
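To make the sliding-window point concrete, here is a minimal sketch (not any particular model's actual implementation; the function name and numbers are illustrative) of a causal sliding-window attention mask. It shows why a 4k window means a token can never attend directly past the last 4096 positions, even if the advertised context is 8k:

    # Illustrative sketch of a causal sliding-window attention mask.
    # Assumes a window of 4096 over an 8192-token context.
    import numpy as np

    def sliding_window_causal_mask(seq_len: int, window: int) -> np.ndarray:
        """True where query position i may attend to key position j."""
        i = np.arange(seq_len)[:, None]   # query positions
        j = np.arange(seq_len)[None, :]   # key positions
        causal = j <= i                   # no attending to the future
        in_window = (i - j) < window      # only the last `window` tokens
        return causal & in_window

    mask = sliding_window_causal_mask(seq_len=8192, window=4096)
    print(mask[8191, 0])      # False: the last token can't see the start
    print(mask[8191, 4095])   # False: just outside the 4096-token window
    print(mask[8191, 4096])   # True:  within the last 4096 positions

Information from the start of an 8k prompt can only reach the end indirectly, relayed across layers, which is why such models behave more like 4k models in practice.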


I agree, they're exceptional models; however, this cannot be said of all models that boast a large context window.



