
Deepseek Coder v2 and Qwen2 are both great at 32k context. I can't tell the difference between those models at 8k and at a fully utilised 32k. The difference in quality between them and 8k models when doing codegen is night and day. Not to mention that many of the little 8k models also use sliding-window attention with a 4k window, which essentially makes them 4k models.
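To make the sliding-window point concrete, here is a minimal sketch (not any particular model's actual implementation; the function name and numbers are illustrative) of a causal sliding-window attention mask. It shows why a 4k window means a token can never attend directly past the last 4096 positions, even if the advertised context is 8k:

    # Illustrative sketch of a causal sliding-window attention mask.
    # Assumes a window of 4096 over an 8192-token context.
    import numpy as np

    def sliding_window_causal_mask(seq_len: int, window: int) -> np.ndarray:
        """True where query position i may attend to key position j."""
        i = np.arange(seq_len)[:, None]   # query positions
        j = np.arange(seq_len)[None, :]   # key positions
        causal = j <= i                   # no attending to the future
        in_window = (i - j) < window      # only the last `window` tokens
        return causal & in_window

    mask = sliding_window_causal_mask(seq_len=8192, window=4096)
    print(mask[8191, 0])      # False: the last token can't see the start
    print(mask[8191, 4095])   # False: just outside the 4096-token window
    print(mask[8191, 4096])   # True:  within the last 4096 positions

Information from the start of an 8k prompt can only reach the end indirectly, relayed across layers, which is why such models behave more like 4k models in practice.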


I agree, they're exceptional models; however, this cannot be said of all models that boast a large context window.



