
I think the primary thing you're missing is that Qwen3-235B-A22B-Instruct-2507 != Qwen3-Coder-480B-A35B-Instruct. Both get tons of code RL, but for the Coder model they don't monitor performance on anything else for forgetting or regression. Its post-training pipeline focuses entirely on code, and the model is not meant for other tasks.

