
The SWE-Bench score is disappointing, not because it is lower than Claude's, but because improving in all the other domains of knowledge didn't help it. So does this mean that this is actually an MoE model in the sense that one expert doesn't talk to the other?

