phi4 is 23%
deepseek r1 qwen distilled 32b is 27%
llama 3.3 70b is 29%, same as llama 4 scout
gpt 4o is 31%
gpt 4.1 is 45%
qwen3 32b reasoning 55%!! I'd expect qwen3 coder 30b to be around here?
kimi k2 55%
qwen3 coder 480b 58%
claude 4 around 60%
nemotron 49b 74%!!
glm 4.5 358b 74%
exaone 4 32b reasoning 74%!!
deepseek r1 685b 75%
grok 4, o4-mini, gemini 2.5 pro: 80%