meatmanek 24 days ago | on: Qwen3-4B-Thinking-2507

Reasoning models do a lot better on AIME than non-reasoning models, with o3-mini getting 85% and 4o-mini getting 11%. It makes some sense that this would apply to small models as well.