

DebtDeflation on Feb 11, 2025 | on: DeepScaleR: Surpassing O1-Preview with a 1.5B Mode...

>Every other LLM I've tried, including o3-mini-high: Fill the 12-liter jug completely. Pour it into the 6 liter jug.

Try it with a 12L jug and a 4L jug and ask for 4L. See whether it tells you to just fill the 4L, or to fill the 12L and pour into the 4L twice, discarding both times, so that 4L remains in the 12L jug. Even though the second answer is still technically correct, it demonstrates that there's no real "reasoning" happening, just regurgitation of training data.
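For anyone who wants to check what the shortest answer to these jug puzzles actually is, here is a minimal sketch of a breadth-first search over jug states. It is not from the thread: the function name `shortest_pour_plan` and the move labels are hypothetical, and it assumes the only allowed moves are fill, empty, and pour one jug into another.

```python
from collections import deque


def shortest_pour_plan(capacities, target):
    """Return the shortest list of moves that leaves `target` liters in some jug.

    Hypothetical helper for illustration. Returns None if the target is unreachable.
    """
    start = tuple(0 for _ in capacities)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, moves = queue.popleft()
        if target in state:
            return moves
        for i, cap in enumerate(capacities):
            candidates = []
            if state[i] < cap:
                # Fill jug i to the brim.
                filled = list(state)
                filled[i] = cap
                candidates.append((tuple(filled), f"fill {cap}L"))
            if state[i] > 0:
                # Empty jug i onto the ground.
                emptied = list(state)
                emptied[i] = 0
                candidates.append((tuple(emptied), f"empty {cap}L"))
                # Pour jug i into any other jug that still has room.
                for j, other_cap in enumerate(capacities):
                    if j != i and state[j] < other_cap:
                        amount = min(state[i], other_cap - state[j])
                        poured = list(state)
                        poured[i] -= amount
                        poured[j] += amount
                        candidates.append(
                            (tuple(poured), f"pour {cap}L into {other_cap}L")
                        )
            for nxt, label in candidates:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, moves + [label]))
    return None


print(shortest_pour_plan((12, 4), 4))
print(shortest_pour_plan((12, 6), 6))
```

Under those assumptions, both the 12L/4L and the 12L/6L versions come back with a single move: fill the smaller jug. Anything longer is a valid plan but not the shortest one.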

CamperBob2 12 months ago

(Shrug) R1 has no problem with that. To the extent it's confused, it is only because it is wondering if it's a trick question.

CoT reasoning: https://i.imgur.com/rjNmTGZ.png

Answer: https://i.imgur.com/WfAVeZQ.png

There's 'mindless regurgitation' going on here, but not by the AI model.
