I'm a bit surprised it gets this question wrong (ChatGPT gets it right, even on instant). All the pre-reasoning models failed this question, but it has seemed solved since o1, and Sonnet 4.5 got it right.
Yeah, earlier in the GPT days I felt like this was a good example of LLMs being "a blurry jpeg of the web": you could give them something very close to a puzzle that commonly exists on the web, and they'd regurgitate an answer from that training set. It was neat to see the question get solved consistently by the reasoning models (though often only after churning through a bunch of tokens trying and verifying, at times counting 888 + 88 + 8 + 8 + 8 as nine digits).
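For what it's worth, the sum above does work out: five terms totaling 1000, written with eight 8s (3 + 2 + 1 + 1 + 1 digits), not nine. A trivial check, assuming the standard "eight 8s make 1000" form of the puzzle:

```python
# Sanity check on the classic puzzle: make 1000 from eight 8s.
terms = [888, 88, 8, 8, 8]

total = sum(terms)
digit_count = sum(len(str(t)) for t in terms)

print(total, digit_count)  # → 1000 8
```

So the reasoning models' repeated re-verification was spent confirming a count that a one-liner settles.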
I wonder if it's a temperature thing, or if things are being throttled up or down by time of day. I was signed in to a paid Claude account when I ran the test.
https://claude.ai/share/876e160a-7483-4788-8112-0bb4490192af
This was Sonnet 4.6 with extended thinking.