Hacker News
msoad
9 months ago
on:
GPT-4.5 or GPT-5 being tested on LMSYS?
Prompt: my mother's sister has two brothers. each of her siblings have at least one child except for the sister that has 3 children. I have four siblings. How many grandchildren my grandfather has? Answer only with the result (the number)
ChatGPT4: 13
Claude3 Opus: 10 (correct)
GPT2-Chatbot: 15
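For reference, here is a rough sketch of the arithmetic behind the answer 10. It rests on one reading of the prompt, and every part of that reading is an assumption: "the sister" is the aunt and has exactly 3 children, the grandfather is the maternal one, the mother has no siblings beyond the aunt and the two brothers, and each brother has exactly one child (the minimum that "at least one" allows). The function and parameter names are made up for illustration.

```python
def count_grandchildren(my_siblings=4, aunt_children=3, uncle_children=(1, 1)):
    """Count the maternal grandfather's grandchildren under one reading of the prompt.

    Assumptions (not stated unambiguously in the prompt):
      - the mother's only siblings are one sister and two brothers
      - the sister (the aunt) has exactly 3 children
      - uncle_children holds each brother's child count, at least 1 each
    """
    if any(n < 1 for n in uncle_children):
        raise ValueError("each brother has at least one child")
    mother_children = my_siblings + 1  # the speaker plus four siblings
    return mother_children + aunt_children + sum(uncle_children)


# Minimal reading: each brother has exactly one child.
print(count_grandchildren())  # 5 + 3 + 1 + 1 = 10
```

Under this reading, 10 is only a lower bound: passing larger values in `uncle_children` raises the total, so the answer is unique only if "at least one" is taken to mean "exactly one".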
msoad
9 months ago
After removing "Answer only with the result", all models answer this correctly by using chain-of-thought reasoning.
7734128
9 months ago
It's impossible to answer. "At least one child" could mean many more than one.
Also there could be more sisters.