  > I think your "advanced animatronic" is a duck until you can devise a test that cleanly separates it from a "real duck". A test of "duckness".
I think you're reaching and willfully misinterpreting.

  > You will not be able to do so.
I am unable to because any test I make, or that any other researcher in the field makes, will result in an answer you don't like.

River crossing puzzles are a common test. Sure, humans sometimes fail even the trivial variations, but the important part is how they fail. Humans guess the wrong answer. LLMs will tell you the wrong answer while describing steps that are correct and that lead to a different answer. It's the inconsistency and contradiction that demonstrate the lack of reasoning and intelligence, not the failure itself.
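
For what it's worth, these puzzles are tiny as search problems, so you can brute-force the whole state space and mechanically check whether a listed sequence of crossings actually reaches the goal. Here's a rough Python sketch of the classic wolf/goat/cabbage version (the names and structure are just illustrative, not taken from any particular benchmark):

    # Rough sketch: brute-force the classic wolf/goat/cabbage puzzle so a
    # stated sequence of crossings can be checked against the stated answer.
    from collections import deque

    ITEMS = ("farmer", "wolf", "goat", "cabbage")

    def safe(left):
        # `left` is the set of things on the starting bank; the rest are across.
        for bank in (left, frozenset(ITEMS) - left):
            if "farmer" not in bank and (
                {"wolf", "goat"} <= bank or {"goat", "cabbage"} <= bank
            ):
                return False
        return True

    def solve():
        start, goal = frozenset(ITEMS), frozenset()
        queue, seen = deque([(start, [])]), {start}
        while queue:
            state, path = queue.popleft()
            if state == goal:
                return path
            # The farmer rows from whichever bank they are on, alone or with one item.
            here = state if "farmer" in state else frozenset(ITEMS) - state
            for passenger in [None] + [x for x in here if x != "farmer"]:
                moved = {"farmer"} | ({passenger} if passenger else set())
                nxt = state - moved if "farmer" in state else state | moved
                if safe(nxt) and nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, path + [passenger or "nothing"]))

    print(solve())  # e.g. ['goat', 'nothing', 'wolf', 'goat', 'cabbage', 'nothing', 'goat']

Running it prints a valid 7-crossing plan; the same walk over the same handful of states is all it takes to spot when a model's listed steps and its final answer don't agree.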




>I think you're reaching and willfully misinterpreting.

That's the spirit of the popular phrase, isn't it? I genuinely never read that phrase and thought that literally only those three properties were the bar.

>I am unable to because any test I make, or that any other researcher in the field makes, will result in an answer you don't like.

This is kind of an odd response. I mean maybe. I'll be charitable and agree wholeheartedly.

But I don't know what my disappointment has to do with anything. I mean, if I could prove something I deeply believed to be true, and which would also be paper-worthy, I certainly wouldn't let the reactions of an internet stranger stop me from doing it.

>River crossing puzzles are a common test. Sure, humans sometimes fail even the trivial variations, but the important part is how they fail. Humans guess the wrong answer. LLMs will tell you the wrong answer while describing steps that are correct and that lead to a different answer. It's the inconsistency and contradiction that demonstrate the lack of reasoning and intelligence, not the failure itself.

Inconsistency and contradiction between the reasoning people state (and even believe) and the decisions they make is such a common staple of human reasoning that we have a name for it... At worst, you could say these contradictions don't always take the same form, but that just loops back to my original point.

Let me be clear here: if you want to look at results like these and say "there is room for improvement", then great, I agree. But it certainly feels like a lot of people have a standard of reasoning (for machines) that only exists in fiction or in their own imaginations. This general reasoning engine that makes neither mistakes nor contradictions in output or process does not exist in real life, whether you believe humans are the only beings capable of reasoning or are gracious enough to extend this capability to some of our animal friends.

Also, I've seen LLMs fail trivial variations of said logic puzzles, only to get them right when you present the problem in a way that doesn't look exactly like the logic puzzle they've almost certainly memorized. Sometimes it's as simple as changing the names involved. Isn't that fascinating? Humans have a similar cognitive shortcoming.



