
> Are the people who can't spot the flaw just "looking like they are reasoning", but really they just lack the ability to reason?

Lacking relevant information or insight into a topic isn't the same as lacking the ability to reason.

> You can't just spout that an LLM lacks reasoning without first strictly defining what it means to reason.

Perfectly worded definition available on Wikipedia:

    Reason is the capacity of consciously applying logic by drawing conclusions from new or existing information, with the aim of seeking the truth.
"Consciously", "logic", and "seeking the truth" are the operative terms here. A sequence predictor does none of that. Looking at my above example: The sequence "Mike leaves the elevator first" isn't based on logical thought, or a conscious abstraction of the world built from ingesting the question. It's based on the fact that this sequence has statistically a higher chance to appear after the sequence representing the question.

How does our reasoning work? How do humans answer such a question? By building an abstract representation of the world based on the meaning of the words in the question. We can imagine Mike and Jenny in that elevator, we can imagine the elevator moving, we know what the floor numbers mean in that environment, and we understand what "something is higher up" means. From all this we build a model and draw conclusions.

How does the "reasoning" in the LLM work? It checks which tokens are likely to appear after another sequence of tokens. It does so by having learned how we like to build sequences of tokens in our language. That's it. There is no modeling of the situation going on, just stochastic analysis of a sequence.

Consequently, an LLM cannot "seek truth" either. If a sequence has a high chance of appearing in a position, it doesn't matter whether it is factually true or even logically sound. The model isn't trained on "true or false". It will likely say true things more often than not, but not because it understands truth: the training data simply contain a lot of token sequences that, when interpreted by a human mind, state true things.

Lastly, imagine trying to apply a language model to an area that depends completely on reasoning in the above sense: modeling the world based on observations and drawing new conclusions from that model.

https://www.spiceworks.com/tech/artificial-intelligence/news...




You must have missed the part where I said:

> Until we can come up with hard metrics that define these terms, nobody is correct when they spout their own nonsense that somehow proves the LLM doesn't fit into their specific definition of fill in the blank.

"Consciously", "logic", and "seeking the truth" are not objectively verifiable metrics of any kind.

I'll repeat what I said: until we come up with hard metrics that define these terms, nobody can be correct. I'll take Investopedia's definition of what a metric means, as it captures the idea I was getting at most succinctly:

> Metrics are measures of quantitative assessment commonly used for assessing, comparing, and tracking performance or production.[0]

So, until we can quantitatively assess how an LLM performs compared to a human on "consciousness", "logic", and "seeking the truth", whatever ambiguous definition you throw out there will neither confirm nor deny that an LLM embodies these traits the way a human does.

[0]: https://www.investopedia.com/terms/m/metrics.asp


To elaborate a bit on my own post here:

The sequence "Mike leaves the elevator first" has a high statistical probability. The sequence "Jenny leaves the elevator first" has a lower probability that that. But it probably has still a much higher probability than "Michael is standing on the Moon", which in turn may be more likely than "Car dogfood sunshine Javascript", which is still probably more likely than "snglub dugzuvutz gummmbr ha tcha ding dong".

Note that none of these sequences are wrong in the world of a language model. They are just increasingly unlikely to occur in that position. To us, with our ability to reason by logically drawing conclusions from an abstract internal model of the world, all these other sequences represent either false statements or nonsensical word salad.
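For what it's worth, you can watch that ranking happen by scoring each candidate sentence with a language model. This is only a sketch, again assuming GPT-2 via transformers; the point is just that every sequence gets some probability, and the nonsense ones are merely assigned less of it:

    # Minimal sketch: score whole sentences by summed log-probability.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    model.eval()

    def sequence_logprob(text: str) -> float:
        """Sum of log P(token | preceding tokens) over the whole string."""
        ids = tokenizer(text, return_tensors="pt").input_ids
        with torch.no_grad():
            logits = model(ids).logits
        # Position i predicts token i+1, so shift targets by one.
        log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
        targets = ids[0, 1:]
        return log_probs.gather(1, targets.unsqueeze(1)).sum().item()

    candidates = [
        "Mike leaves the elevator first.",
        "Jenny leaves the elevator first.",
        "Michael is standing on the Moon.",
        "Car dogfood sunshine Javascript.",
    ]
    for c in candidates:
        print(f"{sequence_logprob(c):9.2f}  {c}")

Higher (less negative) scores mean "more likely to occur"; nowhere in that computation is there a notion of "true" or "false".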



