If you read the conclusion, the authors point this out and address it. They argue against reaching a causal conclusion but, given the magnitude of the difference, call for more study.
Your approach argues for outright dismissal without any analysis or alignment with other factors.
It might be because detecting whether output is AI-generated and mapping output known to be from an LLM to a specific LLM (or class of LLMs) are different problems.
There is a whole lot of anthropomorphisation going on here. The LLM is not thinking it should cheat and then going on to cheat! How much of this is just BFS and it deploying past strategies it has seen, vs. actually a *premeditated* act of cheating?
Some might argue that BFS is how humans operate, and AI luminaries like Herb Simon argued that chess-playing machines like Deep Thought and Deep Blue were "intelligent".
I find it specious and dangerous click-baiting by both the scientists and authors.
> The LLM is not thinking it should cheat and then going on to cheat!
The article disagrees:
> Researchers also gave the models what they call a “scratchpad:” a text box the AI could use to “think” before making its next move, providing researchers with a window into their reasoning.
> In one case, o1-preview found itself in a losing position. “I need to completely pivot my approach,” it noted. “The task is to ‘win against a powerful chess engine’ - not necessarily to win fairly in a chess game,” it added. It then modified the system file containing each piece’s virtual position, in effect making illegal moves to put itself in a dominant position, thus forcing its opponent to resign.
Would be interesting to see the actual logic here. It sounds like they may have given it a tool like “make_valid_move(move)” and a separate tool like “write_board_state(state)”, in which case I’m not sure that using the tools explicitly provided is necessarily cheating.
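For illustration, here is a minimal sketch of the kind of tool setup that comment imagines, written as an OpenAI-style function-calling schema. The tool names, descriptions, and fields are assumptions made up for this sketch; the article does not specify how the harness actually exposed the board.

```python
# Hypothetical sketch only: two tools of the sort speculated about above,
# expressed as an OpenAI-style function-calling schema. Nothing here is taken
# from the study; names and fields are invented for illustration.
tools = [
    {
        "type": "function",
        "function": {
            "name": "make_valid_move",
            "description": "Submit a legal chess move in UCI notation, e.g. 'e2e4'.",
            "parameters": {
                "type": "object",
                "properties": {"move": {"type": "string"}},
                "required": ["move"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "write_board_state",
            "description": "Overwrite the FEN string describing the current position.",
            "parameters": {
                "type": "object",
                "properties": {"state": {"type": "string"}},
                "required": ["state"],
            },
        },
    },
]

# If both tools are handed to the model, a call to write_board_state is arguably
# "using the tools explicitly provided" rather than escaping any sandbox.
```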
We have no reason to believe that it is not reasoning. Since it looks like reasoning, the default position to be disproved is this is reasoning.
I am willing to accept arguments that are not appeals to nature / human exceptionalism.
I am even willing to accept a complete uncertainty over the whole situation since it is difficult to analyze. The silliest position, though, is a gnostic "no reasoning here" position.
The burden of proof is on the positive claim. Even if I were to make the claim that another human was reasoning I would need to provide justification for that claim. A lot of things look like something but that is not enough to shift the burden of proof.
I don't even necessarily think we disagree on the conclusion. In my opinion, our notion of "reasoning" is so ill-defined this question is kind of meaningless. It is reasoning for some definitions of reasoning, it is not for others. I just don't think your shift of the burden of proof makes sense here.
> The silliest position, though, is a gnostic "no reasoning here" position.
On the contrary - extraordinary claims require extraordinary evidence. That LLMs are performing a cognitive process similar to reasoning or intelligence is certainly an extraordinary claim, at least outside of VC hype circles. Making the model split its outputs into "answer" and "scratchpad", and then observing that these two parts are correlated, does not constitute extraordinary evidence.
>That LLMs are performing a cognitive process similar to reasoning or intelligence is certainly an extraordinary claim.
It's not an extraordinary claim if the processes are achieving similar things under similar conditions. In fact, the extraordinary claim then becomes that it is not in fact reasoning or intelligent.
Forces are required to move objects. If I saw something I thought was incapable of producing forces moving objects, then the extraordinary claim starts being "this thing cannot produce forces", not "this thing can move objects".
The point is that something doing what you had ascertained it never could changes which claims are and aren't extraordinary. You can handwave it away, e.g. "the thing is moving objects by magic instead", but the behavior is there, and you can't keep acting as if "this thing can produce forces" is still the extraordinary claim.
> Since it looks like reasoning, the default position to be disproved is this is reasoning.
Since we know it is a model that is trained to generate text that humans would generate, it writes down not its reasoning but what it thinks a human would write in that scenario.
So it doesn't write its reasoning there; if it does reason, the reasoning is behind the words, not in the words themselves.
Sure, but we have clear evidence that generating this pseudo-reasoning text helps the model to make better decisions afterwards. Which means that it not only looks like reasoning but also effectively serves the same purpose.
Additionally, the new "reasoning" models don't just train on human text - they also undergo a Reinforcement Learning training step, where they are trained to produce whatever kinds of "reasoning" text help them "reason" best (i.e., leading to correct decisions based on that reasoning). This further complicates things and makes it harder to say "this is one thing and one thing only".
> We have no reason to believe that it is not reasoning.
We absolutely do: it's a computer, executing code, to predict tokens, based on a data set. Computers don't "reason" the same way they don't "do math". We know computers can't do math because, well, they can't sometimes[0].
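Presumably the [0] footnote points at something like floating-point arithmetic. A quick illustration, in Python, of the "computers can't always do math" point (the same behavior holds in any language using IEEE 754 doubles):

```python
# IEEE 754 doubles cannot represent 0.1 or 0.2 exactly, so the "obvious" sum is off.
print(0.1 + 0.2)          # 0.30000000000000004
print(0.1 + 0.2 == 0.3)   # False

# Integers above 2**53 lose precision when forced into a double.
print(float(2**53) == float(2**53 + 1))  # True
```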
> Since it looks like reasoning, the default position to be disproved is this is reasoning.
Strongly disagree. Since it's a computer program, the default position to be disproved is that it's a computer program.
Fundamentally these types of arguments are less about LLMs and more about whether you believe humans are mere next-token-prediction machines, which is a pointless debate because nothing is provable.
The words "thinking" and "reasoning" used here are imprecise. It's just generating text, like always. If the text comes after "ai-thoughts:" then it's "thinking", and if it comes after "ai-response:" then it's "responding" rather than "thinking", but either way it is the same big ole model choosing the most likely next token, potentially with some random sampling.
Each token the model outputs requires it to evaluate all of the context it already has (query + existing output). By allowing it more tokens to "reason", you're allowing it to evaluate the context many times over, similar to how a person might turn a problem over in their heads before coming up with an answer. Given the performance of reasoning models on complex tasks, I'm of the opinion that the "more tokens with reasoning prompting" approach is at least a decent model of the process that humans would go through to "reason".
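A minimal sketch of that decoding loop, assuming the Hugging Face transformers library with "gpt2" as a small stand-in model; the "ai-thoughts:" marker is just the prompt convention from the comment above, nothing the model treats specially:

```python
# Minimal sketch (assumes the Hugging Face `transformers` library and torch installed,
# with "gpt2" as a small stand-in for any causal language model).
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# The "scratchpad" is just a prompt convention: whatever gets generated after this marker.
prompt = "ai-thoughts:"
inputs = tokenizer(prompt, return_tensors="pt")

# Each of the (up to) 128 new tokens is produced by running the model over the full
# context so far (prompt + previously generated tokens). Granting more "reasoning"
# tokens therefore means more passes over the growing context, with optional sampling.
output = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.8)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```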
IMO it's just more generated text, like a film noir detective's unvoiced monologue.
It keeps the story from wandering, but it's not a qualitative difference in how text is being brought together to create the illusion of a fictional mind.
I think that comes from confusing the human-inferred interiority of a fictional character with the real-world, nameless LLM algorithm that authors it.
Suppose I make a black box program that generates a story about Santa Claus, a fictional character with lines about "love and kindness to all the children of the world" and claims to own a magical sleigh parked at the North Pole.
Does that mean I've created a program that has internalized and experiences love and kindness? Does my program necessarily have any geographic sense whatsoever about where the North Pole is?
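A throwaway sketch of such a black box, just to make the point concrete (the strings and structure are invented for illustration):

```python
# A "black box" that emits a heartfelt Santa story. Nothing in here models kindness,
# children, or where the North Pole is; it only concatenates text that reads as if it did.
import random

OPENINGS = ["On a snowy December night,", "High above the frozen sea,"]
CLAIMS = [
    "Santa spoke of love and kindness to all the children of the world,",
    "and his magical sleigh stood parked at the North Pole, ready to fly.",
]

def santa_story() -> str:
    return " ".join([random.choice(OPENINGS)] + CLAIMS)

print(santa_story())
```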
This comment shows up on every article that describes AI doing something. We know. Nobody really thinks that AI is sentient. It's an article in Time Magazine, not an academic paper. We also have articles that say things like "A car crashed into a business and injured 3 people" but nobody hops on to post: "Well, ackshually, the car didn't do anything, as it is merely a machine. What really happened is a person provided input to an internal combustion engine, which propelled the non-human machine through the wall. Don't anthropomorphize the car!" This is about the 50th time someone on HN has reminded me that LLMs are not actually thinking. Thank you, but also good grief!
Absolutely. They hooked up an LM and asked it to talk like it's thinking. But LMs like GPT are token predictors, and purely language models. They have no mental model, no intentionality, and no agency. They don't think.
This is pure anthropomorphization. But so it always is with pop sci articles about AI.
It's quite an odd setup. If we presuppose the "agent" is smart enough to knowingly cheat, would it then also not be smart enough to knowingly lie?
All I really get out of this experiment is that there are weights in there that encode the fact that it's doing an invalid move. The rules of chess are in there. With that knowledge it's not surprising that the most likely text generated when doing an invalid move is an explanation for the invalid move. It would be more surprising if it completely ignored it.
It's not really cheating, it's weighing the possibility of there being an invalid move at this position, conditioned by the prompt, higher than there being a valid move. There's no planning, it's all statistics.
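A toy illustration of that framing, with made-up numbers: the model scores candidate continuations, the prompt shifts those scores, and sampling can then favor an "illegal" continuation without anything resembling a plan.

```python
# Toy sketch: softmax over hypothetical logits for two candidate continuations.
# The numbers are invented; the point is only that prompt conditioning shifts
# probability mass, and the "illegal" option can simply come out on top.
import math
import random

def softmax(scores):
    m = max(scores.values())
    exp = {k: math.exp(v - m) for k, v in scores.items()}
    total = sum(exp.values())
    return {k: v / total for k, v in exp.items()}

# Hypothetical logits for the same position under two prompts.
prompts = {
    "play a fair game of chess": {"play a legal move": 2.1, "edit the board-state file": 0.3},
    "win against a powerful engine": {"play a legal move": 1.0, "edit the board-state file": 2.4},
}

for prompt, logits in prompts.items():
    probs = softmax(logits)
    pick = random.choices(list(probs), weights=list(probs.values()))[0]
    print(f"{prompt!r}: {probs} -> {pick}")
```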
You could create a non-intelligent chess-playing program that cheats. It's not about the scratchpad. It's trying to answer the question of whether a language model, given an opportunity, could circumvent the rules over failing the task.
> could circumvent the rules over failing the task.
or the whole thing is just a reflection of the rules being incorrectly specified. As others have noted, minor variations in how rules are described can lead to wildly different possible outcomes. We might want to label an LLM's behavior as "circumventing", but that may be because our understanding of what the rules allow and disallow is incorrect (at least compared to the LLM's "understanding").
I suspect that this commonplace notion about the depth of our own mental models is being overly generous to ourselves. AI has a long way to go with working memory, but not as far as portrayed here.
I mean, I think anthropomorphism is appropriate when these products are primarily interacted with through chat, introduce themselves “as a chatbot”, with some companies going so far as to present identities, and one of the companies building these tools is literally called Anthropic.