
Is it official then?

Most of us have been waiting for this moment for a while. The transformer architecture as it is currently understood can't be milked any further. Many of us have suspected this since last year, and GPT-5's delays eventually led even non-tech voices to suggest likewise. But we all withheld a final verdict until the next big release from OpenAI, since Sam Altman has been making claims about AGI entering the workforce this year, OpenAI knowing how to build AGI and similar outlandish claims. We all knew that their next big release in 2025 would be the deciding factor on whether they had some tech breakthrough that would upend the world (justifying their astronomical valuation) or whether it would just be (slightly) more of the same (marking the beginning of their downfall).

The GPT-4.5 release points towards the latter. Thus, we should not expect OpenAI to exist as it does now (AI industry leader) in 2030, assuming it does exist at all by then.

However, just as with the 19th-century railway boom, the fall of OpenAI will leave behind a very useful technology that, while not catapulting humanity towards a singularity, will nonetheless make people's lives better. Not much consolation to the world's super rich, who will lose tons of money once the LLM industry (let us remember that AI is not just LLMs) falls.

EDIT: "will nonetheless make people's lives better" to "might nonetheless make some people's lives better"




It's worth pointing out that GPT-4.5 seems focused on better pre-training and doesn't include reasoning.

I think GPT-5 - if/when it happens - will be 4.5 with reasoning, and as such it will feel very different.

The barrier is the computational cost. Once 4.5 gets down to costs similar to 4.0 - which could be achieved through various optimization steps (whatever happened to the ternary stuff published last year that promised you could go many times faster without expensive GPUs?) plus better/cheaper/more efficient hardware - you can throw reasoning into the mix and suddenly have a major step up in capability.
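
If it helps, I assume the "ternary stuff" is the BitNet-style 1.58-bit work, where weights are constrained to {-1, 0, +1}. A rough sketch of that kind of absmean ternary quantization (illustrative only; the actual papers quantize during training and rely on custom kernels to get the speed win on cheap hardware):

    import torch

    def ternary_quantize(w: torch.Tensor, eps: float = 1e-5):
        # Map a weight tensor to values in {-1, 0, +1} plus one scale factor.
        scale = w.abs().mean() + eps            # per-tensor "absmean" scale
        w_q = (w / scale).round().clamp(-1, 1)  # ternary weights
        return w_q, scale

    w = torch.randn(4, 8)
    w_q, scale = ternary_quantize(w)
    w_hat = w_q * scale                         # dequantized approximation
    print(w_q.unique())                         # typically tensor([-1., 0., 1.])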

I am a user, not a researcher or builder. I do think we're in a hype bubble, I do think that LLMs are not The Answer, but I also think there is more mileage left in this path than you seem to think. I think automated RL (not RLHF), reasoning, and better/optimal architectures and hardware mean there is a lot more we can get out of the stochastic parrots yet.


Is it fair to still call LLMs stochastic parrots now that they are enriched with reasoning? Seems to me that the simple procedure of large-scale sampling + filtering makes it immediately plausible to get something better than the training distribution out of the LLM. In that sense the parrot metaphor seems suddenly wrong.
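
To make the sampling + filtering point concrete, the minimal version is best-of-N: draw many candidates and keep the one a verifier likes best. A toy sketch (generate and score here are stand-ins for a sampled LLM call and a reward model / unit test, not any real API):

    import random

    def generate(prompt: str) -> str:
        # stand-in for one stochastic sample from the model
        return f"{prompt} -> candidate {random.randint(0, 9999)}"

    def score(candidate: str) -> float:
        # stand-in for a verifier, unit test, or reward model
        return random.random()

    def best_of_n(prompt: str, n: int = 16) -> str:
        candidates = [generate(prompt) for _ in range(n)]
        return max(candidates, key=score)

If the filter is even slightly better than chance, the selected output can beat a typical sample from the base distribution, which is exactly why the plain "parrot" framing feels off to me now.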

I don’t feel like this binary shift is adequately accounted for among the LLM cynics.


it was never fair to call them stochastic parrots and anybody who is paying any attention knows that sequence models can generalize at least partially OOD


OOD = Out-of-Distribution = when a model encounters inputs which differ from data it was trained on.

For anyone else not familiar with the acronym of the day :).


Or equivalently, it vastly underestimates the intelligence of parrots


Anyone who has studied Monte Carlo methods, stochastic differential equations and their applications, and stochastic algorithms never found “stochastic parrot” a pejorative. In a very real way, determinism is a requirement for a small mind that can’t get comfortable with, or understand, advanced probability theory and its applications.


anyone who read the papers where the term was introduced knows it was clearly intended as a pejorative.

i’m not sure if you intended to call those upthread small-minded


It's weird to see this section of people wanting fairness for LLMs.

If it makes you feel better, I'd say the Eliza Effect is good evidence humans have a lot of "stochastic parrot" in them also. And there's no reason that being a stochastic parrot means something can't generalize.

The thing with these terms is that LLMs are distinctly new things. Even blind men looking at elephants can improve their performance with good terminology and by listening to each other. "Effective searchers", "question answerers" and "stochastic parrots" are useful terms just 'cause they describe concrete behaviors - notably, "stochastic parrots" gives some idea of the "no particular goal" quality of LLMs (they will happily be Nazis, pacifists or communists given the proper context). On the other hand, "intelligent" gives no good clues, since humans haven't really defined the term for themselves and it is a synonym for good, worthy or capable (giving the machine a prize rather than looking at it).


I don’t disagree with your comment, but if you read the papers where the term was introduced that is very clearly not what they have in mind with the phrase “stochastic parrot.”


They are not enriched with reasoning, it's just snake oil, I'm afraid.


I'd like to say that with my gut but, at the same time, I've not actually seen a solid definition of what process would constitute reasoning such that I could say "and this could never be it in any way!". If anything, "an iterative noisy search over similar outputs" now feels like at least a big part of what the process of reasoning might need to involve.


> The barrier is the computational cost. Once 4.5 gets down to costs similar to 4.0

Well, did 4.0 ever become lower cost? On the API side, its cost per token is a factor of 10 higher than 4o, even though 4o is considered the better model.

I think 4.5 may just be retired wholesale, or perhaps replaced by a more efficient model derived from it - a 4.5-mini or something like that.


I'm not convinced that LLMs in their current state are really making anyone's lives much better, though. We need more research into applications of this technology for that to become apparent. Polluting the internet with regurgitated garbage produced by a chat bot does not benefit the world. Increasing the productivity of software developers does not help the world much either. Solving more important problems should be the priority for this type of AI research & development.


The explosion of garbage content is a big issue and has radically changed the way I use the web over the past year: Google and DuckDuckGo are not my primary tools anymore. Instead I now use specialized search engines more and more. For example, if I am looking for something I believe can be found on someone's personal blog I use Marginalia or Mojeek; for software issues I use GitHub's search; for general info I go straight to Wikipedia; for tech reviews, HN's Algolia; etc.

It might sound a bit cumbersome but it's actually super easy if you assign search keywords in your browser: for instance if I am looking for something on GitHub I just open a new tab on Firefox and type "gh tokio".


LLMs have been extremely useful for me. They are incredibly powerful programmers, from the perspective of people who aren't programmers.

Just this past week Claude 3.7 wrote a program for us to quickly modernize ancient (1990s) proprietary manufacturing machine files into contemporary automation files.

This allowed us to forgo a $1k/yr/user proprietary software package that would be able to do the same. The program Claude wrote took about 30 mins to make. Granted the program is extremely narrow in scope, but it does the one thing we need it to do.

This marks the third time I (a non-programmer) have used an LLM to create software that my company uses daily. The other two are a test system made by GPT-4 and an Android app made by a mix of 4o and Claude 3.5.

Bumpers may be useless and laughable to pro bowlers, but a godsend to those who don't really know what they are doing. We don't need to hire a bowler to knock over pins anymore.


Being able to quickly get a script for some simple automation, defining source and target formats in plain English, has been a huge help. There is simply no way I'm going to remember all that stuff as someone who doesn't program regularly, so the previous way to deal with it was to do it all manually, which was quicker than doing remedial Python just to forget it all again.


I've also been toying with Claude Code recently and I (an eng, ~10 yrs) think these tools are useful for pair programming the dumb work.

E.g. as I've been trying Claude Code I still feel the need to babysit it on my primary work, so I'd rather do that myself. However, if it could sit there while I'm working and monitor it, note fixes, tests and documentation, and then stub them in during breaks, I think there's a lot of time savings to be gained.

I.e. keep it doing simple tasks that it can get right 99% of the time and otherwise keep it out of the way.

I also suspect there's context to be gained in watching the human work. Not learning per se, but understanding the areas being worked on, improving intuition about things the human needs or cares about, etc.

A `cargo clippy --fix` on steroids is "simple" but still really sexy imo.


I think that's great for work and great for corporations. I use AI at my job too, and I think it certainly does increase productivity!

How does any of this make the world a better place? CEOs like Sam Altman have very lofty ideas about the inherent potential "goodness" of higher-order artificial intelligence that, I find, has thus far not been borne out in reality, save for a few specific cases. Useful is not the same as good. Technology is inherently useful; that does not make it good.


> Solving more important problems should be the priority for this type of AI research & development.

Which problem spaces do you think are underserved in this aspect?


As someone who is terrified of agentic ASI, I desperately hope this is true. We need more time to figure out alignment.


I'm not sure this will ever be solved. It requires both a technical solution and social consensus. I don't see consensus on "alignment" happening any time soon. I think it'll boil down to "aligned with the goals of the nation-state", and lots of nation states have incompatible goals.


I agree unfortunately. I might be a bit of an extremist on this issue. I genuinely think that building agentic ASI is suicidally stupid and we just shouldn’t do it. All the utopian visions we hear from the optimists describe unstable outcomes. A world populated by super-intelligent agents will be incredibly dangerous even if it appears initially to have gone well. We’ll have built a paradise in which we can never relax.


What's the difference between your "agentic AIs" and, say, "script kiddies" or "expert anarchist/black-hat hackers"?

It's been obvious for a while that the narrow-waist APIs between things matter, and apparent that agentic AI is leaning into adaptive API consumption. But I don't see how that gives the agentic client some super-power we don't already need to defend against: even before AGI we already have HGI (human general intelligence) motivated to "do bad things" to/through those APIs, both self-interested and nation-state sponsored.

We're seeing more corporate investment in this interplay, trending us towards Snow Crash, but "all you have to do" is have some "I" in API be "dual key, human in the loop" so that even when AGI/HGI "presses the red button" in the Oval Office, nuclear war still doesn't happen, WarGames or Crimson Tide style.

I'm not saying dual key is the answer to everything, I'm saying, defenses against adversaries already matter, and will continue to. We have developed concepts like air gaps or modality changes, and need more, but thinking in terms of interfaces (APIs) in the general rather than the literal gives a rich territory for guardrails and safeguards.


> What's the difference between your "agentic AIs" and, say, "script kiddies" or "expert anarchist/black-hat hackers"?

Intelligence. I'm talking about super-intelligence. If you want to know what it feels like to be intellectually outclassed by a machine, download the latest Go engine and have fun losing again and again while not understanding why. Now imagine an ASI that isn't confined to the Go board, but operating out in the world. It's doing things you don't like at speeds you can scarcely comprehend and there's not a thing you can do about it.


But the world is not a game where you "win" by intelligence; very far from it. Just look at who is currently in the White House.


> Now imagine an ASI that isn't confined to the Go board, but operating out in the world.

I don't think it's reasonable at all to look at a system's capability in games with perfect and easily-ingested information and extrapolate about its future capabilities interacting with the real world. What makes you confident that these problem domains are compatible?


That’s not what I was saying at all. I was using Go as an example of what the experience of being helplessly outclassed by a superior intelligence is like: you are losing and you don’t know why and there’s nothing you can do.


I completely agree with you. Chess/Go/Poker have shown that these systems can become so advanced that it becomes impossible for a human to understand why the AI chose a move.

Talk to the best chess players in the world and they'll tell you flat out they can't begin to understand some of the engine's moves.

It won't be any different with ASI. It will do things for reasons we are incapable of understanding. Some of those things will certainly be harmful to humans.


> What's the difference between your "agentic AIs" and, say, "script kiddies" or "expert anarchist/black-hat hackers"?

The difference is that a highly intelligent human adversary is still limited by human constraints. The smartest and most dangerous human adversary is still one we can understand and keep up with. AI is a different ball game. It's more similar to the difference in intelligence between a human and a dog.


> we just shouldn’t do it.

I think what Accelerationism gets right is that capitalism is just doing it - autonomizing itself - and that our agency is very limited, especially given the arms race dynamics and the rise of decentralized blockchain infrastructure.

As Nick Land puts it, in his characteristically detached style, in A Quick-and-Dirty Introduction to Accelerationism:

"As blockchains, drone logistics, nanotechnology, quantum computing, computational genomics, and virtual reality flood in, drenched in ever-higher densities of artificial intelligence, accelerationism won't be going anywhere, unless ever deeper into itself. To be rushed by the phenomenon, to the point of terminal institutional paralysis, is the phenomenon. Naturally — which is to say completely inevitably — the human species will define this ultimate terrestrial event as a problem. To see it is already to say: We have to do something. To which accelerationism can only respond: You're finally saying that now? Perhaps we ought to get started? In its colder variants, which are those that win out, it tends to laugh." [0]

[0] https://retrochronic.com/#a-quick-and-dirty-introduction-to-...


It doesn't do anyone any good to stress over non-existent things. ASI is a sci-fi trope, pure fantasy in the context of the present day. AGI does not exist either, and AFAIK there's not even any agreement on what it means beyond a very vague "no worse than a human".

In other words, I'm sure you're terrified of a modern fairy tale.


"alignment" is a bs term made up to deflect blame from the overpromises the AI companies made to hype up their product to obtain their valuations.


Big take given how much AI companies hate alignment folks.


> will nonetheless make people's lives better

Probably not the lives of translators or graphic designers or music composers. They will have to find new jobs. As LLM prompt engineers, I guess.


Graphic designers I think are safe, at least within organizations that require a cohesive brand strategy. Getting the AI to respect all of the previous art will be a challenge at a certain scale.

Fiverr graphic designers on the other hand…


Getting graphic designers to use the design system that they themselves invented is quite a challenge too, if I'm honest... should we really expect AI to be better than people? Having said that, AI is never going to be adept at knowing how and when to ignore the human in the loop and do the "right" thing.


There are people generating mostly consistent AI porn models using LoRA; the same strategy could be used to bias a model towards consistent output for corporate branding.
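
For what it's worth, the core of LoRA is tiny: freeze the base weights and train a low-rank update on top, so a small adapter can nudge the model towards a consistent style (a character, a brand). A minimal PyTorch sketch of the idea, not any specific image-generation toolchain:

    import torch
    import torch.nn as nn

    class LoRALinear(nn.Module):
        def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
            super().__init__()
            self.base = base
            for p in self.base.parameters():
                p.requires_grad = False           # base weights stay frozen
            self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
            self.B = nn.Parameter(torch.zeros(base.out_features, rank))
            self.scale = alpha / rank

        def forward(self, x):
            # frozen base(x) plus the trainable low-rank update x A^T B^T
            return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale

    layer = LoRALinear(nn.Linear(512, 512))
    y = layer(torch.randn(2, 512))                # only A and B get trained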

Even if it's not perfect, many startups will be using AI to generate their branding for their first 5 years and put others out of a job.

Right now the tools are primitive, but leave it to the internet to pioneer the way with porn...


absolutely a solvable problem even with no tech advances


I feel like this was GPT-5, eventually renamed because it couldn't keep up with expectations.


> OpenAI knowing how to build AGI and similar outlandish claims.

The fact that the scaling of pretrained models is hitting a wall doesn't invalidate any of those claims. Everyone in the industry is now shifting towards reasoning models (a.k.a. chain of thought, a.k.a. inference-time reasoning, etc.) because that axis keeps scaling beyond what pretraining can deliver.

Sam said the phrase you refer to [1] in January, when OpenAI had already released o1 and was preparing to release o3.

[1] https://blog.samaltman.com/reflections


This seems very dramatic given that OpenAI still has the best model in the world, `o3`.


The best model in the world is still basically a very stubborn, yet mediocre 16 year old with a memory the size of the internet.


> will nonetheless make people's lives better

While I mostly agree with your assessment, I am still not convinced of this part. Right now, it may be making our lives marginally better. But once the enshittification starts to set in, I think it has the potential to make things a lot worse.

E.g. I think the advertising industry will just love the idea of product placements and whatnot in AI assistant conversations.


*good*. The answer to this is legislation: legally, stop allowing shitty ads everywhere, all the time. I hope the problems we already have are exacerbated by the ease of generating content with LLMs, so that people actually have to think for themselves again.


Honestly, I'm not sure how you can make all those claims when:

1. OpenAI still has the most capable model in o3

2. We've seen some huge increases in capability in 2024, some shocking

3. We're only 3 months into 2025

4. Blackwell hasn't been used to train a model yet


> Not much consolation to the world's super rich who will lose tons of money once the LLM industry (let us remember that AI is not LLM) falls.

They knew the deal:

“it would be wise to view any investment in OpenAI Global, LLC in the spirit of a donation” and “it may be difficult to know what role money will play in a post-[artificial general intelligence] world.”


It's always been a combination of data and scale (garbage data at massive scale still gives garbage). Data is continually getting better, though, so we'll still be able to squeeze a lot out of transformers yet.


lol this isn’t a reasoning model, those are doing very well, but cute essay you wrote there



