What does Alan Kay think about LLMs?

ozten · on March 19, 2024

> That humans also do this all the time is “interesting”, “dangerous” etc., but it is also why trying to move from superstition (this is actually what “reasoning by correlation” amounts to) to more scientific methods is critical for anything like civilization to be created.

"reasoning by correlation" as superstition is a brutal insight.

ein0p · on March 19, 2024

It’s great that he recognizes that most people aren’t to be trusted either. Most people outside the field are talking about “safety” as though humans never lie or make up bullshit, whereas in reality I trust GPT4 much more than I’d trust an average human, just not 100%. Literally people keep hyperventilating about “fake news” generation even though their entire news consumption already consists solely of pre approved narratives, many of which are fake, by omission or otherwise.

I can’t wait for the day when ultra long context LLMs are able to point out omissions and factual contradictions in human produced fake news for any given event

ribosometronome · on March 19, 2024

Are context windows not long enough to do this for most news articles and forum posts nowadays? I'd've thought the issue would be finding the relevant and reliable sources of truth for comparison would be the difficult part.

ein0p · on March 19, 2024

That is difficult, yes. The way the game is played is a handful of large outlets promote themselves as “trusted” and then drown out everything else as “untrustworthy”. But the fly in the ointment for them is that even these large/sophisticated opinion manipulators (they shouldn’t be called news sources at this point) are inconsistent both over time and sometimes also in the moment. Today’s public short attention span prevents it from seeing such inconsistencies or from paying attention to the broader picture. But guess what today’s models easily outperform humans on? The ability to pay attention to more than a small handful of things for longer than a few minutes.

eska · on March 25, 2024

The comparison is not between an LLM and an average human but an LLM and an expert on a subject.

kergonath · on March 19, 2024

> "reasoning by correlation" as superstition is a brutal insight.

I don’t think he is right on that one, though. Reasoning by correlation is a kind of empiricism, and it can be tested. “These things happen together” or “if I do this, then that happens” can be disproven with statistical analysis even without any understanding of the underlying mechanisms at play their causes.

Superstition is beyond that; it is a belief that cannot be proven and that is not based on logic. The problem with superstition is not that there is no causal link between the supposed cause and the effect, it’s that there is not even a correlation.

atoav · on March 19, 2024

The problem here is that correlation is not a hard term for the people doing drawing the lines. People who take homeopathics for example will assure you they always get healthy when they took the sugar pills. What they fail to see is that they would have gotten healthy without it as well.

If you always rub your lucky penny before playing a sports game a superstitious person would account wins for the penny and excuse losses by not having rubbed the penny enough, the right way, not concentrated enough. This also feeds into correlation.

We are not talking about things actually being correlated, we are talking about people believing they are. Those are two wildly different things one is physical reality and doesn't change when you ignore it, the other is bound to individuals and tbeir world view and can take on absolutely ridiculous forms.

Phiwise_ · on March 19, 2024

A road trip by random walk is also a kind of empiricism that can be tested. This does not imply it is worth the energy to actually do.

sfpotter · on March 19, 2024

A road trip by random walk is a valuable experience, the kind most people on HN lack.

Phiwise_ · on March 19, 2024

Call me crazy, but I don't think the goal of AI should be to give the rock with electricity in it "a valuable experience".

sfpotter · on March 19, 2024

What are you even talking about?

Phiwise_ · on March 20, 2024

The right way to build AI. You know, the topic of this thread. I've been wondering what you're talking about, though.

halayli · on March 19, 2024

Superstition is assuming a causation when you identify a correlation.

brookst · on March 19, 2024

Often followed by applying confirmation bias to see correlations when there are none.

bsaul · on March 19, 2024

i think this is the part of superstition. Reproductive, effective, correlation is the start of a scientific path to true discoveries. Superstition only begins when the correlation has never truly never been observed, but only repeated from generations to generations without any other kind of justification other than "that's what i heard".

vinnyvichy · on March 21, 2024

There is also, fashionable on some message boards: assuming a lack of causation after only seeing the title of a report of correlation..

(One may also link fashion/accusations of fashionability etc etc to superstition..)

kergonath · on March 21, 2024

Yes, some people confuse contrarianism for cynicism and both for wisdom.

n4r9 · on March 19, 2024

There are plenty of superstitions that are perfectly falsifiable. People continue to believe in them for comfort. For example, wearing odd socks to a sports match helps your team win. Or, leaving a fan on overnight might kill you.

alexvoda · on March 19, 2024

I think the author and you are using different meanings for "reasoning".

We do know that many people even in professional fields confuse correlation with causation. And even when that doesn't happen, when only doing measurments, correlation is often considered good enough without giving any thought to the underlying mechanism. This may be the root of Goodhart's law stating that when a measure becomes a target it stops being a good measure. The measure was only correlated with the behaviour the person doing the measuring thought was measuring.

Superstition is similar, but drops any shred of statistical rigour, relying on mere anecdotal evidence.

kurthr · on March 19, 2024

It seems like the other part of this is the ability of LLMs to rationalize even answers that are not rational (he says piling BS on BS). That has previously been a very human (and quite distracting/confusing) activity and it's very effective on most people most of the time.

discreteevent · on March 19, 2024

Here's an excellent recent comment along the same lines (in response to whether our intelligence is statistically based like LLMs):

> mjburgess 9 days ago

Suppose you touch a fireplace once, do you touch it again? No.

OK, here's something much stranger. Suppose you see your friend touch the fireplace, he recoils in pain. Do you touch it? No.

Hmm... whence statistics? There is no frequency association here, in either case. And in the second, even no experience of the fireplace.

The entire history of science is supposed to be about the failure of statistics to produce explanations. It is a great sin that we have allowed pseudosciences to flourish in which this lesson isnt even understood; and worse, to allow statistical showmen with their magic lanterns to preach on the scientific method. To a point where it seems, almost, science as an ideal has been completely lost.

The entire point was to throw away entirely our reliance on frequency and association -- this is ancient superstition. And instead, to explain the world by necessary mechanisms born of causal properties which interact in complex ways that can never uniquely reveal themselves by direct measurement.

https://news.ycombinator.com/item?id=39661963

codethatwerks · on March 19, 2024

Causation is just correlation, and a good story. :-)

skadamat · on March 19, 2024

This answer is amazing and classically Alan Kay! There's so much here to unpack because of all the different areas Alan draws from in his work (he's like a computing philosopher).

All I will say is that for people who want to understand his perspective, there's a large epistemological load to overcome. Sampling his talks is a good starting point though: https://tinlizzie.org/IA/index.php/Talks_by_Alan_Kay

layer8 · on March 19, 2024

He always criticizes how everyone got computing wrong (fair enough), but never offers anything but the most vague suggestions of how to do things differently.

joshmarinacci · on March 19, 2024

His entire 60 year career is a suggestion of how to do things differently. ;)

https://en.wikipedia.org/wiki/Alan_Kay

vinnyvichy · on March 21, 2024

Eventually, after distancing themselves from their curmudgeonly creator, his ideas land in the mainstream. I'm convinced, eg, that what's loved about python has been elucidated by AK at some point, but maybe it's even good marketing not to attribute anything to him ;)

Osmose · on March 19, 2024

He has generally great insights about principles to aim for, which is part of the answer, but no, he does not have the entire answer. His insights are inspiration for others who contribute their own parts to potential answers—that's how open collaboration works.

fmap · on March 19, 2024

Others have already mentioned this, but Alan Kay spent his entire career experimenting with approaches to radically simplify computing. The progress reports from vpri give a good overview of a recent project that has since wrapped up, e.g.: https://tinlizzie.org/VPRIPapers/tr2012001_steps.pdf

YeGoblynQueenne · on March 19, 2024

>> Others have already mentioned this, but Alan Kay spent his entire career experimenting with approaches to radically simplify computing.

Yes. And yet, he created OOP [1]. Strange.

________________

[1] Not on his own.

jdlshore · on March 19, 2024

You’re saying this like it’s a “gotcha,” which makes me believe you’re confusing OOP-as-practiced-in-enterprise with OOP-as-invented-by-Kay.

The sad truth is that most great ideas are doomed to drown in a sea of mediocrity and misinterpretation.

YeGoblynQueenne · on March 19, 2024

I'm not confusing anything. OOP is a horribly OOT over-engineering of a problem looking for a solution, from the get go. That it morphed and twisted into a bloated mess when it was taken up by people who didn't get it to begin with is just the inevitable result of starting with a big pile of no need to do all that in the first place.

Phiwise_ · on March 20, 2024

[Citation Needed]

sebastianconcpt · on March 19, 2024

He did tho. He mentioned more than once that OOP was more about message passing than behavior + state.

layer8 · on March 19, 2024

Yes, that’s what I mean by vague suggestions. How do you actually build and maintain a system of any complexity with that? How do you ensure it will do what it is supposed to do? And Smalltalk wasn’t that, it is mostly just regular method calls and not “message passing”.

chongli · on March 19, 2024

He’s not just a proponent of message passing, he’s a proponent of late binding [1]. The idea there is to have a running image of the environment and interact with it in real time, updating code while it’s running (no recompiling or anything like that). The idea is a high level of interactivity with very tight feedback loops.

A whole operating system like this would allow you to hack on the user interface and change things on the fly. All of your software would also work like this, promoting open extensibility by the user. You ought to be able to click on any window or other user interface element and be able to view and modify the running code live, without even restarting the computation it’s running (never mind the whole program or even rebooting the computer).

This sort of ability to do live hacking on the internals of what you’re working with is how computers used to work in those early days. It’s also how machinery has always worked in the past. A mechanic could lock in the timing of an engine by rotating the distributor cap and listening to how smoothly it fires.

[1] https://en.wikipedia.org/wiki/Late_binding

bombcar · on March 19, 2024

The closest I’ve seen to this is programming in Excel - each step is visible to you and you can tweak it in the cell itself and see what it does.

yourapostasy · on March 19, 2024

I wonder if all of that was in an operating system as described but backed by Nix so we accomplish the NixOS-style reproducibility and configuration time travel, what the developer experience and user experience would be like. Instead of binary blob images, we'd have Nix declaratives and flakes we dissect at will, without fear of wedging the system with the live modifications we make.

Our hardware is sufficiently capable enough these days that I'm curious if we can do this to conventional Smalltalk and Common Lisp system/machine designs to re-imagine them, and bring back a level of tight feedback looped developer experience that has gone underground in the mainstream.

yourapostasy · on March 19, 2024

I wonder if all of that was in an operating system as described but backed by Nix so we accomplish the NixOS-style reproducibility and configuration time travel, what the developer experience and user experience would be like. Instead of binary blob images, we'd have Nix declaratives and flakes we dissect at will, without fear of wedging the system with the live modifications we make.

satellite2 · on March 19, 2024

A browser kind of fit the requirements.

Phiwise_ · on March 19, 2024

Kay himself does regularly describe the Arpanet as one of the inspurations for OO, and the modern interbet browser as an incomplete implementation of the same ideas: https://youtu.be/1e8VZlPBx_0

devinprater · on March 20, 2024

Holy crap that'd be amazing for accessibility! Open a menu and look at what the button does, label that button with the screen reader, pass around that label to others, and boom, labeled button.

projektfu · on March 19, 2024

I disagree, not because methods do not get called, but because the Smalltalk sender is not calling a method, it is sending a message. It says to the runtime, here is an object and I want to send "#collect:" to it using this other object obj as a parameter. The runtime looks in the object's class for a method called "collect:" and, if found, calls the method with the receiver and obj. If not found, it will look in the superclasses until it cannot find it, then it will send "#doesNotUnderstand:" with the context, and go up the class hierarchy again. Because of the message-passing paradigm, the receiver can make behaviors that are unusual, such as proxying and delegation, in a more or less transparent fashion.

If the sender is calling a method, then that is a static arrangement and the result is not different from a function call.

vanviegen · on March 19, 2024

How does what you're describing differ from polymorphism, or from doing OOP in loosely typed languages like javascript?

fzzzy · on March 20, 2024

Every single object in smalltalk is free to have its own implementation of handling a message. In javascript, the runtime decides for the object how to locate the code and how to call it.

Phiwise_ · on March 19, 2024

Does a full free operating system, all the way down to the metal, and hypermedia publishing suite in 936kB count as "a system of any complexity" that "will do what it is supposed to do" to you? If not, what specifically is your standard? https://youtu.be/BUud1gcbS9k

Xelynega · on March 19, 2024

Erlang(and more modernly elixir/Gleam) are an example of this in practice, and were created to make it easier to ensure the working and manage the complexity of large systems.

layer8 · on March 19, 2024

Fair enough. But Alan Kay doesn’t say “use Erlang”, nor does he go into any depths about what the desirable feature set is and what would make it work or would fail to make it work. He’s always just waxing around nebulous desiderata and complaining about the status quo.

Phiwise_ · on March 19, 2024

>But Alan Kay doesn’t say “use Erlang”, nor does he go into any depths about what the desirable feature set is and what would make it work or would fail to make it work.

Oh yeah? Then how do you explain this lecture where he says explicitly that Erlang should be the modern programmer's assembly?

https://youtu.be/fhOHn9TClXY

In my experience, people who criticize Kay for being too vague or kvetching too pointlessly or lacking practical experience in computer science today almost always have not bothered to familiarize themselves with the vast majority of his work before deciding their opinion on it. It's hard to disagree with him that this makes us more like a pop culture than a profession after seeing it.

alexvoda · on March 19, 2024

You hit the nail head on. The field of IT is a pop culture and consequently just as prone to fads and periodic fashions.

Phiwise_ · on March 19, 2024

If only. Unix would be dead and buried many years ago if it was just a fad. It's far more like the old repeating fertility rituals of many paganistic societies, where your neighbors burn your pagus down to cleanse the village of the spite of the undead grain god if you plowed your heath in a different direction from everyone else.

sebastianconcpt · on March 19, 2024

Just adding here that the understanding of the forces at play in a cultural war makes what you just said even more clear and for a vast spectrum of fields way beyond IT.

jhbadger · on March 19, 2024

It's somewhat of a backhanded compliment though -- calling Erlang a "modern assembly" is like the quip that C is a "portable assembler" -- that is, a tool useful for low-level stuff but which you should mostly use to implement higher level languages on top of.

Phiwise_ · on March 20, 2024

I don't agree with this in the context of the full video. I'd encourage you to watch it; it's quite good. It's clear Armstrong and Kay have a great deal of respect for each other.

pests · on March 19, 2024

> He’s always just waxing around nebulous desiderata and complaining about the status quo.

How is inventing Smalltalk and putting his ideas into reality just "waxing around" or "complaining"?

MisterTea · on March 19, 2024

Maybe because there is no one/right/simple answer. You really have to think for yourself and seek out answers instead of expecting them. You're doing that now but with the mistaken impression you were to be handed them.

asdajksah2123 · on March 19, 2024

If he said "use Erlang" then that eliminates the actual message he's trying to convey, which is message passing is good.

I bet if he simply said "use Erlang", 99% of headlines and discussion would be "Alan Kay said Erlang is the greatest language evah!"

I do appreciate that he respects his audience, ie me, enough that he thinks we can read "message passing is good" and go from there to choosing a message passing language that is suitable for our needs, or even using a non message passing language due to other factors, but recognizing that OOP is about message passing and not state encapsulation, which would impact how I code even in Java.

joshmarinacci · on March 19, 2024

Smalltalk is definitely message passing. There are many ways to intercept the messages and implement different responses. This one mechanism enables: network aware code, object proxies, sub-classing, on-the-fly code generation, and lots of other behavior. Traditional methods were invented later as less robust version of Smalltalk's message passing.

hintymad · on March 19, 2024

SmallTalk? Even though the language is mainstream, its many ideas, including implementation and design style using message passing, are embedded in many languages and software packages.

the_af · on March 19, 2024

> Yes, that’s what I mean by vague suggestions. How do you actually build and maintain a system of any complexity with that?

Reminds me of the RESTful guy, Roy Fielding (though I respect and like Alan Kay way more). If you've read any of his online interactions, apparently nobody does REST like he envisioned. It simply doesn't exist in the wild. When asked about some REST API, he'll claim it's not RESTful, it's wrong, that's not what he meant, etc. "But what would be right?", you ask, and he'll reply "Read my paper! HATEOAS! (Hypermedia as the engine of application state)". "Ok, but what does it mean in practice, how do I go about building an API with HATEOAS?"...

...and there's no answer for that besides "read my dissertation". He can only tell you what you're doing wrong, but nobody has been able to build a "true" RESTful API to Roy Fielding's satisfaction.

cxr · on March 19, 2024

> the RESTful guy, Roy Fielding

I think you're confused about who Fielding is and what REST is about. His writing isn't about building "true" RESTful APIs. It isn't really about APIs per se at all, RESTful or not. (Further, Fielding to my knowledge isn't even known for his use of the word "RESTful".) It sounds like you've probably been misled by a bunch of people who aren't Fielding about what's in his dissertation. Not unreasonable in those circumstances that one be told to actually read the thing instead of guessing at the gestalt of it based on ambient chatter which is by and large very misinformed.

Having said that, Fielding's writing is not something I would exalt for its clarity. I'm partial to jcrites's explanatory powers: "REST describes how the Web works" <https://news.ycombinator.com/item?id=23672561>

fzzzy · on March 19, 2024

It's incredibly easy to build hypermedia as the engine of application state sites. You generate html pages with state in them, and hyperlinks and forms that filter and change the state.

Hateoas didn't invent anything. He literally just described how web 1.0 worked.

carapace · on March 19, 2024

You are impressively wrong. He just wrapped up an incredible research program that demonstrated many concrete implementations of great computing ideas.

> After 16 years of continuous research and important contributions toward its mission - "Improve 'powerful ideas education' for the world's children and to advance the state of systems research and personal computing" - Viewpoints Research Institute concluded its operations at the beginning of 2018.

http://www.vpri.org/

layer8 · on March 19, 2024

And what exactly is the output, other than talks and some programming learning tools for children? I haven't seen any significant new insights or advances.

carapace · on March 20, 2024

> And what exactly is the output, other than talks and some programming learning tools for children?

The papers and demos are the output. They were a research outfit, not a startup. OP claimed that Kay "never offers anything but the most vague suggestions" when he (et. al.) provided several very concrete working demos. It's hardly his fault if no one took him up on them, is it? (FWIW I suspect some of their work had an effect on MS Word & Excel UI but I don't know for sure.)

> I haven't seen any significant new insights or advances.

Where have you looked? Did you read the papers that VPRI published? OMeta has been mentioned else-thread, I like that Nile programming language, the COLA system seems neat.

theGnuMe · on March 19, 2024

True message passing is one of the key ideas behind smalltalk which would improve computer security considerably. This is analogous to the design of the internet that he references. The internet scales well. So imagine building entire computing systems this way.

vajrabum · on March 19, 2024

If smalltalk is late binding and message passing, I think it's interesting that modern large scale webapps (say Netflix or Google) do just that just not inside process and server boundaries. I'd guess that Alan Kay would think that's a bastardization, and there are loads of complications but it's a definite improvement over the large php ball of mud monolith you might have seen at say Box or Facebook 10-15 years ago. Because the 'object' boundaries are so coarse you can also better leverage the few people inside an org who know how to design, scale, improve robustness and secure systems. If architecture is the metaphor, it's still pretty much piling mud bricks but at least there are corbeled arches, streets, city walls, public buildings and a rudimentary notion of style.

As for LLMs I think that long form typeahead is the right way to think about it.

fzzzy · on March 19, 2024

I don't think he would think it is a bastardization. HTTP is 100% message passing. You have an object reference (url) and you pass a message to it with POST. State and behavior is encapsulated. In fact, one of the desireable properties of message passing is that it is easy to distribute across process and machine boundaries.

layer8 · on March 19, 2024

How would it improve it, though? Dynamically-typed languages tend to introduce gradual typing after a while to increase reliability and predictability, and the end game of that (static typing) is the antithesis of late binding. Message passing only improves security by adding a validation layer, but managed languages (JVM, CLR, etc.) effectively do the same.

> The internet scales well. So imagine building entire computing systems this way.

That’s very similar to microservices, and look how messy these tend to get. The internet scales well because it’s a network with flexible capacity and attached processing power, and because its clients are either human users who can deal with failure modes, or software with stable and well-specified point-to-point protocols.

theGnuMe · on March 19, 2024

It’d be a big improvement. Rather than call a linked library function you’d send a message to whatever it was you needed the library function for.. this reduces the buffer overflow exploit potential and what not. That alone is a huge win.

detourdog · on March 19, 2024

I think trustability is pretty specific and without a foundation of trust the rest of the effort”s value is unknowable.

layer8 · on March 19, 2024

That’s “what we would like to have” (so mostly just a criticism that we don’t have it), but without much of a suggestion of how to actually achieve it in practice.

detourdog · on March 19, 2024

The current industry leaders of the effort want to proclaim that it is impossible to know where the answers come from. If that is the industry norm then I understand how achieving trustability could be mysterious.

My line of thinking is that without an audit trail of thoughts there won't be any trustability. I'm unable to describe in detail what any one specific thing leads to trustability. I can say that being able to demonstrate where answers come from would work for me.

agumonkey · on March 19, 2024

one real experiment he contributed to was VPRI ometa and the os built on top of that (every part of the OS built on top of an eDSL)

mempko · on March 19, 2024

What? He literally built systems that did things differently. Look at smalltalk.

agumonkey · on March 19, 2024

I think I get what parent means, Kay is mostly seen in conference mode, with simplified slides and big ideas, it's frustrating if you never saw who he got taught or worked with.

KorematsuFredt · on March 19, 2024

I wish someone could provide a tl;dr of that answer to us simpletons. His writing style is something I am unable to comprehend.

agomez314 · on March 19, 2024

wow this list is super useful, thanks for sharing!

devjab · on March 19, 2024

I use LLMs quite a lot to help me in my work, but they are wrong so often that it’s ridiculous. This isn’t a major issue when you’re an expert using the tools to be more efficient, because you’ll spot and laugh at the errors it makes. Sometimes it’ll be things that anyone would notice, like how a LLM will simply “invent” a library function that has never existed. Even if you’re not an expert, you’re not going to get that PNP-whatever function to work in Powershell if it never existed in the module to begin with.

Where it becomes more dangerous, at least in my opinion, is when the LLM only gets it sort of wrong. Maybe the answer it gives you is old, maybe it’s inefficient, maybe it’s insecure or a range of other things, and if you’re new to programming, you’re probably not going to notice. Hell, I’ve reviewed code from senior programmers that pulled in deprecated things with massive security vulnerabilities and never noticed because they were too focused on fast delivery and “it worked”. I can’t imagine how that would work out for people trying to actually learn things.

I’m not sure what we can really do about it though. I work a side gig as an external examiner for CS students. A lot of the curriculum being taught (at least here in Denmark) are things I’ve seen the industry move “beyond” in the previous 20 years. Some of it is so dated that it really makes no sense at all. Which isn’t exactly a great alternative to the LLMs, and it’s only natural that a lot of people simply turn to these powerful tools.

I tend to tell people to ask their favorite LLM to help them solve a crossword. When you ask it to give you words ending on “ing” it’ll give you words that don’t end on “ing” because of how the tokens used work. This tends to be an eye opener for people in regards to how much they trust their LLM. At least until they get refined enough that they can also do these things.

Anyway, it’s a good answer.

sumtechguy · on March 19, 2024

Yesterday I asked it the date. Then asked it if it was possible to give me the date. It gave a very different answer that contradicted what it just did. It is kind of hard to trust when it can not even make itself consistent between 2 prompts. For creative prompts this thing is very cool. For logical prompts it is frustrating to no end how much detail it gets wrong.

bongodongobob · on March 19, 2024

I'm sorry, but it sounds like you're using ChatGPT3.5. I get very very few invented libraries unless it's a very specific version of an old or rare language. I tried your "ING" example and GPT4 got 100/100.

Idk, ymmv, but GPT4 beats the pants off of 3.5.

samatman · on March 19, 2024

I use GPT4 and it confabulates constantly. Try it on a language in the 10th-20th range of popularity (let alone lower) and I expect you'll see it as well. I don't see that as often with Python, the only top-5 language I currently work with.

smartmic · on March 19, 2024

I can confirm as well. I tried something with GNU APL ... and the answers were purely ridiculous. It fabulated some totally wrong stuff based on Dyalog APL with invented libraries and more.

Okay, I didn't expect usable results here but everyone should be aware... when even non-mainstream technologies are being discriminated, what about the human spectrum?

bongodongobob · on March 19, 2024

Right, that's my experience as well. Python, C, C++, bash, PowerShell, etc, are all pretty much spot on 95/% of the time. It will start to fumble if I specify PS 4.0, or Python 2, C89, etc. Or use X language but don't use popular library Y.

devjab · on March 20, 2024

In my experience it depends a lot more on what you prompt it with rather than the language itself. As soon as you task it with something where your prompt doesn’t hit the right marks, I’ll go into its “inventive” mode and simply make stuff up. Copilot for enterprise is a little better in terms of not making things up, but the flip side of this is that it often simply provides a very, very, useless result.

My experience with it for Powershell hasn’t been as impressive as yours. It simply invented functions for the PnP module. Which I suppose isn’t in too much use these days, especially not against on-prem SharePoint. I’m not an expert on Powershell, however, and I really don’t think the documentation on things like PnP search queries is very good. So this was an area where I got to experience what it’s like to use these tools when you’re not the expert. In the end it was trial and error along with various shitty internet articles that helped me.

That being said. You can easily have GPT4 solve your CS exam questions, and likely whatever technical interview you’re given if you’re applying at jobs which do these things. And as such, I guess you could also argue that a lot of what we teach in CS is now even more dated than before. Because even if some of it is sort of useful for basic understanding, I’d bet money on students “cheating” their way through where they can. Because why wouldn’t they use the “calculator”?

slaymaker1907 · on March 19, 2024

> If we look at human anthropology, we see a species that treats solitary confinement and banishment from society as punishments — we are a society because we can cooperate and trust a little — but when are safely back in society, we start competing like mad (and cheating like mad) as though the society is there to be strip-mined.

I really like this quote. We simultaneously value trust and community, yet so many people also treat it as just another resource to turn into money and power. Alan Kay is a real gem.

mempko · on March 19, 2024

I see a lot of people here are missing his "big deal" which he talks about in the end where he references the "Spaceship Earth" problem.

What I believe he is getting at is people are going to use LLMs to build systems at scale to further strip mine society.

The "Spaceship Earth" problem is a reference to Limits to Growth. For those who haven't read "Limits to Growth", and the more recent Re-calibration of Limits to Growth, I implore you to do so.

https://onlinelibrary.wiley.com/doi/full/10.1111/jiec.13442

lucianbr · on March 19, 2024

Most tech seems to be used, at least amongst other things, to further strip mine society. It's the extreme of wealth accumulation, and there's always some people wanting to do that. Microsoft uses the idea of personal computers to get a monopoly on OSes, and get as much rent with that. Google and Amazon use the internet. Apple the iPhone with the walled garden. OPEC uses oil to extract as much wealth as possible. Toyota and VW would also get a monopoly and then raise prices it if they could. Why do you think VW bought so many other car makers? Boeing now that it is too big to fail, focuses on making more money, and less safety and excellence in engineering. Everyone does it. Well, every big business.

It's a bad thing for sure, but doesn't seem specific to LLMs, and a solution would not be specific either.

mempko · on March 19, 2024

It's not specific to LLMs, but the point is LLMs will make things more efficient and therefore make 'a bigger straw' to drink the milkshake.

semi-extrinsic · on March 19, 2024

Why will LLMs make things more efficient?

As I see it, LLMs have the potential to either reduce the amount of "bullshit jobs/tasks", reduce the amount of time programmers spend on boilerplate, etc.

But those inefficiencies are 95% human made, on purpose. Bullshit tasks are made up because middle managers want more underlings. Languages/frameworks with lots of boilerplate exist because companies would rather hire fifty average programmers rather than ten brilliant ones.

Even if LLMs have the potential to make these things more efficient, the people holding the money bags don't want the inefficiencies removed.

Consider influencers. What they post on social media could 100% be replaced by the outputs of LLMs and diffusion models. It would be vastly more efficient for companies to advertise their products by creating text and images about a pretty person using their products in an exotic location, than to pay for airplane tickets and hotels and salary for an influencer. Yet we don't see even a hint of the "influencer revenue crisis" that this would cause.

fzzzy · on March 20, 2024

The people holding the money bags who don't want the inefficiencies removed, will have their money bags removed by people who do.

Not instantly and not perfectly though. Inefficiency can be very resistant to change as you are pointing out.

mempko · on March 20, 2024

You should read Bullshit Jobs by David Graeber. Goes over why capitalism creates all these bullshit jobs despite theory saying otherwise. Have you considered that LLMs may increase the efficiency of bullshitization?

lucianbr · on March 20, 2024

Is that something that didn't apply to the internet when it was invented? This is just another way of saying "LLMs are an unprecedented technological leap", which is repeated ad nauseam these days, and personally I a not convinced. Again, this is orthogonal to the "strip mine society" concern.

titzer · on March 19, 2024

LLMs seem like a new and dangerous tool to scale the bullshit-generation machine to unimaginable levels. Pretty soon it will be impossible to trust any digital information, as every image, recording, video, document, etc can be instantly altered, adjusted, or completely fabricated by ungodly powerful generative AIs.

lucianbr · on March 20, 2024

Must every other comment be "LLMs will change everything"? No, we will not "soon have ungodly powerful generative AIs". At least, it's not likely.

When we landed on the moon, people thought we would soon have Mars bases or even colonize other solar systems, yet here we are, 60 years later. It's naive to think every technology goes strictly on an exponential upward curve. Reality isn't that simple or repetitive, really.

And, no, it's not "different this time". Every time people think it is different this time, and it never is. We don't have flying cars, hypersonic jets, molecular assemblers, AGI. We won't have them tomorrow either, just because Sam Altman says so.

titzer · on March 20, 2024

Bah, the only prediction I made is more bullshit on the Internet and I 100% standby this prediction.

LightFog · on March 19, 2024

This stood out to me too - the technology helping us move toward a ‘greed singularity’.

dimal · on March 19, 2024

Off topic, but I didn't realize Alan Kay was regularly answering questions on Quora. I can ask "What does Alan Kay think about X?" and get an answer from Alan Kay?!?

YeGoblynQueenne · on March 19, 2024

Well, you can try. And if he doesn't answer you can always ask an LLM :|

stana · on March 19, 2024

Can we trust it though that is what Alan Kay would say?

swyx · on March 19, 2024

when you're alan kay you can literally just title things in the third person and it'd get read haha

dpflan · on March 19, 2024

There is a lot in here, various paths to venture off, but the bottom line seems to be trust is important when running commands on a machine, and LLMs are not trustable. What else?

altruios · on March 19, 2024

what would be required for trusting an LLM?

1: 100% transparency. Open Source code, fully (and correctly) attributed training data.

2: A predictable model of what these models are actually encoding (so that hypothetical new models (or modifications) can be reasoned about).

falsaberN1 · on March 19, 2024

Transparency won't help a lot from a technical standpoint (seems more like a solution to a legal issue than a technical one). I can't trust LLMs because they just...recombine text by probabilities and aren't deterministic. I get incorrect information every time I ask them a thing, and it's incorrect in different ways every time. The only things they seem to get consistently correct are very widespread facts that are much faster and trivial to google.

I still find it funny we managed to get image generation working so much better than text.

chongli · on March 19, 2024

I still find it funny we managed to get image generation working so much better than text.

If you care about veracity then image generation works about as well as text. Frequently you can find details of the image that are just bizarrely wrong, such as hands or food or other basic things. It's the same basic problem: there's no intelligence behind what it's doing, it just regurgitates mostly realistic-seeming pixels that are pretty good at fooling the casual viewer.

Really, it's like those moths with eyespots on them: good at fooling the brain's heuristics but obviously not real.

falsaberN1 · on March 20, 2024

Try looking at things from my angle: A few errors in an image can be not much of a big deal (with modern tools, the mistakes are within human margin of error on average), but errors in delivery of textual data such as facts, dates or code can be far more severe and subtle. There are ways to work around or reduce the shortcomings of image generation, and the quirks you mention have drop-in solutions for local installs, but you can't quite fix wrong facts in text automatically, or the context going off-rails. It can be much harder to catch than a hand missing a finger, too.

It's also worth mentioning you can run a heavily customized Stable Diffusion setup at home with fairly modest hardware with satisfactory results if you know what you are doing, but anything you can run at home for LLMs in the same hardware is dog slow and actually kind of terrible.

gamacodre · on March 20, 2024

I think this is still just a difference in how the output is used. You're presenting text generation as factual and image generation as artistic. It could be reversed - no one will care if a fantasy story gets some in-milieu "facts" wrong, but a blueprint or architectural reference coming out of Stable Diffusion could ruin someone's year.

ToucanLoucan · on March 19, 2024

So much of the odd almost cultish community around LLMs seems to just be people who really want to be at the ground floor of the Next Big Thing who are so wildly biased into this being that next big thing that they will spend all their time, all their energy, not just on other people but on themselves, convincing themselves over and over that their LLM girlfriend really does love them, that their LLM assistant is going to be the next DaVinci, that their generated art is so much better than anything else.

More than anything it makes me sad. ANY amount of critical thinking would tell you all of this is not true, which isn't to say there's NO USE AT ALL for this technology, it certainly exists and has it's applications, and I also do more or less believe someday we'll create digital intelligence, but at the same time... ChatGPT is not that. DALL-E is not that. These systems are interesting and they have uses but they are not emergent intelligence, they don't know anything, they just assemble words from massive probability matrices and then the people who read those words ascribe meaning to them that is far, far beyond what originated them.

In this way it's not so dissimilar from any garden variety religion, it's just religion for people who think they're too smart to fall into the trap of motivated reasoning and magical thinking.

RodgerTheGreat · on March 19, 2024

I think it's under-appreciated how much LLMs harvest the natural, human tendency to generously ascribe meaning, subtext, and intent to text they read, glossing over flaws and small mistakes so long as the overall "thrust" seems reasonable enough.

In a sense, LLMs have reinvented cold-reading from first-principles and created the cleverest Hans of them all.

adamrezich · on March 19, 2024

We really should have figured out anthropomorphization and how to overcome it before we got to the "AI age".

RodgerTheGreat · on March 19, 2024

One of the worst-case scenarios I can presently imagine for LLMs is a world so drowned in semi-plausible nonsense that new generations teach themselves to simply ignore the printed word, subconsciously categorizing it as noise, and in so doing essentially un-invent literacy.

fzzzy · on March 20, 2024

Literacy isn't required if a computer can read any text. That is a huge amount of the brain that could be applied to something else.

ToucanLoucan · on March 20, 2024

I genuinely haven't in some time read an opinion so horrifying as this. Well done.

RodgerTheGreat · on March 20, 2024

Do you think it would have the same psychological impact if you had to ask your cell phone to read it to you?

ToucanLoucan · on March 21, 2024

If nothing else the irony would be painful.

wizzwizz4 · on March 19, 2024

Then, maybe, we can re-invent libraries. (https://xkcd.com/1909/)

pixl97 · on March 19, 2024

Heh, I don't think you've had enough philosophy under your belt to realize how incredibly far we are from figuring that out...

Or to say, "It's Plato's cave all the way down"

Me1000 · on March 19, 2024

You seem to be making a point in good faith so I would like to give you a slightly different perspective.

I'm not entirely sure what you mean by "cultish community", from my perspective there are a few distinct communities around LLMs, all focusing on different aspects, all excited about different things.

One common theme across all the groups though is that they used an LLM for the first time and their mind ran wild with the possibilities. That first moment when the LLM does something better than you expected, or even completely unexpected. I think most people understand that their imagination might be overactive in that moment. But it's a rare feeling to be surprised by a new technology (at least for me) these days.

On the other end, we have social media platforms where being a pessimistic curmudgeon ends up getting the likes and shares. And it's just easier to be a pessimistic curmudgeon; the vast majority of ideas never work as well in the real world as they do in your head. I'm just as guilty of this as anyone else. But the real problem is that it puts us into tribes. As someone who is very excited about what LLMs are going to bring to our futures, when I see someone post on Mastodon or HN, or wherever, I become defensive and my monkey brain feels the urge to push back. In particular because I think the criticisms generally voiced are not well reasoned or thought out. Your own post has a tone of dismissal, painting a lot of people, all of whom excited about different things, as a cult who is obsessed with their LLM girlfriend. I would agree that anyone today trying to draw some deeper meaning from the outputs of these systems are probably worthy of dismissal, but I don't think that's the vast vast majority of people who are excited about LLMs. And it makes _me_ sad that the extremists are the ones that get to suck all the oxygen out of the conversation.

We're in the beginning days of this new technology. LLMs are good at doing things traditional software isn't, and bad at doing a lot of things computers are traditionally good at. Natural language answer engines and sex bots might have been some of the first obvious applications of LLMs, but I'm willing to bet there are a lot more undiscovered use cases out there. Simon Willison has some great advice, which is for newcomers to try to break the LLM as quickly as they can, get it to lie to you, or do something wrong. Test its limits. That's part of the process! We're going to need some time to figure it all out and make these systems work well for us. I'm a technologist, and exploring this technology is exciting.

ToucanLoucan · on March 19, 2024

I mean I think the biggest problem with the LLM community is that a sizable portion of not a majority of it consists of the exact same opportunistic "ground floor" people as just jumped off of cryptocurrency as the "thing they refuse to shut up about" which is all well and good when it's an exciting new thing that might change the world, but gets notably more insufferable when it's obviously just the newest route that person sees to getting greater amounts of money and/or social media clout, and they clearly do not know anything about it beyond it's potential to do those things. The colloquial term for that, I think, being "grifter." And to be clear, LLMs are not unique in their ability to draw in those types of insufferable people: see the aforementioned comment about crypto, it was the last big one, and before that probably dropshipping. But I digress.

And I bring all that up to say: no, the vast majority of people, I don't think expect the computer to come to life and tell them it loves them. I think the vast majority, in fact, don't know a fucking thing about LLMs beyond maybe the rote copy/pasted code it takes to bring one into existence, or if we're being honest, more likely, the websites to put their credit card information into to gain access to one for their use, and that shift in base assumptions I think explains why they speak so incoherently: they do not understand it in any depth, and thereofre, they might think the computer will come to life, because they don't know much about computers in general and to a layman, what an LLM does can indeed look like a vague imitation of life.

Like, I mean this in the nicest way possible though just by virtue of what I'm going to say, it is going to sound mean, but: tons of the really big pro-AI hype people just, clearly, bluntly, full disclosure, do not know shit about LLM. Quite a large slice of that pie also don't know shit about technology in general, or seemingly, much of anything beyond a business degree? But irrespective of that, to the wider world who aren't in this and don't participate in the groups at hand... those are your representatives, by default. The attention economy has produced them and you have my most sincere sympathies for that.

floren · on March 19, 2024

"You must have been using GPT 3.5..."

brookst · on March 19, 2024

s/LLMs/people

thesz · on March 19, 2024

I think I have to know what chain of reasoning is behind this or that fact and/or deduction. i would like to be able to verify that.

For example, the proof of the absence of solution in SAT should be accompanied with the easily verifiable chain of reasoning. This shows the absence of incorrect deductions and missing assignments. Another example is autovectorization in contemporary compilers, they can show you why parts of your loops are not eligible for vectoriztion.

All LM's can do is to show me that these parts of those inputs are important for that output, but nothing else. Thus, they cannot be trusted even for minimally critical tasks.

pixl97 · on March 19, 2024

Even you can't do that yourself. You can only mad post ad-hoc justifications for the choices you made.

thesz · on March 19, 2024

To illustrate: https://arxiv.org/abs/2401.05566

One can teach LLM to write good code now and bad code in future (when everyone lower their guards). And no one can prove who and how made LLM to do that. Also, no one can prove formally the absence of such a plant.

thesz · on March 19, 2024

What cannot I do myself? Look at the output of gcc -ftree-vectorizer-verbose=2? Run the verifier after picosat?

I did both.

utopcell · on March 19, 2024

Open sourcing would only guard against human programmers. The fact of the matter is, we don't _really_ know why LLMs work.

markrages · on March 19, 2024

Even with these in place, the output of the LLM is not trustworthy. In the sense that it doesn't care what it true, only what is plausible.

tjr · on March 19, 2024

I would add, repeatable / reproducible results. Given the same input, you get the same output every time. If the input includes some sort of random seed for the express purpose of getting different output intentionally, then so be it, but I can't trust a program that I never know what it will do in response to what I tell it.

fzzzy · on March 20, 2024

Isn't the gist that you should be able to trust your teacher?

keybored · on March 19, 2024

What I like about programming (real programming) is that it is dumb and obtuse. You can see why things happen. Because the instruction languages are painfully literal. They will throw their hands up if you omit a step. Or crash.

You can understand it.

That’s why I dread the AI future.

svieira · on March 19, 2024

> By “help” I mean that — especially when changes in epistemological points of view from one’s own common sense are required , it can make a huge difference to be near a “special human” whose personality is strong enough to make us rethink what we think we know.

This is how I measure Ed-Tech companies. Do they have an awareness that you cannot replace the connection with other human beings that is an essential part of teaching with "facts" or not? If "yes, they have that awareness" how do they mitigate the problem?

ianbicking · on March 19, 2024

This feels like a limited and perhaps naive perspective on LLMs. If you looked at computers as adding machines in the 60s/70s then you'd be missing most of what was interesting about computers. And if you look at LLMs as a question answering service now, you are also missing a lot.

It's hard to compare trust of LLMs to other computing, because many of the things that LLMs get wrong and right were previously intractable. You could ask a search engine, but it's certainly no more trustworthy than an LLM, gameable in its own way. The closest might be a knowledge graph or database, which can formally represent some portion of what an LLM represents.

To be fair the relational systems can and will give "no answer" when an LLM (like a search engine) always gives some answer. Certainly an issue!

But this is all in the realm of coming up with answers in a closed system, hardly the only way LLMs can be used. LLMs can also come up with questions, for instance creating queries for a database. Are these trustworthy? Not entirely, but the closest alternative supportive tool is perhaps some query builder...? I have seen expert humans come up with untrustworthy queries as well... misinterpretation of data is easy and common.

That's just one example of how an LLM can be used. If you use an LLM for something that you can directly compare to a non-LLM system, such as speech recognition or intent parsing, it's clear that the LLM is more trustworthy. It can and does do real error correction! That is, you can get higher quality data out of an LLM than you put in. This is not unheard of in computing, but it is uncommon. Internet networking, which Kay refers to, might be an analog... creating reliable connections on top of unreliable connections.

What we don't have right now is systematic approaches to computing with LLMs.

alexvoda · on March 19, 2024

There was a time when a search engine would return no results if it didn't find anything instead of returning increasingly irrelevant results.

It was a conscious decision by corporations to implement the dropping of constraints and search terms when too few results would have been returned. Today the search operators are a joke.

pixl97 · on March 19, 2024

“All models are wrong, some are useful.”

What's kind of funny is almost all the people that complain about accuracy of LLMs would gladly answer the question of "What's the chance for rain today" without giving giving you a 5 minute lecture on what forecasts actually mean.

soperj · on March 19, 2024

"I'm unable to provide real-time weather updates as my data is not current. To find out the chances of rain today, you can check your local weather forecast through a reliable weather website or app, or tune in to a local news station for the most up-to-date information." - chatgtp

brookst · on March 19, 2024

"Seattle is known for its rainy weather, especially during the spring months. According to historical data, March typically sees an average of 17 rainy days in Seattle. Given that we're in the middle of March, there's a fairly high likelihood of rain on any given day.

Without access to real-time weather data, I'd estimate that there's approximately a 50-60% chance of rain in Seattle on March 20th, based on historical averages. However, for the most accurate forecast, I'd recommend checking a reliable weather website or app closer to the date." -- chatgpt

(prompted with the current date, location, an admonition to estimate, and a promise that I understand it will likely be wrong)

andrewflnr · on March 19, 2024

Yeah because most everyone you talk to knows how to treat weather forecasts, but LLMs are still widely misunderstood. The right idea seems to be getting out, but in no small part due to the "complaining" you mock.

r00fus · on March 19, 2024

> If you use an LLM for something that you can directly compare to a non-LLM system, such as speech recognition or intent parsing, it's clear that the LLM is more trustworthy

Except for the hallucination problem, sure. How do you ensure the answer you get from an LLM is not a hallucination? It sure doesn't.

pixl97 · on March 19, 2024

How do you ensure any bits of data? You going to fist fight your uncle at the next family reunion when he promises you pigs can actually fly?

There are plenty of bits of data that can be hard references to datasets or actual physics to assign their probability value. And there are other things that can't.

r00fus · on March 19, 2024

Completely inefficient to chain logic through LLMs if they aren't baseline reliable.

ianbicking · on March 19, 2024

"How do you ensure the answer you get from an LLM is not a hallucination?"

Intent parsing and speech recognition both frequently return inaccurate responses. It is true that with an LLM it can return something that is sensible but invented, where other systems typically return less inventive wrong answers.

With any of these systems you want to balance the probability of an incorrect interpretation against the impact of the action being taken, and get confirmation based on that. That's pretty normal engineering and UX.

LAC-Tech · on March 19, 2024

Typical Alan Kay. Very long, very thought provoking, full of historical references, and barely answered the question at all. :)

r00fus · on March 19, 2024

He did give a TL;DR about halfway through his answer: "Finally, to try to answer the question … (Summary: I don’t think it would be a good idea at all)"

aaroninsf · on March 19, 2024

Always a welcome and thoughtful opinion, but not correct about the potential for LLM. Waving away their output as "BS" is awfully flip, and IMO is hence one more example of Ximm's Law (that every critique of AI assumes to some degree that contemporary implementations will not, or cannot, be improved upon.)

One might find ambiguity in his criticism though, that LLM alone are insufficient... but, that's what Ximm's Law is saying. It's not very interesting to (as I would say, he does here) take on straw, rather than steel.

(A steely defense of LLM is to say that no one is particularly interested in scaling LLM without other improvements, though scaling alone provides improvements; multi-modal, multi-language, long-context, and most of all augmented systems which integrate LLM into systems rather than making them "systems on a chop", are where things look and IMO will be interesting.)

PCMPSTR · on March 19, 2024

This part of his answer:

> A key part of their design was to not allow direct sending of commands — only bits could be sent. This means that (other) software inside each physical computer has the responsibility to interpret the bits, and the power to do (or not do) some action

seems, at least on a basic reading, to contradict this famous little argument (or maybe trolling?) he had on HN with Rich Hickey where he seems to be suggesting that one shouldn't just send raw bits, but also a little interpreter along with the data: https://news.ycombinator.com/item?id=11945722

Maybe this is an inevitable consequence of always speaking so abstractly/vaguely, but it also makes it difficult to know what exactly he's suggesting the industry, that he is so routinely critical of, should concretely do next.

discreteevent · on March 19, 2024

The main point he was making in that argument is that data is nothing without an interpreter. Sending an interpreter with the days is secondary. Rich hickey was saying that data is meaningful on its own. But that makes no sense.

amelius · on March 19, 2024

LLMs are horribly broken.

But we somehow like the idea of putting everything behind an API and then call it a solved problem.

sebastianconcpt · on March 19, 2024

It is by definition that the silicon automatons cannot escape from their ontologically stochastic imitative hallucinations.

All they can do is be observed by us and with that expose us to get induced by association, ontologically human hallucinations and deal with its outcome of real consequences.

sebastianconcpt · on March 19, 2024

Questioning trustfulness to ontological hallucinatory automatons. Nailed it!

better_sh · on March 19, 2024

but what does Ja Rule think?