The simplest answer, respecting the article's own assumption, is: they are giants of AI and they are not getting it wrong.
An immediate corollary of which is: the article's author is the one getting it wrong. Eyeballing the attempts to show that the "giants of AI" are wrong by poking GPT-4, I'd say there's a good chance of that.
An even simpler answer, violating the assumption, is: they're not giants of AI, and that's why they're getting it wrong. Or is that a simpler answer? Well, now we have to explain how a guy with a machine-learning PhD writing on Substack is getting it right when LeCun, Brooks, and so on are getting it wrong.
Chomsky is not an AI expert, so there's that: the author doesn't quite know who he's talking about.
Strongly disagree. I've seen towering giants in their fields who can still develop a "mental block", if you will, when their settled notions about how their technology or science should work don't pan out. That's why you often see scientific breakthroughs among younger scientists (and yes, while I realize the "average age at Nobel breakthrough" has been rising over the past century, it still skews relatively young). For example, Brooks in particular spent his career around the idea that intelligence requires "real-world interaction", so it's not surprising to see him pooh-pooh the idea that something lacking that experience could become intelligent.
I would be really interested in examples where LeCun, Brooks, Chomsky, or any of the others have argued their points in relation to specific examples like the kinds that are pointed out in this article. I fully admit that I don't have a complete catalog of their Twitter posts and interviews, so I genuinely would love to see this. I remain highly skeptical, though, when I only see them argue in generalities, e.g. "LLMs can't be smart because they don't have a model of explanations of how things occur." Fine, but then how do they explain the quite frankly amazing capabilities of some of these systems if they're "just doing autocomplete"? For example, saw an absolutely mind-blowing example where someone taught ChatGPT an entirely new made up language, and ChatGPT was able to guess at, for example, the appropriate declensions for different words. Would love to see Chomsky's reaction to that one.
> For example, saw an absolutely mind-blowing example where someone taught ChatGPT an entirely new made up language, and ChatGPT was able to guess at, for example, the appropriate declensions for different words. Would love to see Chomsky's reaction to that one.
Chomsky is one of the major experts of the 20th century in language structure, as well as a founder of cognitive science. I'd say he is capable of having a fairly good idea of what the language models may and may not be doing, even if he doesn't know how the computer is processing the statistics needed to do it.
The man has his points, and his takedown of behaviorism is still excellent, but no one has found any trace of universal grammar. And while GPTs have a crude world model at best, you can't fault their understanding of syntax, developed without any prior model at all. On this matter I would not have too much faith in Chomsky's analysis. And he is regardless just not a "big name in AI"! He never even claimed to be!
Universal grammar in the way that Chomsky suggested, a sort of structure all humans must have in their brain, is not a thing. But I believe there is something like the "universality of grammar", where the human vision system or motor system can have a grammatical quality to them, in the sense that there is ambiguity in which tree structures represent the world. I've seen this idea in many places, but I'm not sure if anyone has put it into (better) writing.
Universal grammar in the way that Chomsky suggested, a sort of structure all humans must have in their brain, is a thing. But it's more of a meta-grammar than a grammar, i.e. a way to build the grammar of the particular language the child is exposed to, because all languages' grammars have similarities, including those of sign languages.
So sayeth the Chomskyites. But the idea has been less and less broadly accepted over time. Some of my professors were still avowed fans, but most were willing to endorse at best extremely weakened versions.
Now UG, after vast cross-comparison of every language, is a tiny framework of seemingly random grammatical facts. It's not nothing! But we would expect some commonalities to hold regardless, just by chance. Considering how many times we've had to trim our model, how confident can we be that now there's actually a baseline? We're out of languages to test it on!
And there is another reason to doubt. We only have some 20,000 genes. Most of them code for structural stuff in the body, essential proteins and such. Explanations that require things to be hardwired into the brain ought to be accordingly penalized. Even stripped down to its bare bones, UG is quite a bit of information!
And why exactly would evolution hardcode this choose-your-own-grammar into the brain anyway? Until recently we could suppose that language would be difficult or impossible to figure out without it. But big, dumb machine learning models trained on nothing but prediction worked out syntax just fine! Why would our brains, superb prediction engines of even larger size and with cross-modal cognition, be less capable?
Of course, humans learn language with far less input than GPT. But we learn everything with far less input than current models. And regardless of whether there is actually a universal grammar, we'd expect linguistic drift to tend towards structures which are readily understood by our brain, an advantage not shared by LLMs.
>> And why exactly would evolution hardcode this choose your own grammar into the brain anyway?
Shouldn't the question rather be phrased as "why exactly would evolution not get rid of grammar in the brain"? To my understanding, language ability developed randomly and was retained because of its benefits. I'm not an expert, but I have a feeling that's how evolution works: instead of developing the traits that an organism needs, traits are developed at random and the ones that turn out to be useful are kept.
>> But big dumb machine learning models trained on nothing but prediction worked out syntax just fine!
Well, yes and no. Unlike human children, language models are trained specifically on text. Not just any text, but text carefully tokenised to maximise the ability of a neural net to learn to model it. What's more, Transformer architectures trained to model text are specifically created to handle text, and they're given an objective function specially designed for the text-modelling task, to boot.
None of that is true for humans. As children, we have to distinguish language (not text!) from everything else in our environment, and we certainly don't have any magical tokenisers, or Oracles selecting labelled examples and objective functions for us to optimise. We have to figure out all of language, even the fact that such a thing exists at all, entirely on our own. So far, we haven't managed to do anything like that with any sort of machine (in the abstract sense), and certainly not with language models.
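To make the "tokeniser plus a prediction objective" point concrete, here's a minimal sketch of my own (not from the article, and nothing like a real LLM's code): the crudest possible tokeniser and the crudest possible next-token predictor, a bigram counter. Real systems use subword tokenisers and Transformer networks with a cross-entropy loss, but the shape of the task, "given the previous tokens, predict the next one", is the same.

```python
from collections import defaultdict

def tokenise(text):
    # Crude whitespace tokeniser; real systems use subword schemes like BPE.
    return text.lower().split()

def train_bigram(corpus):
    # The "model" is just conditional counts: which token follows which.
    counts = defaultdict(lambda: defaultdict(int))
    for sentence in corpus:
        tokens = tokenise(sentence)
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts

def predict_next(counts, token):
    # Greedy decoding: return the most frequently observed next token,
    # or None for a token never seen in training.
    followers = counts.get(token)
    if not followers:
        return None
    return max(followers, key=followers.get)

corpus = ["the cat sat on the mat", "the cat sat by the door"]
model = train_bigram(corpus)
print(predict_next(model, "the"))  # "cat" (seen twice, vs. "mat"/"door" once each)
```

Everything the model "knows" here comes from text that was already carved into convenient units for it, which is exactly the asymmetry with children the comment above describes.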
I think Arthur C Clarke has a pithy quote that challenges your argument from authority. "When a distinguished but elderly scientist states that something is possible, he is almost certainly right. When he states that something is impossible, he is very probably wrong."
To be clear, it's not me who's bringing up the authority of anyone, it's the article's author who does: he's the one calling people "giants of AI".
Which he clearly does to give his own article authority, that it wouldn't have otherwise. He wouldn't write a whole Substack piece to criticise something some random, anonymous user of FB, HN or Twitter wrote. He has to comment on the "giants of AI", otherwise his piece is basically irrelevant, just another voice adding more noise on top of the constant cacophony of opinions on the net.
The same way, when Geoff Hinton resigned from Google, a whole bunch of articles in the lay press (The Guardian, NYT, others I don't remember but scanned briefly) introduced him as "The Godfather of AI". A clear ploy to make the article interesting to people who have no idea who Geoff Hinton is, or what he's done for "AI" to be called its "godfather" *.
It would be much more straightforward for the article's author to criticise the opinions he disagrees with by saying something like "Yann LeCun says so and so. I disagree because this and that." There is no need to call anyone a giant of anything to make a big, splashy impression on your reader. If your opinion has weight, it can make a splash all by itself. If it doesn't, tying it to a "giant" will only make it sink faster.
HN has a similar guideline, in fact:
>> When disagreeing, please reply to the argument instead of calling names. "That is idiotic; 1 + 1 is 2, not 3" can be shortened to "1 + 1 is 2, not 3."
Basically, an appeal to authority is really the flip side of an ad-hominem: it tries to shift attention to the person, and away from what they said.
__________________
* It's because he, Bengio and LeCun are the Deep Learning Mafia.
He's not a writer, he's a professor of linguistics. He almost singlehandedly pushed the field towards a more technical analysis and formalisation of grammar, suitable for automatic language parsing and translation.
If he isn't retired already, it's only out of personal choice, and not because he's desperate for a job.