Both Yann LeCun and Richard Sutton (author of "The Bitter Lesson", the essay arguing that scaling general-purpose methods yields outsized returns) have already pointed out that LLMs are a dead end.
All the AI industry is doing is scaling computation and data in the hope that the result encompasses more of the existing real-world data and thus gives the illusion of thinking. You don't know whether a correct answer comes from reasoning or from parroting previously seen answer data. I always tell laypeople to think of LLMs as very, very large dictionaries: with the words in a pocket Oxford dictionary you can construct only so many sentences, whereas with a multi-volume set of large Oxford dictionaries you can construct orders of magnitude more sentences, so the probability of finding your specific answer sentence is much, much higher. Then they can understand the scaling issue, realize its limits, and see why this approach can never lead to AGI.
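The dictionary analogy above is really a point about combinatorics: the number of possible word sequences grows exponentially with vocabulary size. A toy sketch (the vocabulary sizes are made-up illustrative numbers, not real dictionary counts):

```python
# Toy illustration: possible sequences of a given length grow as vocab ** length.
pocket_vocab = 30_000   # assumed size of a pocket dictionary's vocabulary
full_vocab = 600_000    # assumed size of a multi-volume set's vocabulary
length = 10             # sentences of ten words

ratio = (full_vocab ** length) / (pocket_vocab ** length)
print(f"{ratio:.1e}")  # the larger vocabulary covers ~1e13 times more sequences
```

A 20x bigger vocabulary yields 20^10, roughly ten trillion, times more ten-word sequences, which is why scaling the "dictionary" makes it far more likely your exact answer is somewhere in it.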
I'm not at all pro-AI, but the argument was that LLMs _alone_ are a dead end for reaching AGI. I'm pretty sure we're generations away from AGI, but if we manage to build it, an LLM would probably be the tool it would use to communicate.
>If we manage to build it, an LLM would probably be the tool it would use to communicate
But that's the thing - LLMs are just a probabilistic playback of what people have written in the past. They're not actually useful for communicating new information or thought, should we ever find a way to synthesize those things with real AI. They're essentially a search engine over existing human knowledge.
It's a translation tool, and it's great at that. Basically it vectorizes words in a multidimensional space (similar words share dimensions), if I understand correctly, so LLMs can 'distinguish' homonyms, find synonyms and antonyms easily, and translate extremely well.
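The intuition above can be sketched with cosine similarity over word vectors. The 3-d vectors below are made-up toy numbers purely for illustration; real embedding models learn hundreds of dimensions from data:

```python
import math

# Toy "embeddings" (fabricated values): synonyms get nearby directions,
# while the two senses of "bank" are kept apart, standing in for how a
# model might represent a disambiguated homonym in context.
vectors = {
    "happy":      [0.90, 0.10, 0.00],
    "joyful":     [0.85, 0.15, 0.05],
    "bank_river": [0.10, 0.90, 0.00],
    "bank_money": [0.10, 0.00, 0.90],
}

def cosine(a, b):
    # Cosine similarity: 1.0 means same direction, ~0 means unrelated.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(cosine(vectors["happy"], vectors["joyful"]))        # close to 1: synonyms
print(cosine(vectors["bank_river"], vectors["bank_money"]))  # near 0: distinct senses
```

Finding synonyms is then just a nearest-neighbor search under this similarity, which is why the geometric view makes tasks like translation and disambiguation tractable.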