Geez, people. Please learn to identify hype, but most importantly the causes of hype.
LLMs are fundamentally information processing technology. Not the path to AGI, not an emerging sentience.
The reason "this time" feels so amazing is that the unwashed masses suddenly got access to new information processing technology in a context where their tools had been stagnant for decades.
Not because it was not possible, but because there was no money in it.
To understand my argument and its implications, humor me for a moment and imagine a universe in which everyone was already using Linux computers and, for a decade now, published ML/DL papers had been available for people to use. There would have been various crowdsourced indices and models of all sorts, which people incrementally embedded in their information processing workflows.
In that universe there would be no room for the delirious reaction we have here. It would be an incremental evolution of search, knowledge bases, algorithmic content generation, etc.
What we have experienced instead in these past decades is information tool starvation. These incrementally improving tools, while nothing but known and ultimately mundane algorithms, were not available outside a tiny elite.
In fact, people's information processing capability arguably declined as desktop platforms got downgraded, adtech toxic waste covered the information landscape, etc.
What is happening now is that a socioeconomic and artificially induced scarcity is being broken (for reasons that require serious piecing together of events).
So while on the surface this hype is as distasteful as any illustration of human lemming behaviour, there is an enormous silver lining if we succeed in reading its causes.
These tools are here, have been here for a while, and they can be inserted into our ever-growing information processing toolkit. The risks are there to match the opportunities.
The biggest risk of them all is precisely what has led to the current situation: technology not diffusing normally, but being controlled by gatekeepers.
We have spent the last seventy or more years working on machine translation.
All the way back in the 1950s, big money went into it. Decades of hard work, mathematical models of human language, all manner of study, enormous bilingual corpora of text with phonetic annotation, programmed-in general-knowledge databases, fuzzy reasoning algorithms. The amount of work put into it is quite staggering, in hindsight. I remember the cutting edge in the 1990s: SYSTRAN, for example, could, with some significant human guidance and a limited context domain, translate technical material sometimes usefully.
All of that work has been rendered moot by deep learning. All of it. A machine can, simply with the correct deep learning algorithm and mass exposure to language plus a few bilingual texts, learn an algorithm for translation. It does so automatically: no verb conjugation algorithms, no general knowledge databases, no expert systems with fuzzy reasoning, no parsers, nothing like what a specifically designed old-school translator had.
And yet these deep-learning systems are vastly superior to the old-school architectures, completely supplanting them within a couple of years of their development.
It is the same story in many other areas. Chess, Go? They learn to play chess and Go better than any AI designed specifically to do so. Image classification? Better than the previous 60 years of work on machine vision, and, again, it just falls out of the same approach. Speech recognition? An algorithm to write a bad poem? Well, we now have an algorithm to find an algorithm to write you that bad poem, if you want it.
That's the thing. These are algorithms to solve very tricky problems, and we didn't have to discover, find, or otherwise create the algorithm. The machine did it for us. I am not sure I'm communicating it well, but to me that's probably the most significant advance since the computer. It was understood - theoretically - that this was possible for a long time; but personally at least, I assumed it would forever require more data and compute than could be realized.
> It is the same story in many other areas. Chess, Go? They learn to play chess and go better than any AI designed specifically to do so.
Take GP's "stagnant for decades" with tech and turn it into "stagnant for centuries" with these games. When it first started, I remember professional Go players talking about just how big a shakeup it was.
Even if the core difference is accessibility, that is huge. I see AI in 2023 as a pivotal moment similar to 1977 with personal computers, 1994 for the internet and 2007 for mobile computing. All of these technologies had a history before their pivotal moments but when the time was right things changed fast.
Having easy access to the technology is starting to change the way I think about what is possible for computing. One of the skills that has helped me through my career is understanding the scope of what a computer can do or automate. Most of the time it isn't that the thing cannot be done, but that the cost associated with it is too high. I recently started solving an issue that would have taken hundreds of hours of human intervention, but now I can do it with ten bucks and two hundred lines of Python.
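Purely to illustrate that pattern (not the parent's actual task or script): a minimal sketch, assuming the pre-1.0 openai Python package with OPENAI_API_KEY set in the environment, and an invented "classify a pile of free-text notes" job. Every file name, prompt, and label here is hypothetical.

    # Hypothetical sketch: batch-classify free-text notes with an LLM instead of
    # hundreds of hours of manual triage. Assumes the pre-1.0 `openai` package
    # (openai.ChatCompletion.create) and OPENAI_API_KEY in the environment.
    import csv
    import openai

    PROMPT = (
        "Classify the following customer note into exactly one of: "
        "billing, shipping, defect, other. Reply with the single word only.\n\n{note}"
    )

    def classify(note: str) -> str:
        response = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": PROMPT.format(note=note)}],
            temperature=0,  # keep the labels as deterministic as possible
        )
        return response["choices"][0]["message"]["content"].strip().lower()

    # Read one note per row, write the note plus its label.
    with open("notes.csv") as src, open("labeled.csv", "w", newline="") as dst:
        writer = csv.writer(dst)
        for (note,) in csv.reader(src):
            writer.writerow([note, classify(note)])

The point isn't this particular script; it's that the marginal cost of this kind of fuzzy, previously human-only work has dropped to roughly the size of the API bill.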
2023 is the year where I personally benefit from LLMs, and might spend money on them as casually as others spend on video streaming, but the pivotal moment could still be a year or few away.
Your post boils down to “This is the next iPhone moment”.
The tech existed but was clunky and unavailable. Now it’s good and accessible enough to be used by most people. This is how technological revolutions happen!
Edison didn’t invent the lightbulb. But he made it useful and invented power distribution. That’s the important part. Consumers don’t care about technology that exists in a lab that they cannot even begin to fathom how to use.
Mind you, last year even experts didn't think LLMs would be able to do what they're doing right now for at least another 5 to 6 years. This is also in part a side-effect of the hype and extra attention. All this attention causes an acceleration.
> LLMs are fundamentally information processing technology. Not the path to AGI, not an emerging sentience.
This logic doesn't follow - AGI will be, fundamentally, information processing technology, just as the human brain is, fundamentally, information processing technology.
> "Not the path to AGI, not an emerging sentience".
These are HUGE claims. Define AGI. Define sentience. What is the path and why isn't LLM on it?
Right now those words are only dangerous misinformation - originating from a brain trying to protect its shaking world view. A reminder: you can talk with computers now. Think about that for a moment.
LLMs are mathematical models that, given your question, return a sequence of words based on a very complex probability model. The many examples on the internet show that ChatGPT doesn't understand most concepts. For instance, it doesn't know what a book is; it just knows what humans say about books.
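To make "a sequence of words based on a very complex probability model" concrete, here is a toy sketch of the sampling loop using a hand-written bigram table. A real LLM runs the same loop with a neural network conditioned on the whole preceding context; all the words and probabilities below are invented for the example.

    # Toy illustration of autoregressive generation: repeatedly sample the next
    # word given the previous one. The table and numbers are made up; an actual
    # LLM replaces this dictionary with a learned model over tens of thousands
    # of tokens and the entire preceding context.
    import random

    next_word_probs = {
        "<start>": {"the": 0.6, "a": 0.4},
        "the":     {"book": 0.5, "cat": 0.5},
        "a":       {"book": 0.7, "cat": 0.3},
        "book":    {"glows": 0.2, "<end>": 0.8},
        "cat":     {"sleeps": 0.9, "<end>": 0.1},
        "glows":   {"<end>": 1.0},
        "sleeps":  {"<end>": 1.0},
    }

    def generate(max_len=10):
        word, output = "<start>", []
        for _ in range(max_len):
            options = next_word_probs[word]
            word = random.choices(list(options), weights=list(options.values()))[0]
            if word == "<end>":
                break
            output.append(word)
        return " ".join(output)

    print(generate())  # e.g. "the book glows"

Nothing in that loop "knows" what a book is; whether the same is still true once the table is replaced by trillions of learned weights is exactly what is under dispute here.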
Finally, OpenAI has already stated there is a limit to how far the current AI models can progress. ChatGPT 5 will eventually be released but will be hitting those limits. A paradigm shift is needed to move to the next phase of AI.
There is no point in arguing whether LLMs have minds or are conscious, nor in arguing whether they actually understand any concepts. Nor are LLMs the first tech to be useful to humans; calculators were invented much earlier.
What is interesting is that it's answering questions in a meaningful way that is competitive with humans. It replaces about 50% of the jobs in each field where all the work can be done on a computer (like drawing and writing text), and that number will only grow, until only the most competent people, with lots of training and experience, can be trusted to do better.
And we don’t need GPT-5 to hit the limit - I think OpenAI said that they don’t have GPT-5 because GPT-4 already hit the limits. The next area for improvement would be multimodal, I think.
The typewriter, or the PC, didn't replace the need for secretaries. PAs just have more responsibilities beyond typing letters.
As translation is now cheaper, since it can be partly automated, there is more demand for translation that was previously too expensive.
The problem with ChatGPT is that you need to specify exactly what you want. Writing text and drawing is often a creative process where you will only know at the end what you want.
We are well and truly within the hype curve at the moment.
> LLMs are mathematical models that, given your question, return a sequence of words based on a very complex probability model.
How is this different from how humans respond? Our model could just be some orders of magnitude more complex. Or do you think something _fundamentally_ different is going on in the human brain?
I think that's moot: based on the universal approximation theorem [1], a big enough model is indistinguishable from the human brain, regardless of whether the mechanism of action is fundamentally the same or not. I believe this applies to anything that can somehow be modeled with a continuous function - whether that's possible for the human brain is an open question, though we only need a certain fidelity to be useful.
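For reference, one classical form of that theorem, stated informally: a one-hidden-layer network with enough units can get arbitrarily close to any continuous function on a compact domain. A rough sketch of the statement (the domain [0,1]^n and the symbols are just the textbook setup, not anything specific to LLMs):

    % Universal approximation, informal form: for continuous f on [0,1]^n,
    % a suitable non-polynomial activation \sigma, and any \varepsilon > 0,
    % there exist N, \alpha_i, w_i, b_i such that
    \left| f(x) - \sum_{i=1}^{N} \alpha_i \, \sigma(w_i^{\top} x + b_i) \right| < \varepsilon
    \quad \text{for all } x \in [0,1]^n

Note that the theorem says nothing about how many units N are needed or how to find the weights, which is where the power-budget question below comes in.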
The more useful question is: can the token prediction model scale to the level of human intelligence within a reasonable power budget compared to a brain? It's comparing apples to oranges right now, but the human brain consumes under 20 watts, a tiny fraction of the TDP of a single A100 GPU, and the state of the art isn't even close in performance. We've got a long way to go before we can conclusively answer these questions.
This is not the unchallenged, consensus position, though. A competing position within cognitive science is that intelligence requires embodiment, perception, metacognition, curiosity, etc., and that these factors that allow for the emergence of intelligence are indispensable, more or less.
I won't claim to know which is correct, or even if some other alternative is correct; however, this is not settled at all.
I do think it will some day be possible to simulate all of the embodied cognition above, which may truly render this discussion moot, but that LLMs are not doing that at all.
Seriously. How does a bag of random particles have those things?
It cannot, by our own definition.
Hence the metaphysical problem of 'consciousness' as it relates to our variation of scientific materialism.
I suggest the pragmatic approach is along the lines of what the OP said, aka 'a sufficiently large neural net will be indistinguishable from a human', and that's it. We will see things that we can de facto contemplate as 'curiosity', 'perception', 'meta-cognition' if we want to, especially if we start to develop a more meta understanding of these systems, or not, and that's it.
We'll probably be arguing about 'cognition' long, long after we have variations of AI that kind of seem to be AGI. By many measures we are already kind of there: ChatGPT will probably fool humans most of the time, and that's that.
You know what a book is and can reason about it. As a human you can provide answers that go beyond the knowledge you have ingested.
For example, if you ask ChatGPT if books can be fluorescent it says no. However, as an adult you know someone somewhere has made a book with fluorescent images, as it is a cool thing. You are combining knowledge from two different fields (books + fluorescence) and establishing the likelihood of someone being able to combine them.
GPT-4:
While the term "fluorescent" is typically used to describe substances that can absorb light and then re-emit it, often at a different wavelength, there's no inherent reason why a book couldn't be made with fluorescent properties.
This could be achieved by using fluorescent ink, dyes, or paints on the cover or the pages, or by incorporating fluorescent fibers into the paper itself. When exposed to ultraviolet light (often called "black light"), these materials would glow. This might be used for aesthetic reasons, for practical reasons like aiding reading in low-light conditions, or for interactive elements in children's books or art books.
It's important to note that this is not a common feature for most books, as it would increase production costs and might not be appealing or necessary for all readers or all types of books. Also, long-term exposure to ultraviolet light can be damaging to many types of paper and ink, which could reduce the lifespan of the book.
As of my knowledge cutoff in September 2021, I don't have any specific examples of fluorescent books, but that doesn't mean they don't exist or couldn't be created.
This exactly illustrates the problem with ChatGPT. I asked the same question, phrased slightly differently, and it said no. It's also not unlikely that my question was used to train ChatGPT further.
Does it, really? I'm pretty sure if you asked some people, there would be some that'd answer "no" too.
Besides, you're shifting the goalposts as you go, altering your arguments as it suits your view. The parent comment just disproved your entire point about how LLMs cannot combine different fields - and now you just pick a different angle altogether. Maybe take a step back and reconsider your opinions?
Yet, this doesn't answer my question. The human brain obviously has so much more in it (a visual cortex for starters, grid cells, etc.) and, as a neural network, a much more sophisticated architecture. But still, there is a big probability that what we call "knowledge" and "reasoning" and "consciousness" is just a result of this sophisticated architecture. I.e. there is no special magical thing for "reasoning" that next generations of prediction models can't replicate.
There is a fabulous book by Jeff Hawkins, "On Intelligence" (2004), that explores this. I think its main premise still holds true: the brain is "just" a highly sophisticated hierarchical tissue whose main job is to extract patterns and make predictions. Fundamentally, it doesn't seem very different from what LLMs are.
> LLMs are mathematical models that, given your question, return a sequence of words based on a very complex probability model.
I don't think that this is true. LLMs can be modeled with mathematical objects, yes. Functions, sets, numbers, relations, all are very powerful, and can model anything. They can model LLMs, as they can model 'you' for example. Are you a function from the states of the universe to itself?
Why not? I believe we are this close: >< to an AGI, and ChatGPT will turn into an AGI soon.
It probably should rely on its neurons less and use symbolic reasoning more. That, and it should learn and grow: either maintain a fact database, retrain its neurons, or add more neurons for new facts.
The problem with calling it hype is the broad stroke that's painted over all discussions of its abilities simply by using that word, when it's not all hype. A key part of the definition of hype is the exaggeration of importance and benefits. There are absolutely some out there exaggerating ChatGPT's abilities, but what are those exaggerations, and what are actual abilities that anybody with an OpenAI account can go and verify (true or not) for themselves?
It's an exaggeration to say that GPT-4 can solve all programming problems, but it's easily verifiable that, for example, it's able to solve easy-level leetcode problems, and what's impressive on top of that is that it's able to give an answer to those problems practically before I've finished even reading the problem. That's not hype or a claim by me that you have to take on faith; anybody can copy and paste a problem description and get an answer. It's fair to point out that it's so-so at medium problems and downright unable to do hard problems, but, well, most humans can't solve hard-level problems.
So what's hype, and what's not? It seems to think/reason/understand better than a human child. It's better than some humans with its command of the English language. It has broad knowledge of many subjects beyond what most individual humans know about. It sometimes has poor recall of that knowledge, and some of it is wrong, but it accepts corrections (even if it's unable to commit those corrections back to long-term memory) with far more grace than most humans.