For instance, understanding language requires some of the capabilities of a SAT solver. This was something everybody believed in 1972, but it is denied today.
Fundamentally, "understanding" problems require the ability to consider multiple alternative interpretations of a situation, and often to choose one or to work with the incomplete knowledge you have.
Back in the 1970s we had intellectually honest people like Hubert Dreyfus writing books like "What Computers Can't Do" that described many specific ways the architectures of the time fell short. People working on GPT-3 are working in a way that is academically valid (able to produce results that are meaningful to a community), but from an engineering standpoint it is like building a bridge with one end, or a tall tower that carries no load.
GPT-3 has a structural mismatch with the domain it works in. Unlike early medical diagnosis systems like MYCIN, it is never a doctor; it just plays one on TV, and it does the "passing for neurotypical" act terrifyingly well.
The secret of GPT-3 is that people want to believe in it. Somebody will have it generate 100 text snippets and show you the three best. Your mind makes up meaning to fill in for its mindlessness. When this was going on with ELIZA in 1965, people quickly understood that ELIZA was hijacking our instinct to make meaning.
For some reason people don't seem to have that insight today, and it bothers me that I can't say why. Back in the 1980s there was a lot of fear that compressing medical images could lead to a wrong diagnosis. Today you see articles in the press that are completely unquestioning about the assumption that a neural network trained to hallucinate healthy and cancerous tissues will always hallucinate the right thing when you are looking at a patient.
> People working on GPT-3 are working in a way that is academically valid (able to produce results that are meaningful to a community), but from an engineering standpoint it is like building a bridge with one end, or a tall tower that carries no load.
To me it seemed like the opposite. They are essentially working without any hypothesis of how their model actually works, without any model of the way it actually learns or the way it produces the results that it does, and instead placing blind trust in various metrics that are improving.
They are treating this as an engineering problem - how can we make the best human-sounding text generator - and not like a traditional research problem. GPT-3 has not taught us anything about anything except "how to generate text that seems human-like to humans". We have no firm definition of what that means, we have no idea of why it works, we have no idea of any systematic failures in its model, we know next to nothing about it, other than its results on some metrics.
Imagine the same applied to physics: if, instead of inventing QM, Relativity, or Mechanics, physicists had gotten it into their heads to feed raw data into a black box and see how well it predicts some observed movements.
In fact, this would be a pretty interesting experiment: how large would a deep learning model have to be to accurately predict what mechanics predicts, given only raw data (object positions, velocities, masses, colors, surface roughness, shape, taste, etc.)? Unfortunately, I don't think anyone has been interested in this type of experiment, because it is not useful from an engineering (or profit) perspective.
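A rough sketch of the kind of experiment I mean (the task, network sizes, and hyperparameters are all just illustrative guesses): generate trajectories from classical mechanics, pad the inputs with irrelevant features, and see how small a network can be while still matching the analytic prediction, then compare its parameter count with the handful of symbols in the formula.

```python
# Illustrative sketch: fit projectile range from raw-ish features and ask how many
# parameters the network needs, versus the one-line formula v0^2 * sin(2*theta) / g.
import numpy as np
import torch
import torch.nn as nn

rng = np.random.default_rng(0)
N = 5000
v0 = rng.uniform(1, 50, N)              # initial speed
theta = rng.uniform(0.1, 1.4, N)        # launch angle in radians
junk = rng.normal(size=(N, 4))          # stand-ins for color, roughness, shape, taste
g = 9.81
target = v0**2 * np.sin(2 * theta) / g  # what classical mechanics predicts

X = torch.tensor(np.column_stack([v0, theta, junk]), dtype=torch.float32)
y = torch.tensor(target, dtype=torch.float32).unsqueeze(1)

def make_model(width):
    return nn.Sequential(nn.Linear(6, width), nn.ReLU(),
                         nn.Linear(width, width), nn.ReLU(),
                         nn.Linear(width, 1))

for width in (4, 16, 64):
    model = make_model(width)
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(2000):
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(X), y)
        loss.backward()
        opt.step()
    n_params = sum(p.numel() for p in model.parameters())
    print(f"width={width:3d}  params={n_params:6d}  final MSE={loss.item():.2f}")
# The interesting number is how big `width` must get before the error is small,
# compared with the three symbols (v0, theta, g) the analytic description needs.
```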
> Imagine the same applied to physics: if, instead of inventing QM, Relativity, or Mechanics, physicists had gotten it into their heads to feed raw data into a black box and see how well it predicts some observed movements.
> In fact, this would be a pretty interesting experiment: how large would a deep learning model have to be to accurately predict what mechanics predicts, given only raw data (object positions, velocities, masses, colors, surface roughness, shape, taste, etc.)? Unfortunately, I don't think anyone has been interested in this type of experiment, because it is not useful from an engineering (or profit) perspective.
Isn't that pretty much what Google's AlphaFold is doing?
And it seems GPT-3 formed concepts relating words together without being asked; it's not picking the next best word strictly as a matter of statistical probability. So why wouldn't that apply to physics simulations, chemistry, etc.?
Feed it chemical formulas and balancing equations from old chem 101 textbooks, and it will fill in the blanks and start teaching itself how those things relate just by being corrected enough; then you can see if it has any predictive value.
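One way to make "see if it has any predictive value" concrete (this is just my own framing, not a tested pipeline or a real GPT-3 call): turn textbook equations into fill-in-the-blank prompts whose answers can be checked exactly, and score a model's completions on equations held out from the ones it was corrected on.

```python
# Hypothetical cloze-style scoring for chemistry prompts; the template and checker
# are made up for illustration, and no actual language model is called here.
prompt = (
    "Balance the equation by filling in the blank coefficient:\n"
    "2 H2 + O2 -> ___ H2O\n"
    "Answer:"
)

def check_answer(model_output: str) -> bool:
    # The balanced equation is 2 H2 + O2 -> 2 H2O, so the blank must be 2.
    return model_output.strip().startswith("2")

# Whatever model produces the completion, predictive value would be measured as
# check_answer() accuracy over a held-out set of such equations.
print(check_answer(" 2"))  # True for a correct completion
```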
I think both of your points are solving different problems than what I was suggesting.
My point is that an interesting scientific question is: "is the huge size of the GPT-3 model intrinsic to the problem of NLP, or is it an artifact of our current algorithms?"
One way to answer that is to apply the same algorithms and methods to data generated from, let's say, classical mechanics, and compare the size of the resulting model with the size of the classical mechanics description. If the model ends up needing roughly the same number of parameters as classical mechanics, that would be a strong suggestion that NLP may intrinsically require a huge model as well. Otherwise, it would leave open the hope that language understanding can be modeled with fewer parameters than GPT-3 requires.
Your examples are still in this realm of engineering - trying to apply the black box model to see what we can get, instead of studying the model itself to try to understand it and how it maps to the problem it's trying to solve.
I think most people are still aware of the problem. I've seen a few people go off the rails, but most people have been well aware that "GPT-3 is having a conversation with me!" isn't itself particularly interesting.
On the other hand, we should acknowledge that humans are also structurally wrong for most of the domains we work in. A general-purpose neural network isn't a great tool to diagnose cancer, certainly - but it doesn't have to be great to exceed some radiologist's general-purpose light detectors. I think GPT-3 starts to edge into the territory of demonstrating that Dreyfus was substantially wrong, and that recognizably computer-like architectures are fully capable of doing abstract reasoning.
(That's not to knock on Dreyfus! Other voices in his era were optimistic to an absurd degree, and "come on guys our computers aren't that smart" was a very necessary response.)