Language models can generate novel, functioning protein structures that adhere to a specified purpose. Structures that didn't exist before, let alone in the training set. https://www.nature.com/articles/s41587-022-01618-2
Sorry but the idea that they can't generate new things is blatantly false.
This isn't generating new theories. I'm not saying LLMs aren't useful, or that they aren't extremely impressive in many ways.
Almost nothing GPT-4 produces was in the training set. I'm not saying they can't produce new sentences.
From the linked paper: "a language model that can generate protein sequences with a predictable function across large protein families, akin to generating grammatically and semantically correct natural language sentences"
Nobody is disputing that LLMs can generate grammatically and semantically correct sentences, and from the sounds of things this task is analogous to generating sentences. This isn't the same as generating new theories.
If it can take natural-language purpose descriptions and generate new, functioning proteins, then through its training it evidently understood the relationship between purpose and protein structure. Meaning all the little models it builds as a result of training are concrete: they can produce new things and new information, and it's just a matter of scale and intelligence.
"akin to generating grammatically and semantically correct natural language sentences" is just an indicator of how it is synthesizing and producing this information. To say it is analogous to generating sentences in the abstract is like saying generating new theorems is analogous to generating sentences. Well yeah, that's what the language model is doing. The power lies in what it is generating. What they are telling you is that they skipped the folding process entirely and just let the model figure out function to structure directly.
Either your dog fetches what you throw for it or it doesn't. The idea of "pretend fetching" is unbelievably silly, doubly so when no one seems able to actually define the act. A distinction that can't be tested for is not a distinction. Science is concerned with results, not vague and ill-defined assertions.
Imagine an LLM trained on the internet as it existed before LLMs were invented. It knows everything in the world: every programming concept, every philosophical and logical concept. It is an "expert" on the then-current state of the art in machine learning.
Would it be able to invent itself?
I'm not moving the goalposts here. Language models are extremely impressive. I think they will change the world. I fear they will displace many jobs.
Maybe some day they will be able to invent themselves. Maybe that will come sooner than we think. As it stands, they can't. This is the distinction that you say doesn't exist.
First of all, you've just imagined a scenario, invented a conclusion you can't prove, and justified your distinction with that invented conclusion. Your distinction is no less arbitrary than when this conversation started.
Next, you're conflating different things. How many humans are capable of inventing anything at all? Have you invented anything in your field of expertise? Should I then question everything you do understand about your field if you haven't? Do you think understanding is binary?
Ok, I feel this conversation is not going anywhere. If I see a single example of an LLM generating new theories, I will immediately change my mind. I have yet to see one.
Most humans are not capable of inventing things, you're right, but humans in aggregate are. We are comparing LLMs to "humans" here, not to "a human". There are no examples of LLMs doing this. Maybe in a couple of years, who knows.