Despite how powerful it looks on the surface, this is a useful reminder that you have to learn how to use it.
Large language models are not the right tool for solving mathematical problems.
They are surprisingly good at writing code though... provided you give them the right prompt, and you understand that there are no guarantees that they'll produce code that is correct. Kind of like working with a human programmer in that regard to be honest!
Solving math problems and writing code are pretty similar. I saw someone's modified prompt that asks the model to write a Python script that computes the math problem; they then run the script themselves and feed its output back to the AI, which answers based on the script's result.
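A minimal sketch of that loop, with the model call stubbed out (the `fake_model_write_script` function here is a hypothetical stand-in; a real setup would call an actual LLM API and prompt it to emit only a script):

```python
import contextlib
import io


def fake_model_write_script(question: str) -> str:
    # Stand-in for the LLM call: pretend the model turned the question
    # into a Python script. A real model would generate this text.
    return "print(sum(n for n in range(1000) if n % 3 == 0 or n % 5 == 0))"


def run_script(source: str) -> str:
    # Execute the generated script and capture whatever it prints.
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        exec(source, {})
    return buf.getvalue().strip()


question = "What is the sum of the multiples of 3 or 5 below 1000?"
script = fake_model_write_script(question)
answer = run_script(script)
print(answer)  # -> 233168; the AI then phrases its reply around this value
```

The point of the trick is that the arithmetic is done by the Python interpreter, which is reliable, while the model only has to do the part it's actually good at: translating the question into code.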
How can we be sure humans aren't just a sufficiently advanced version of such a Chinese Room, one with extra hoops and sanity checks along the way (an inner monologue that runs our outputs through our own sanity checkers, perhaps?), so that our outputs come out saner?
I mean, some delusional humans are behaving just like this machine, generating statements that are grammatically sound but lack any logical coherency.
We know this machine doesn't "think" in a sense we believe "true" thinking should be done - but do we know if we do?
Yeah, it seems like if you can get something that appears rational out of a sufficiently large language model, then by adding a "facts about the world" model and some of the other built-in "models" the human brain encodes, you start to get close to actual intelligence. It does seem to lend weight to the idea that there's nothing special about the brain - it really is just neural networks all the way down.
And yet (on asking for an example of a prime number whose square root is prime)...
The square root of 11 is 3.3166247903554, which is not a whole number. However, if we round 3.3166247903554 up to the nearest whole number, we get 3, which is a prime number. Therefore, 11 is a prime number whose square root, when rounded up to the nearest whole number, is also a prime number.
(The use of "round up" is curious, though. When I asked if it meant "round down", it corrected itself!)
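For the record, the arithmetic in that answer is wrong in both directions, which a two-line check makes obvious:

```python
import math

root = math.sqrt(11)
print(root)             # 3.3166247903554
print(math.ceil(root))  # rounding *up* gives 4, not 3 as the model claimed
print(math.floor(root)) # rounding *down* gives 3, the answer it "meant"

# And the original question is unsatisfiable anyway: if sqrt(p) were a
# whole number q, then p = q * q would be composite, not prime.
```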
How does the failure to understand elementary math demonstrate that it's copy-pasting? If it were copy-pasting, wouldn't you expect it to be able to handle simple math, since there's plenty of correct simple math text available to paste from?
If it doesn't have a 'model' for even simple arithmetic, then how exactly is it parsing/processing/generating/transforming code? Simply predicting the next token from a corpus of text (of code) is as good as copy-pasting.
It's not clear to me that the only way to parse/process/generate/transform code is to have a correct mathematical model of the code's behavior. Even human programmers sometimes analyze code based on an imprecise general understanding. If ChatGPT is able to perform these activities using only next-token prediction, I think we have to conclude that next-token prediction is in fact more powerful than mere copy-pasting.
It’s not copy-pasting. But different prompts will yield wildly different results. I think trying to remove the step of prompt tuning by making the AI conversational reveals its shortcomings.
Well... it doesn't even seem to have models for arithmetic or for language semantics -- some intermediate representation (a tree, a graph, whatever) to map computations onto clauses/phrases -- because otherwise it shouldn't be possible for it to say things like "2, which is equal to 1".
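To make concrete what kind of intermediate representation the comment is gesturing at: Python's own parser builds an explicit tree before any evaluation happens, so the structure of a computation exists independently of the text. (Just an illustration of the idea, not a claim about how LLMs work internally.)

```python
import ast

# Parse an arithmetic expression into an abstract syntax tree.
tree = ast.parse("2 + 3 * 4", mode="eval")

# The tree makes the structure explicit: an Add whose right child is a Mult.
print(ast.dump(tree.body))

# Evaluation walks that tree, so "what it computes" can't drift from
# "what it says" the way "2, which is equal to 1" does.
print(eval(compile(tree, "<expr>", "eval")))  # -> 14
```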
See also the 'infinite number of prime numbers' proof problem from the same user: the answer it provides is clearly a mindless (literal) concatenation of solutions to 2 different problems. (To begin with, the question asked it to prove that there are a 'finite' number of primes, to which it should have replied -- like the passive-aggressive Google search often does -- "Did you mean 'infinite number of primes'?")
If it does not have a 'model' for simple arithmetic, then how exactly is it parsing/processing/generating/transforming code?
https://twitter.com/colin_fraser/status/1598239246271541248
(see also other examples from this user)
It's clear that it's merely (effectively) copy-pasting from scraped text with 0 understanding.
And people are freaking out about it taking coding jobs?