I HIGHLY recommend this talk from the HotChips conference "HC29-K1: The Direct Human/Machine Interface and Hints of a General Artificial Intelligence", specifically the next 4 minutes after this timestamp:
I really enjoyed Children of Time, and I just started the latest entry in the series a couple of days ago. Speaking of octopuses though, I highly recommend The Mountain in the Sea by Ray Nayler. It was a riveting read and dealt just as much with artificial intelligence (timely considering all the hubbub around LLMs lately) as it did with the rise of consciousness in octopuses in the near future and how humanity might react to that. I couldn’t put it down.
If octopuses could evolve human intelligence and society in just a few generations, then why wouldn't they have done so in the hundreds of millions of years before?
One of the longest-lived octopus species we know of has a lifespan of about 5 years. Even if they were more intelligent than humans, they'd have a hard time establishing a society when each individual can only accumulate about 5 years' worth of knowledge. Let's make longevity treatments for octopi and see if they learn enough to make/join a society.
Think it through: if a long-lived, social octopus species regularly evolved and built civilizations that collapsed the species, then the only similar species left would be one that is absurdly solitary and absurdly short-lived. Think about it. Evolution is counterintuitive.
I’d even go so far as to say that human civilisation struggles with the longevity of humans. We’re rather short-lived compared to the impacts our actions can have as a technologically enabled and socially civilised species.
Life really is short. If our lifespan were twice or ten times longer, the whole thing would be a different ballgame.
At least it's long enough for us to recognize it and use it as a source of ambition to make the most of it. I recently turned 40 and have a mild sense of panic that I haven't done anything worthwhile, but there's still time ahead.
Octopuses will never be able to evolve anything similar to human societies. For a start, they are fiercely solitary and happy to jump to cannibalism at the first opportunity. This is a non-social species that will not cooperate with other individuals. It's not a question of how much time you put in the mix; they are a different animal that wants different things than we do.
Think it through: if a long-lived, social octopus species regularly evolved and built civilizations that collapsed the species, then the only similar species left would be one that is absurdly solitary and absurdly short-lived. Think about it. Evolution is counterintuitive.
I wouldn't expect that. Primates cooperate all the time with birds, rodents, and deer for food or protection. We are brain-wired to search for allies, both within and across species. Early humans most probably did the same.
Cephalopods, on the other hand, have never been reported cooperating with other species. We still don't know a lot about many species, of course, so that could change, but even the squids that are strongly gregarious do it in a loose, "save yourself" way.
There is a sci-fi series that covers this a little: Children of Time. It's a rather good look at an "uplift" virus going wrong (used on octopuses in Children of Ruin, the second book in the series).
I thought the same thing. I loved how their species was described as one giant problem solving machine that doesn't really know what it's doing. Also loved Gothy and Gethly's theory of sentience at the end of the book.
I wonder what would happen if you prompted an LLM with something other than human language, like a large upload of patterns from a cephalopod brain. Could it translate concepts?
But, considering that the most massive LLMs do not even attempt to abstract and manipulate even basic concepts (they just model likely sequences of words using insanely large data sets), that would be a definitive "No.".
Plus, given that we have no idea how to map even human neural activity patterns to concepts, also "No." It would be even harder to figure out the octopus's conceptual map.
>the most massive LLMs do not even attempt to abstract and manipulate even basic concepts
then how does this work?
H: translate the following sentence to instead use olympic curling related metaphors ( do not explicitly mention curling, and do not use similes in forming the translation ): man:"I've been spinning my tires all day without getting anywhere" woman:"running out of gas?" man:"nah, I'm gassed up, geared up, oiled and revving. I just can't seem to get in the right lane, you know?" woman:"anything knocking?" man:"nah, I'm tuned just fine. I can't seem to get my GPS to connect, you know?" woman:"take a few laps without a destination, see if you can figure out the lay of the land" man:"you know, that might help. thanks barbara"
R: man: "I've been sweeping the ice all day without hitting the target."
woman: "Losing momentum?"
man: "Nah, I've got the energy, the technique, and the focus. I just can't seem to find the right path, you know?"
woman: "Anything coming in your way?"
man: "Nah, I'm on point with my aim. I just can't seem to get my strategy to align, you know?"
woman: "Maybe glide a few times without aiming specifically, see if you can figure out the ice conditions."
man: "You know, that might help. Thanks, Barbara."
I find it unlikely that these larger models do anything less than abstract and manipulate concepts.
( my apologies to anyone who has seen me paste this same example more than once; I find it a very good and succinct example for the topic )
Ask it about actual concepts of specific domains, not merely idioms. If anything, these are idiom engines.
For middle-of-the-road questions, it outputs very sensible, grammatical, and mostly accurate text. It looks impressive, and can be genuinely helpful.
As the questions get into more obscure material, it starts making errors, and on anything approaching advanced cases, it hallucinates/confabulates/lies.
Even with simple math, you get this:
Q: what is 17 plus 34?
A: 17 plus 34 equals 51. (correct)
Q: what is 297438 plus 927465?
A: 297438 plus 927465 equals 1224903. (correct)
Q: what is 38460 times 2398756?
A: 38460 times 2398756 equals 9218928960. (wrong, correct is 92256155760)
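For reference, the exact values are easy to check with plain Python (just integer arithmetic, no LLM involved):

    # Exact integer arithmetic, checking the answers quoted above.
    print(17 + 34)              # 51 (model was right)
    print(297438 + 927465)      # 1224903 (model was right)
    print(38460 * 2398756)      # 92256155760, not the 9218928960 the model gave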
It is merely outputting a very grammatical word salad that has the same relationship to the actual concepts as a broken clock to the time (right twice a day; this is just better with insanely more parameters).
In the areas where it has ingested massive amounts of text and has billions of relevant vectors, it'll output good stuff. But at the edges, where the input is more scarce, it'll just guess, and that is where you see that there is no abstraction there.
I've observed it doing the same thing with legal and scientific topics where I have expertise or know experts. When there is a great body of text that it has ingested, the predictions, the grammar, and the idiom are amazing, sometimes astonishing, and often useful.
But the kinds of errors it makes show that it has no shred of a concept. If it did, it would not have these kinds of problems with facts. We would not get entirely hallucinated/confabulated references to entire papers, authors, APIs, etc., nor made-up math answers. We'd get something like "I should have that knowledge, but it is out of scope."
Your example is very cool, but it's pretty much the same as translating from English to French; it's just car idiom to curling idiom. When the LLM translates "car" to "voiture", it is not abstracting the concept of a car; it is making a prediction. When we translate as novices in a second language, we have to hunt around for the vocabulary equivalent, and when we actually speak a second language, we aren't translating but speaking from our own concepts. LLMs are not human, and they learn and produce nothing like humans, but humans have an extremely powerful tendency to anthropomorphize and infer qualities that are not there.
Also, from an entirely different angle, in every paper on LLM technology I've read or skimmed, I've seen zero reference to any technology that would abstract and wield concepts.
So, I find it extremely unlikely that these models even come close to anything resembling use of abstractions.
Thanks, that's good news, and looks like a step in the right direction.
they say:
>>Embeddings are numerical representations of concepts converted to number sequences, which make it easy for computers to understand the relationships between those concepts.
Beyond that claim, there is nearly no information on how this works, and the examples hint not at any kind of concept, but at an additional abstraction layer that clusters words with similar meanings. It is not the abstraction of an understanding, but a more far-reaching thesaurus lookup.
(edit) It's an astonishingly good map of words, and this makes it better. The clusters may roughly map onto something resembling concepts, but the kinds of errors it makes show that it's not actually wielding concepts to think.
That said, a journey of a thousand miles begins with a single step, and this looks like a good step.
Why not? LLMs convert words into vectors and then discover relationships between them. If it is possible to convert brain patterns into vectors, I can see how an LLM could detect patterns among those too.
Converting the brain pattern vectors back to words could be difficult, but might be possible if each brain pattern vector was annotated with a description of the organism's current environment and the organism's current actions.
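A minimal sketch of what that annotation lookup might look like, assuming (purely hypothetically) that recordings can already be reduced to fixed-length vectors; the toy data, the note strings, and the nearest-neighbour approach are all assumptions for illustration, not anything that exists:

    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    # Purely hypothetical: assume each recording has already been reduced to a
    # fixed-length vector and stored with a note about environment and behaviour.
    patterns = np.random.rand(500, 128)                            # toy stand-ins for activity vectors
    notes = [f"annotation for recording {i}" for i in range(500)]  # e.g. "hunting a crab at dusk"

    index = NearestNeighbors(n_neighbors=1).fit(patterns)

    def describe(new_pattern):
        """Describe a new pattern via the annotation of its closest recorded neighbour."""
        _, idx = index.kneighbors(new_pattern.reshape(1, -1))
        return notes[int(idx[0, 0])]

    print(describe(np.random.rand(128)))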
I'm curious: when you say LLMs convert words into vectors, is it similar to vectors in physics, with one part being a magnitude and the other part being a direction?
The magic is really in how absurdly high-dimensional those vectors are. The dimensionality of the latent space in the current breed of LLMs is, IIRC, on the order of a hundred thousand dimensions.
Now, even if all the model does is 1) turn your prompt into high-dimensional vectors, 2) run an adjacency search to find the vectors in the latent space nearest to your vector, and 3) translate those vectors back to tokens, it's more than enough for it to work with concepts. Again, we're talking 1000 to 100 000 dimensional vectors here. Any kind of semantic similarity between words you can think of (tree - green - grass, tree - tall - skyscraper, tree - data structure, tree - files, etc.) can fit in there - the relevant words (tokens) will be close together along some dimensions. So, if you pick a vector somewhere in the latent space, and look around (in hundred thousand dimensions) for its nearest neighbors, the group of points you'd be looking at is, IMHO, a concept in its raw form.
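As a toy illustration of that "neighbourhood along some dimensions = concept" idea, here are hand-made 4-dimensional vectors; all the numbers and dimension labels are invented, and real latent spaces have thousands of dimensions:

    import numpy as np

    # Hand-made toy "embeddings"; dimensions are roughly plant-ness, tallness,
    # data-structure-ness, file-ness. All numbers are invented for illustration.
    words = ["tree", "grass", "skyscraper", "linked_list", "directory"]
    vecs = np.array([
        [0.9, 0.7, 0.6, 0.5],   # tree: a bit of everything
        [0.8, 0.1, 0.0, 0.0],   # grass: plant, not tall
        [0.0, 0.9, 0.0, 0.0],   # skyscraper: tall, not a plant
        [0.0, 0.0, 0.9, 0.1],   # linked_list: data structure
        [0.0, 0.0, 0.3, 0.9],   # directory: files
    ])

    def neighbours(word, dims):
        """Nearest neighbours of `word`, looking only at a subset of dimensions."""
        i = words.index(word)
        d = np.linalg.norm(vecs[:, dims] - vecs[i, dims], axis=1)
        return [w for _, w in sorted(zip(d, words)) if w != word]

    print(neighbours("tree", [0]))      # along plant-ness, grass is closest
    print(neighbours("tree", [1]))      # along tallness, skyscraper is closest
    print(neighbours("tree", [2, 3]))   # along the data-structure/file dims, directory and linked_list come first

Depending on which dimensions you look along, "tree" sits in a different cluster, which is roughly what I mean by a concept in its raw form.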
It's similar - basically here a vector/tensor is an array of magnitudes across N dimensions. Whereas in (undergrad-level) physics you might have a 3- or 4-D vector for spatial dimensions and time, here the LLMs are embedding sequences of tokens into N-dimensional space where N is much much larger.
There's a pattern called "embedding search" - you precalculate a set of embeddings for a corpus of text. Then to do a search, you calculate embeddings for your search string. Then you can find the closest vector in that N-dimensional space, which finds you the semantically closest neighbor from the original corpus.
For an embedding search, the OpenAI Embeddings API gives you a ~1500-dimension output vector. When an LLM is working with input text as a vector, I am not sure what the tokenizer is actually feeding into the model. Hopefully someone else can chime in!
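A rough sketch of the embedding-search pattern described above, using the OpenAI Python client; the model name (text-embedding-ada-002, the ~1500-dimensional one), the client version, the tiny corpus, and the query are all assumptions for illustration:

    import numpy as np
    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    def embed(texts):
        # text-embedding-ada-002 is assumed here; it returns ~1536-dimensional vectors
        resp = client.embeddings.create(model="text-embedding-ada-002", input=texts)
        return np.array([d.embedding for d in resp.data])

    corpus = [
        "octopuses are mostly solitary animals",
        "language models embed tokens in a high-dimensional space",
        "curling stones glide over pebbled ice",
    ]
    corpus_vecs = embed(corpus)              # precomputed once for the whole corpus

    query_vec = embed(["how do LLMs represent words?"])[0]
    sims = corpus_vecs @ query_vec / (
        np.linalg.norm(corpus_vecs, axis=1) * np.linalg.norm(query_vec)
    )
    print(corpus[int(np.argmax(sims))])      # semantically closest corpus entry

Any embedding model would work the same way; the only requirement is that the corpus and the query go through the same model.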
Vectors in this case are more like an array of numbers, often in the thousands, where each number represents a concept or feature associated with the word. So, for example, "elephant" will have high values in the cells representing "animal", "grey", "big", "noun", etc., but low values in "verb", "abstract", "flight", etc. The meaning of these cells is usually not explicit; it is something that emerges from the model's learning process, but since these models are usually trained on human language, you often find human concepts in there.
And indeed, mathematically, they are the same as vectors in physics, just with a lot more dimensions. And indeed there is a notion of magnitude (norm) and direction. Some processes only use the "direction", throwing away the magnitude through a process called normalization; some don't care; and some actually make use of the magnitude.
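A tiny 2-D toy example of what "throwing away the magnitude" means (real embedding vectors have thousands of dimensions; the numbers here are made up):

    import numpy as np

    v = np.array([3.0, 4.0])           # toy 2-D "embedding"
    w = np.array([6.0, 8.0])           # same direction, twice the magnitude

    v_hat = v / np.linalg.norm(v)      # normalization: keep the direction, drop the magnitude
    w_hat = w / np.linalg.norm(w)

    print(np.dot(v_hat, w_hat))        # cosine similarity: 1.0, identical direction
    print(np.dot(v, w))                # raw dot product: 50.0, still influenced by magnitude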
Imagine sitting in a riverside restaurant with a pizza frutti di mare, and an octopus in a mobile aquarium rolls by and starts yelling at you for being a sapient-eater, then settles down to order a pizza homo.
https://youtu.be/PVuSHjeh1Os?t=1168
Also this: "Flashes of Insight: Whole-Brain Imaging of Neural Activity in the Zebrafish"
https://youtu.be/eKkaYDTOauQ