Transformers seem to mimic parts of the brain (quantamagazine.org)
140 points by theafh on Sept 12, 2022 | 76 comments



This whole area of research is more hype than substance.

First note that what they tout as being "the brain" is actually just a very very simplified model of the brain. If you really want to model the brain there's a hierarchy of ever more complicated spiking neural nets. Even simulating a single synapse in full detail can be challenging.

Having said that, the fact that some models used in practice have been found to be equivalent to a neuroscientific model is not really that impressive, since it explains neither the inner workings of the brain nor those of modern ML models. Unfortunately, Quanta magazine's editors are riding the hype wave too hard to notice that.

Note also that Whittington's other work on predictive coding networks is not really solid either. It was a pretty irritating experience to read some of his work. That makes me skeptical of how rigorous his claims are in this case.


> Even simulating a single synapse in full detail can be challenging

I wonder how common it is for serious commentators on the NN / brain relationship to hold the supposition that in order for the two to be called... let's say "functionally similar/equivalent", there would have to be some kind of structural equivalence in their most basic parts.

Neurons (and their synaptic connections etc.) developed in a biochemical substrate which is going to bring a certain amount of its own representational baggage with it, i.e. elements which are loosely incidental to "what really matters" in creating the magic of the brain—and we should not expect those features to reappear in artificial NNs (as they are by definition incidental): bringing them up could only establish a trivial non-equivalence imo.

I'd like to see more discussions about NN / brain relationship mentioning which level/kind of equivalence they're refuting/confirming when refuting/confirming.


That has never been thought by anyone but computer scientists who never looked at a biology textbook. To begin approximating what a lone spherical synapse would actually do, you'd need to solve 2^n coupled second-order differential equations, where n is the number of ion species involved.

That is before you throw in things like neurotransmitters and the physical volume of a cell. Simulating a single neuron accurately is beyond any supercomputer today. The question is how inaccurately we can simulate one and still get meaningful answers.

Then there's the question of how we do it 100e9 more times.

source: https://news.ycombinator.com/item?id=32407028

There is interesting discussion there.
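
For contrast, the level of abstraction ML-style neuron models operate at looks something like the toy sketch below: a leaky integrate-and-fire unit, one linear ODE per "neuron", with parameter values I've picked purely for illustration. It ignores everything the quote above is talking about (ion channel kinetics, neurotransmitters, cell geometry).

    # Toy leaky integrate-and-fire "neuron" (illustration only, arbitrary parameters).
    # This is the caricature level of detail, nowhere near the biophysics discussed above.
    import numpy as np

    dt, T = 1e-4, 0.5                  # timestep (s), total simulated time (s)
    tau, v_rest, v_thresh, v_reset = 0.02, -0.065, -0.050, -0.065  # s, V, V, V
    R, I = 1e7, 2.0e-9                 # membrane resistance (ohm), input current (A)

    v = v_rest
    spikes = []
    for step in range(int(T / dt)):
        # dv/dt = (-(v - v_rest) + R*I) / tau   (a single linear ODE, forward Euler)
        v += dt * (-(v - v_rest) + R * I) / tau
        if v >= v_thresh:              # threshold crossing -> "spike", then reset
            spikes.append(step * dt)
            v = v_reset

    print(f"{len(spikes)} spikes in {T} s")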


While I'm neither a biologist nor a CS PhD, I want to call out the fallacy that simulating a system to a sufficient degree requires simulating each individual molecule in exacting detail.

We've gotten quite far with ideal gas laws without needing to simulate every particle, we used kerosene to get us to the moon without needing to simulate all the reaction species of kerosene combustion, etc.
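
To make that concrete, here is the ideal-gas-law point as a trivial computation (the numbers are just an example): a macroscopic prediction with no per-molecule simulation anywhere.

    # Coarse-grained modelling: PV = nRT, no molecular dynamics needed.
    R = 8.314                      # gas constant, J/(mol*K)
    n, T, V = 1.0, 300.0, 0.001    # 1 mol of gas at 300 K in 1 litre (0.001 m^3)
    P = n * R * T / V
    print(f"pressure ~ {P/1e5:.1f} bar")   # ~24.9 bar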


> there would have to be some kind of structural equivalence in their most basic parts.

Of course there doesn't have to be such an equivalence, and I didn't want to imply one. What I did want to imply, though, was that unless there is some relationship between NNs and the brain, there is no meaningful way to translate results from one to the other. And currently, AFAIK, we do not have a good "dictionary" for that. Something like what RandomBK has mentioned is still missing.

That being said, I would also like to see more NN/brain relationship discussions. Currently the discussions are really at a super basic level; there were a number of papers out there about whether "the brain" does backpropagation, which was pretty useless science because, again, the brain was modelled in a pretty crude way. (The literature is huge and I don't claim to be omniscient, so perhaps there is something out there already.)


Intentional/incidental biomimicry of high-level neural behaviors and structures by ML/AI researchers is hardly a new kid on the block. Sure, with a neuron-centered approach, research is still at the level of discerning urban activity by looking at lighting patterns, but it has obvious practical value even if transformers and diffusion are only loose approximations of what's actually running in wetware.


What would the obvious practical value be? It's not so obvious to me; I would rather say "limited practical value".


General purpose transformers and diffusion models are of limited practical value? Or biomimicry inspired research and design in general?


You were talking about the practical value of comparing Transformers etc. to neural structures: "neuron centered approach research is still at the level of discerning urban activity by looking at lighting patterns, but it has obvious practical value".

This type of research has limited practical value, for the reasons outlined in various comments I made here. Transformers etc. have a lot of practical value.


Sincere question inspired purely by the headline: how many important ML architectures aren't in some way based on some proposed model of how something works in the brain?

(Not intended as a flippant remark; I know Quanta Magazine articles can generally safely be assumed to be quality content, and that this is about how a language model unexpectedly seems to have relevance for understanding spatial awareness.)


I think the response to this has two prongs:

- Some families of ML techniques (SVMs, random forests, gaussian processes) got their inspiration elsewhere and never claimed to be really related to how brains do stuff.

- Among NNs, even if an idea takes loose inspiration from neuroscience (e.g. the visual system does have a bunch of layers, and the first ones really are pulling out 'simple' features like an edge near an area), I think it's relatively uncommon to go back and compare specifically what's happening in the brain with a given ML architecture. And a lot of the inspiration isn't about human-specific cognitive abilities (like language), but is really a generic description of neurons which is equally true of much less intelligent animals.


> I think it's relatively uncommon to go back and compare specifically what's happening in the brain with a given ML architecture.

Less common but not unheard of. Here's one example, primarily focused on vision: http://www.brain-score.org/

DeepMind has also published works comparing RL architectures like IQN to dopaminergic neurons.

The challenge is that it's very cross-disciplinary, and most DL labs don't have a reason to explore the neuroscience side while most neuro labs don't have the expertise in DL.


Is it necessary to simulate the quantum chemistry of a biological neural network in order to functionally approximate a BNN with an ANN?

A biological systems and fields model for cognition:

Spreading activation in a dynamic graph with cycles and magnitudes ("activation potentials") that change as neurally-regulated, heart-generated electric potentials reverberate fluidically along intersecting paths. And a partially extra-cerebral induced field which nonlinearly affects the original signal source through local feedback; representational shift.

Representational shift: "Neurons Are Fickle. Electric Fields Are More Reliable for Information" (2022) https://neurosciencenews.com/electric-field-neuroscience-201...

Spreading activation: https://en.wikipedia.org/wiki/Spreading_activation

Re: 11D (11-Dimensional) biological network hyperparameters, ripples in (hippocampal, prefrontal,) association networks: https://news.ycombinator.com/item?id=18218504

M-theory (string theory) is also 11D, but IIUC they're not the same dimensions.

Diffusion suggests fluids, which in physics and chaos theory suggests Bernoulli's fluid models (and other non-differentiable compact descriptions like Navier-Stokes), which are part of SQG Superfluid Quantum Gravity postulates.

Can e.g. ONNX or RDF with or without bnodes represent a complete connectome image/map?

Connectome: https://en.wikipedia.org/wiki/Connectome


Wave Field recordings are probably the most complete known descriptions of the brain and its nonlinear fields?

How such fields relate to one or more Quantum Wave functions might entail near-necessity of QFT: Quantum Fourier Transform.

When you replace the self-attention part of a Transformer with a classical FFT (Fast Fourier Transform): ... From https://medium.com/syncedreview/google-replaces-bert-self-at... :

> > New research from a Google team proposes replacing the self-attention sublayers with simple linear transformations that “mix” input tokens to significantly speed up the transformer encoder with limited accuracy cost. Even more surprisingly, the team discovers that replacing the self-attention sublayer with a standard, unparameterized Fourier Transform achieves 92 percent of the accuracy of BERT on the GLUE benchmark, with training times that are seven times faster on GPUs and twice as fast on TPUs."

> > Would Transformers (with self-attention) make what things better? Maybe QFT? There are quantum chemical interactions in the brain. Are they necessary or relevant for what fidelity of emulation of a non-discrete brain?

> Quantum Fourier Transform: https://en.wikipedia.org/wiki/Quantum_Fourier_transform
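
For anyone curious what "replacing self-attention with a Fourier Transform" means mechanically, here is a rough numpy sketch of FNet-style token mixing. This is not the paper's code, just my reading of the idea: swap the attention sublayer for an unparameterized 2D FFT and keep the real part.

    # Rough sketch (my reading, not the paper's implementation).
    import numpy as np

    def attention_mixing(x, Wq, Wk, Wv):
        # standard scaled dot-product self-attention, for comparison
        q, k, v = x @ Wq, x @ Wk, x @ Wv
        scores = q @ k.T / np.sqrt(q.shape[-1])
        weights = np.exp(scores - scores.max(-1, keepdims=True))
        weights /= weights.sum(-1, keepdims=True)
        return weights @ v

    def fourier_mixing(x):
        # FNet-style: no parameters at all, just mix tokens with a 2D FFT
        return np.fft.fft2(x).real

    x = np.random.randn(16, 64)            # (sequence length, hidden size)
    d = x.shape[-1]
    Wq, Wk, Wv = (np.random.randn(d, d) / np.sqrt(d) for _ in range(3))
    print(attention_mixing(x, Wq, Wk, Wv).shape, fourier_mixing(x).shape)

The Fourier version has no learned parameters in the mixing step at all, which is where the claimed speedup comes from.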


The QFT acronym annoyingly reminds me more of Quantum Field Theory than of Quantum Fourier Transforms...


Yeah. And resolve QFT + { QG || SQG }



AlphaGo is a pretty good example here. It uses a neural net for evaluation, and that's vaguely inspired by the brain, sure. But it employs a Monte Carlo based game tree search, which is probably very different from how humans think.

In addition, it learns by iterated amplification and distillation: it plays games against itself, where one player gets more time and hence will be a stronger player (amplification). The weaker player then uses this strength differential as a fitness function to learn (distillation). Rinse and repeat. That's really nothing like how humans learn these games. While playing stronger players and evaluating is a huge part of becoming stronger, there's also a lot of targeted exercises, opening/endgame theory, etc. Humans can't really do that type of training at all.


> But it employs a Monte Carlo based game tree search which is probably very different from how humans think.

It's not _that_ different from how humans play. We have pattern matching that points out likely places, we read out what happens and try to evaluate the result. Humans are just less methodical at it really.


Parent said “different from how humans think”, not play, which seems key. Your description is very broad.

These machines don’t seem to carry narratives or plans yet (if they would benefit from them or be encumbered by them seems to be an open question).

Watching the machines play they have zero inertia. If the next opportunity means a completely inverted play strategy has a marginally better chance of winning, they will switch their entire approach.

Humans don’t typically do this, although having learned from machines that it can produce better outcomes perhaps we will start moving away from this local maximum.


> Watching the machines play they have zero inertia. If the next opportunity means a completely inverted play strategy has a marginally better chance of winning, they will switch their entire approach.

In Go, especially at the high level, this isn't that far outside of the norm. In particular, you see players play in other areas (tenuki) at what to a weaker player would look like pretty random times, depending on what's most urgent or biggest.

Computer go players aren't too chaotic. They're just _very_ good at some things that are already high-level-player traits. A computer will just give you what you want, but suddenly it's just not actually that good. It feels like Bruce Lee's flow/adaptation based fighting style applied to a go board.


I mean I guess you could argue some calculation kind of looks like a type of random walk (with intuited moves) based search. But that's kind of all AlphaGo does, and it does it so efficiently that's all it really needs to do.

I'm not a go player, but at least in chess, which is game-theoretically very similar modulo branching factor, human thinking is much more of a mishmash of different search methods, different ways of picking moves, and strategic ideas (which I like to think of as employing something more akin to A* or Dijkstra).

I.e. there's a rough algorithm like this happening (a code sketch of the same loop follows the list):

1. Assess the opponent's last move, using some sort of abductive reasoning to figure out what the intent was and whether there's a concrete threat. If so, try to refute the threat (this can sometimes be a method-of-elimination search (best node search is a similar algorithm) if the candidate moves are few enough, or a more general one if not), find counterplay, find the lesser evil, or resign

2. If not, do you want to stop their plan or is it just a bad plan?

3. If you do, how?

4. If not, do you have any tactical ideas? search all the forcing moves in some intuitive order of plausibility and play the strongest one you find

5. If not, what is your plan? If you had a plan before, does it still make sense?

6. If not, find a new plan

7. Once you have a plan, how do you accomplish it? Break it into subgoals like "I want to get a knight to e5"

8. Find the shortest route for a knight to get to e5 (pathfinding while ignoring the opponent)

9. Is there a tactical issue with that route?

10. Rinse and repeat until you find the shortest route that works tactically.

I could probably elaborate this list for hours, getting longer and longer. But you probably get the idea at this point.
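
Purely for fun, here's that list compressed into control flow. Every helper below is a trivial stub standing in for a pile of human pattern matching and calculation, so nothing here actually plays chess; the point is only the shape of the loop.

    # Stubs only: each function stands in for human pattern matching / calculation.
    def detect_threat(pos): return None              # step 1: abductive read of last move
    def refute(pos, threat): return None             # elimination search / counterplay / lesser evil
    def opponent_plan(pos): return None
    def plan_is_bad(pos, plan): return True          # step 2
    def stop_plan(pos, plan): return None            # step 3
    def tactical_shot(pos): return None              # step 4: search the forcing moves
    def plan_still_makes_sense(pos, plan): return plan is not None   # step 5
    def new_plan(pos): return "knight to e5"         # step 6
    def execute_plan(pos, plan): return "Nd3"        # steps 7-10: subgoals + pathfinding

    def choose_move(pos, plan):
        threat = detect_threat(pos)
        if threat:
            return refute(pos, threat)
        their_plan = opponent_plan(pos)
        if their_plan and not plan_is_bad(pos, their_plan):
            return stop_plan(pos, their_plan)
        shot = tactical_shot(pos)
        if shot:
            return shot
        if not plan_still_makes_sense(pos, plan):
            plan = new_plan(pos)
        return execute_plan(pos, plan)

    print(choose_move("some position", None))        # -> "Nd3"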


You are definitely right that computer players are missing some kind of narrative-based reasoning for their own moves and for their opponents' moves. In go it doesn't feel that extreme, though. We're taught not to hold too hard to our plans anyway, and most good moves from the opponent will have more than one intention. So you can't get that far relying just on reading what their goal is.

How computers think isn't exactly how we do, for go, but it's close enough to rhyme pretty heavily imo.


That's a functional but not architectural claim.


Saying that neural networks are "similar to" or that they "mimic" the human brain can be misleading. Today's architectures are the byproduct of years of research and countless GPU-hours dedicated to training and testing architecture variants. Many neuroscience-based architectures that mimic the brain better than transformers end up performing much worse.

The Quanta article is overall pretty reasonable, but I've unfortunately seen other news outlets regurgitate this kind of blanket statement for the better part of a decade. The very first models were perhaps 100% inspired by the brain, but today's ML research more or less follows a "whatever works best" principle.


The convolution kernels in the first layers of AlexNet and all its DL image-processing descendants converge to Gabor filters (or some variation thereof), which are the response functions of the neurons in the first layer of the visual cortex. About 15 years before AlexNet there were works showing that this type of filter is a kind of mathematically optimal encoding for feature-based image processing. (So, theoretically, one could have just pre-generated the first layers in the net and used them fixed, cutting significant time/effort from training - I myself wanted to do it 20 years ago, yet just didn't get to it :)

I'm pretty sure that for the middle layers in image DL, as well as for transformers in language, we have a kind of similar optimality, i.e. something like a maximum-entropy filter (separator/aggregator at higher levels?) at a given level of granularity/scale, like Gabors at the first feature level.
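
For the record, a Gabor kernel is easy to generate yourself and eyeball against the first-layer AlexNet figure (the parameters below are arbitrary):

    # A Gabor kernel: an oriented sinusoid under a Gaussian envelope.
    import numpy as np

    def gabor_kernel(size=11, sigma=2.5, theta=0.0, lam=5.0, gamma=0.5, psi=0.0):
        half = size // 2
        y, x = np.mgrid[-half:half + 1, -half:half + 1]
        xr = x * np.cos(theta) + y * np.sin(theta)      # rotate coordinates
        yr = -x * np.sin(theta) + y * np.cos(theta)
        envelope = np.exp(-(xr**2 + (gamma * yr)**2) / (2 * sigma**2))
        carrier = np.cos(2 * np.pi * xr / lam + psi)    # oriented sinusoid
        return envelope * carrier

    bank = [gabor_kernel(theta=t) for t in np.linspace(0, np.pi, 8, endpoint=False)]
    print(bank[0].shape, len(bank))                     # (11, 11) filters at 8 orientations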


Could you provide references for your statements

> the first layers of AlexNet and all its DL image processing descendants converge to the Gabor filters (or some variation of)

and

> 15 years before AlexNet there were works showing that such type of filter is kind of mathematically optimally encoding for the feature based image processing

?


For the first - you can just look at the original AlexNet paper. The kernels are strikingly Gabor-like. Some differences, like cross-color kernels, raise possibly interesting questions - is it an improvement or a deficiency (i.e. something more training would correct) relative to biology? Or maybe it is just a real-valued projection of the [plausible] fact that the optimum is complex-valued?

>?

I don't have the specific reference I had in mind that was published 15-20 years ago, yet you can trace that line of thought through works like these, for example (there have been a bunch of them in the 1990s and into the 2000s):

1990 - https://opg.optica.org/josaa/abstract.cfm?uri=josaa-7-8-1362

1998 - https://pubmed.ncbi.nlm.nih.gov/12662821/


Transformer networks have deeper connections to dense associative memory. For example, the update rule to minimize the energy functional of these Hopfield networks converges in a single iteration and coincides with the attention mechanism [1].

[1] https://arxiv.org/abs/1702.01929


More accessible references:

https://mcbal.github.io/post/an-energy-based-perspective-on-... (Modern continuous Hopfield networks section)

https://arxiv.org/abs/2008.02217

Note that the connection to Hebbian learning hinges on the softmax function, in particular its exponential!
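
The correspondence is short enough to write out. A minimal numpy sketch, assuming a single query and no learned projections: one update step of a modern continuous Hopfield network is the same computation as softmax attention with the stored patterns playing the role of keys and values.

    import numpy as np

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    d, n = 8, 32
    rng = np.random.default_rng(0)
    X = rng.normal(size=(n, d))        # n stored patterns (rows)
    xi = rng.normal(size=d)            # query / state to be updated
    beta = 1.0 / np.sqrt(d)

    # Hopfield update rule: xi_new = X^T softmax(beta * X xi)
    xi_new = X.T @ softmax(beta * (X @ xi))

    # Attention with Q = xi, K = V = X (no learned projections): same computation
    attn = softmax((xi @ X.T) / np.sqrt(d)) @ X

    print(np.allclose(xi_new, attn))   # True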


I think at some point similarities will naturally emerge. Smart moves in design space. That being said, these similar designs will probably be minuscule compared to the overall architecture.


It is my opinion that pretty much all architectures already exist in the brain for some use or another. Otherwise we wouldn’t be able to reason about them


We can reason about all kinds of things that don't exist in the brain...


For one You can’t actually prove this. We use brains to experience everything so we (humans as a species) can never know.


What does that even mean?

We can reason about fusion reactors, yet there's no such mechanism in our brains... How's that for a proof?


When we're reasoning about them, where are they?


Still outside. It's our thoughts about them, models, and concepts that are in the brain.

Plus, even if it were true, this wasn't the parent's point (that things exist in the brain while we're reasoning about them). His point (also wrong) was that different architectures must exist as structures in the brain (not as concepts we think about, memories, etc., but as part of the brain's material organization and wiring) for us to be able to reason about them.


> Otherwise we wouldn’t be able to reason about them

That's a bold claim.


I agree but not for the "otherwise" reasoning

I think it speaks to the complexity of the brain almost like every combination of numbers exists in pi


I disagree. No one built cars in 5000 BC but the confluence of ideas then have led to cars now.


This is a non-sequitur


On the contrary, I view it as a counterexample to "all architectures already exist in the brain for some use or another," which disproves your point. Let's not make the mistake of a fallacy fallacy here!

Perhaps you would like to expound or clarify your point to rule out the edge case of cars not existing in 5000 BC, but the models to derive cars 5000 years later suddenly came into being?


The underlying idea is the idea of fixed points (aka spectra, diagonalizations, embedding, invariants). By fixed point I mean something like the "Lawvere's fixed point theorem".

https://ncatlab.org/nlab/show/Lawvere%27s+fixed+point+theore...

Karl Pribram's Holonomic brain theory (which builds on Dennis Gabor's holography) also indicates something like that https://en.wikipedia.org/wiki/Holonomic_brain_theory

I have a linkdump on this https://github.com/adamnemecek/adjoint

I also have a discord https://discord.gg/mr9TAhpyBW


Why do you invoke a category-theoretic theorem when the theory discussed in Quanta has a manifestly non-category-theoretic flavor?...


Because it ties it into a larger ecosystem.


Well... this is kind of a problem with category theory, isn't it? Almost everything can be categorified (which, I know, is insanely popular these days). But outside of pure mathematics I'm having trouble seeing why such a categorification is really useful. Yes, it can be nice, but it often seems more like a linguistic exercise than a source of crucial insight.


I do wonder about the ethical issues that people are going to start facing in, I don't know, maybe 10-20 years? Maybe even sooner.

These new DNNs create very human like outputs. And their structure is rather similar to the brain. Now we are learning that maybe they're even more brainlike than we thought. At what point do we cross a threshold here and encounter a non-trivial number of people who argue that a sufficiently large model actually is a brain, and therefore deserves rights? Blake Lemoine went there already but I wonder if that's going to be an isolated incident or a harbinger of things to come.

It feels weird. On one hand, I know that these things are just programs. On the other hand, I also know that our brains are just molecules and cells. At the same time I feel that intelligent creatures deserve rights and the exact way the underlying brain works doesn't seem especially important. Especially if people start getting brain augmentations at scale a la Neuralink. At what point do you say the line is crossed?


I personally think we'll have the opposite problem. I think making machines too much like humans will result in less ethical consideration of actual humans, not extra rights for machines.

Once a model becomes sufficiently like a human brain to perform many of the jobs we have in our society (driving vehicles, collecting trash, monitoring cameras and data streams, etc.) then those who have the ultimate power in our world will start to see humans as unimportant meat sacks that are inefficient compared to the machines. They'll stop pretending they care about the welfare of humanity even a little bit, and will start to push for policies that would reduce our numbers.

Eventually, I believe the end goal will be for a small number of humans to do those jobs that simply cannot be automated, and to serve an even smaller number of masters who control the machines, which will then be purposed as the police for the other humans.


You write in the future tense, but I think that this is already happening. Fewer millennials are having children than previous generations, and those that are have fewer children.

I think the main driver is that most millennials simply can't afford children, and this is a direct result of policies pushed by "those who have the ultimate power".


Or maybe the millennials themselves see humans as inefficient meat sacks who might as well just pass the time with video games rather than striving.

Or maybe people actually think every child is precious, and those who have the ultimate power benevolently used their power to require every child get the resources it deserves, even though we can't afford it.


Ah, so the “essential workers” making minimum wage don’t deserve to have kids because they should have been striving more. I see the plan is working!

Also, if those in power require every child get the resources it deserves, doesn’t that make it easier to have kids?


Tech bros are already shamelessly talking about "artificial wombs", so... The Overton window is already open to that point


That's like an interesting movie/show plot that I'd love to watch!


There are no actual ethical issues with contemporary ML architectures (including transformers) being too conscious or brainlike, it's just laypeople and demagogue chatter. It could be an issue in the future but only with computational systems that are utterly unlike the ones being created and used today.

Actual practitioners are tired of the debate because getting to an informed answer requires grokking some undergraduate-level math and statistics, and nobody seems particularly willing to do that.


That's not really an argument, is it? I do understand the maths behind these models, or at least a good chunk of it. I can talk about differentiable functions, softmax etc.

Regardless, going from "it involves maths" to "therefore there are no actual ethical issues" is a non-sequitur. Your conclusion doesn't follow from your premise. I actually agree with you - I don't think current transformer models like DALL-E are conscious or worthy of rights - but what you've presented here isn't something that could possibly convince anyone in either direction. You'd have to lay out why that's the case, and "how many years of maths study are involved" really doesn't seem like a convincing basis on which to decide.


> nobody seems particularly willing to do that

This. The amount of mathematical illiteracy is staggering in ML.


The professionals I've met actually working in ML R&D have basically all been very technically competent, including in mathematics. It's more the people who talk a lot about AI in grandiose and anthropomorphized terms that I was referring to.


> The professionals I've met actually working in ML R&D have basically all been very technically competent, including in mathematics.

Competent at math can mean many different things. Have they taken higher level courses in statistics, probability, optimization and control theory? If not I'd say that they aren't technically competent at math that is relevant to their field, and in my experience most don't know those things.


I work at one of the best unis in the world in a big ML research group and I have not. Unfortunately.

I even know researchers with 10k+ citations in ML that even talk about "continuous maths" and "discrete maths". This pretty much sums up their level of mathematical sophistication and ability.


What do you mean? That's an incredibly important distinction in understanding mathematics for ML in the neural net age. Perhaps a bit of a sensitive spot for me personally, coming from a harmonic analysis group for my math PhD, but the short version basically goes like: Up until the 2010s or so, a huge portion of applied math was driven by results from "continuous math": functions mostly take values in some unbounded continuous vector space, they're infinite-dimensional objects supported on some continuous subset of R^n or C^n or whatever, and we reason about signal processing by proving results about existence, uniqueness, and optimality of solutions to certain optimization problems. The infinite-dimensional function spaces provide intellectual substance and challenge to the approach, while also being limited in applicability to circumstances amenable to the many assumptions one must typically make about a signal or sensing mechanism in order for the math model to apply.

This is all well and good, but it's a heavy price to pay for what is, essentially, an abstraction. There are no infinities, computed (not computable) functions are really just finite-dimensional vectors taking values in a bounded range, any relationships between domain elements are discrete.

In this circumstance, most of the traditional heavy-duty mathematical machinery for signal processing is irrelevant -- equation solutions trivially exist (or don't) and the actual meat of the problem is efficiently computing solutions (or approximate solutions). It's still quite technical and relies on advanced math, but a different sort from what is classically the "aesthetic" higher math approach. Famously, it also means far fewer proofs! At least as apply to real-world applications. The parameter spaces are so large, the optimization landscapes so complicated, that traditional methods don't offer much insight, though people continue to work on it. So now we're just concerned with entirely different problems requiring different tools.

Without any further context, that's what I would assume your colleague was referring to, as this is a well-understood mathematical dichotomy.


I made the continuous/discrete distinction more in order to take a jab at people who don't know measure theory and therefore think these approaches can't be unified. (Though I do know, for the record, that in some cases, like the ones you mention, there is no overarching, unifying theory.)

Other than that I agree with you with everything up to the point where you say "Without any further context...".

The dichotomy that you describe needs graduate-level mathematics to be properly understood in the first place. I'm not sure why (luck?), but it seems you are biased by being surrounded by people who are competent at ML and math. I guarantee you that is not the norm. If you review for any of the big conferences (e.g. NeurIPS) you will see that really fast. :(


I don't think the undergrad math and statistics are really the problem. I think even if people understood the math, there would be some who questioned whether there's some epiphenomenon arising from the math, because there are more fundamental issues. Students in philosophy of mind classes were confronting these issues long before the DL boom started, with the Chinese Room argument. I.e. even if we accept as a premise that there's a pre-described and deterministic set of steps that gives rise to seemingly intelligent behavior, we can be divided over what properties to ascribe to that system. Does it "understand"? Is it "thinking"? Is it a "person"? Why?

- we've never settled on a definition of consciousness

- we don't understand how biological brains create qualia

- we're not entirely in agreement over when human biological brains stop having ethical standing

- we disagree about how necessary embodiment or the material world outside the brain is to "mind"

- we disagree on what ethical standing to give to non-human animals with biological brains

Because we know so little, perhaps the only people that can believe strongly that current ML models are conscious have embraced panpsychism. But also because we know so little, including _what consciousness is_, I also don't think one can be confident that no ML model today has any consciousness whatsoever.

Recently I was re-reading some Douglas Hofstadter and in Metamagical Themas he has a dialogue about the "Careenium" in which on different (time)scales a physical system of balls and barriers appears to have qualitatively different kinds of dynamics. Hofstadter earnestly wants to use this to explain how "I" can direct or constrain in a top-down way some of the physical mechanisms that make me up while at the same time in a bottom-up way they comprise and animate me. I don't know if it was intentional, but it seems awfully similar to a beer-can apparatus that Searle somewhat dismissively uses as example of a potential computing substrate which could surely never support a "mind" no matter how it is programmed.

Two academics, both well respected, both who spent decades thinking about the mind and how it comes to be and how we should think about it, and both used an inscrutable complex physical (non-biological) apparatus to make points which are diametrically opposed about how the mind arises from more or less specific physical conditions.

I have in mind a property fwomp which I can't clearly define. But I have fwomp, and I'm pretty sure you have fwomp. I don't know where fwomp comes from, how it works or what are the necessary conditions for its persistence. From the preceding sentences, can you know with certainty that U-Nets don't have fwomp?


Well, you could write down a model for the full interaction between synapses and probably endow the whole thing with an RL-like environment so that you can model the interactions. The solutions of the model would then represent pretty exactly what a conscious being would do.

But it is hopeless to solve such a model, already writing it down would be a challenge.

There is some work in this direction by Eliasmith, trying to find a meaningful model.


I think we're missing each other.

Whether you can write it down as a giant complex pile of simpler operations does not cut to the core issue: is consciousness a property of a computation which is present or not regardless of the physical substrate on which the computation happens, is consciousness a property of certain physical systems which isn't present even in extremely faithful computer simulations of those systems, or is consciousness something else? (And how does ethical consideration of a system depend on consciousness?)

A computer simulation of weather doesn't have the actual wetness of rain, no matter how detailed the simulation is.

Addition can be present in a 4th-grader doing arithmetic, an abacus, an analogue computer or an electronic computer, even if they have different limitations on the size of numbers they can handle, at what speed, etc.

I'm going to ignore the 'full interaction between synapses' part b/c we don't know what kind of detail is needed for a computer model to capture the behavior of a conscious being. But the point is -- if you _had_ such a model, and could run it forward, people would still be divided as to whether it was actually conscious or not, because the basic properties of consciousness are not settled.


Ok, I see the point you wanted to make.

You ask "is consciousness a property of certain physical systems which isn't present even in extremely faithful computer simulations of those systems, or consciousness something else?" and later say "if you _had_ such a model, and could run it forward, people would still be divided as to whether it was actually conscious or not".

It seems to me that you thereby answer your first question with "no", as it seems that for you, by the last sentence, consciousness is a metaphysical property...

Which I would be fine with: I think we will only know more once we have an entity that looks and acts humanly (so there is reason to suppose consciousness might exist in it) and then try to understand its formal, inner workings and compare it to the human brain. No matter the outcome, it will be very interesting.


> It seems to me that you thereby answer your first question with "no", as it seems that for you, by the last sentence, consciousness is a metaphysical property...

That is absolutely not what I said. From my first post in this conversation I said "we've never settled on a definition of consciousness". In my preceding post, in describing a hypothetical, I said "people would still be divided as to whether it was actually conscious or not, _because the basic properties of consciousness are not settled_."

"People will disagree about X" does not in anyway amount to "I believe that X is a metaphysical property."

The absence of a clear definition means that people will continue to disagree regardless of the evidence about whichever physical system that you examine, because they're bringing different conceptions of consciousness to the table. If you know in advance that no evidence will be able to answer the question as currently framed, there's no need to wait for it; we must refine the question first.


>At what point do we cross a threshold here and encounter a non-trivial number of people who argue that a sufficiently large model actually is a brain, and therefore deserves rights?

There are two ways of resolving this conundrum. You either give rights to computer programs, or take away the rights of people. Which option sounds more likely to you?


At its heart, ethics isn't a cognitive process but is based on an emotional one.

We care about domesticated animals because of Oxytocin(OT), the love/connection hormone:

" Recent reports have indicated the possible contribution of OT to the formation of a social bond between domesticated mammals (dog, sheep, cattle) and humans."[1]

And sure, we'll probably create at some point an artificial creature that will release oxytocin in humans. And it's an interesting question how to design such a machine.

But most AIs? Most likely, we won't feel any strong connection with them.

[1]https://link.springer.com/article/10.1134/S2079059717030042


Mmm, perhaps. I would be very wary of any study of the form "possible contribution of molecule X to high-level social process Y"; that sounds very far into the junk-science danger zone.

Dogs came from wolves, originally. Nothing cute or love inducing about those. They were domesticated because they were useful and the warm feelings came later.

Sheep, cattle. We raise them for wool, milk and food. It's a purely transactional form of dominance. Very few people would profess love for a cow or a sheep outside of maybe a children's petting zoo.


They're called vtubers.


Feeling creatures deserve rights, since they can suffer. Intelligence is at best indirectly related, if at all.


I think feeling is being approximated in reinforcement learning by assigning values to states; it's an important part of selecting the next action. Emotion is related to the expected reward from the current state. So an RL agent implemented with neural nets would deserve rights because it has a concept of what is good and bad vs. its goals.
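
To be concrete about what "assigning values to states" means, here is a toy TD(0) sketch on a made-up chain of states (all parameters arbitrary); the value estimate V(s) is the "how good does this state feel" number in that analogy.

    # Toy TD(0) value learning on a 5-state chain drifting toward a rewarded end state.
    import numpy as np

    n_states, alpha, gamma = 5, 0.1, 0.9
    V = np.zeros(n_states)                      # value estimate per state
    rng = np.random.default_rng(0)

    for _ in range(5000):
        s = rng.integers(n_states)
        s_next = min(s + 1, n_states - 1)       # drift toward the last state
        r = 1.0 if s_next == n_states - 1 else 0.0
        # TD(0): move V(s) toward the bootstrapped target r + gamma * V(s')
        V[s] += alpha * (r + gamma * V[s_next] - V[s])

    print(np.round(V, 2))                       # states nearer the reward end up valued higher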


The first thing I thought of after reading this was the "mind palace" technique used since antiquity for memorizing things: https://en.m.wikipedia.org/wiki/Method_of_loci


Next headline of course will be that our brains are now mimicking parts of Transformers.

"This is your brain on Stable Diffusion"


The Transformers: more than meets the eye


Nonsense, as always from AI research. Not even single neurons are properly understood and hardly possible to simulate. Don't even get me started about all this marketing bullshit.



