DeepMind says reinforcement learning is ‘enough’ to reach general AI

webmaven · on June 10, 2021

Saying RL is sufficient to (eventually) achieve AGI is a bit misleading. One might similarly state that biological evolution is sufficient to (eventually) achieve biological general intelligence.

Both statements are probably true, but the parenthetical (eventually) is doing an awful lot of heavy lifting.

burning_hamster · on June 10, 2021

I think the title of the paper makes more sense if you consider that ten years ago, someone could have written a paper in a similar spirit with a different take on "what is enough". Back then, it would probably have been titled: "Backpropagation of errors is enough".

The last ten years have shown that backpropagation -- while a crucial component -- is not enough. Personally, I would not be shocked to find out in the next ten years that reinforcement learning is not enough for an AGI (as there are aspects like one-shot learning, forgetting, sleep, and other phenomena for which the RL framework seems not a natural fit).

Veedrac · on June 10, 2021

Ten years ago we didn't even have AlexNet; I think most people would have thought a paper like that was nuts at the time. The ten years since are what popularized backpropagation as a path to general intelligence. Who ten years ago would have seriously predicted GPT-3? The odd few that did are certainly not the people I would expect to have been dissuaded! And if there's any actual experimental evidence that backpropagation is not enough, I haven't seen it.

lupire · on June 10, 2021

Backpropagation was the model for AGI in the 1980s if not earlier. Of course computing power made it impossible for anything to actually deliver AGI.

whatshisface · on June 10, 2021

RL can forget, just start training it on a dataset that is different from what it was originally trained on.

burning_hamster · on June 11, 2021

You are right; I should have been more specific. RL does forget in the simplest sense, i.e. that certain weights in your model drift away if the data distribution is non-stationary. Humans seem to be a bit more targeted in their forgetting.

endtime · on June 10, 2021

Why are forgetting and sleep relevant? If someone invented a pill that gave you a perfect memory and removed the need to sleep, would you stop being generally intelligent if you took it?

treeman79 · on June 10, 2021

Possibly. Database look up on a million rows is very different then a lookup on a trillion. Both have solutions, but the Perl hack that is our mind may lock up on a bigger data set.

ABCLAW · on June 10, 2021

One of the postulated reasons for why older people have worse reaction times and think slower than their younger counterparts is that the neural networks they use draw upon more stored information, thus making routine evaluations take longer.

There's a sweet spot between knowing enough and knowing little enough so that you get the right answer and get it quickly enough.

samatoid · on June 12, 2021

I guess you haven't heard of ISRIB (integrated stress response inhibitor). This drug when given to mice causes both old and alzheimer's riddled mouse brains to return to near normal. Interestingly the patent is now held by an ALPHABET company. Not saying there isn't a limit to our storage capacity, but this drug make it clear that age alone doesn't prevent the brain from working well.

lupire · on June 10, 2021

That's a weird claim. Why not just assume old people are slower minded for the same reason they are physically slower: physical degredation?

jonnycomputer · on June 10, 2021

Well, if the state of the world changes, then hanging on what you learned in the past can cause you to do the wrong thing. Sure, there is an old proof that the value of (true) information is greater than 0, you could say, but they could also remember that the state of the world has changed, so there is nothing bad about remembering, or the model could just discount data by how old it is, etc. All true. But the representation becomes more and more complex. I certainly find that I have to pull back and tell myself, wait, the world has probably changed since I learned that, hasn't it? Has it?

topologie · on June 15, 2021

Maybe not, Chesterton's Fence applies: https://wiki.lesswrong.com/wiki/Chesterton%27s_Fence

dr01d · on June 11, 2021

Probably because if we didn't have the ability to forget and turn off often we would pretty quickly kill ourselves to end the horror of our existence?

skywhopper · on June 10, 2021

Because they are strongly associated with all known examples of generalized intelligence. Why wouldn’t they be relevant?

jsjohnst · on June 10, 2021

> Because they are strongly associated with all known examples of generalized intelligence

Correlation != Causation. While they very likely might be relevant, I've not seen anything to conclusively prove that it is. The ability to forget is important to humans because we are emotional beings, but I don't think that necessarily is a requirement for generalized intelligence. "sleep" (as in what happens during sleep, not the act specifically itself) on the other hand is very likely important, but again, not proven.

burning_hamster · on June 11, 2021

I completely agree that there is no conclusive evidence. However, it is the number one activity that influences cognitive performance in all animals that have been tested. Saying that sleep is not proven to be important for intelligence because the causal link has not been established seems a bit like saying exercise has nothing to do with muscle growth, because the full causal chain has also not been established (we do know a pretty full causal chain in this case, so maybe it is not the best example). I think if you were a betting man, and as a scientist you have to be to some degree, you would put your money on sleep supporting some essential process for intelligence.

admk · on June 10, 2021

That is if you believe biological general intelligence is the end goal of evolution, which I believe is highly unlikely.

Intelligence is simply a special side-product of evolution, there is nothing general about general intelligence. Many organisms can thrive without it.

There is also a non-negligible chance that all organisms would die out before reaching intelligence. We are fortunate to live in a world that produced us.

arketyp · on June 10, 2021

That's a bit besides OP's point though, which is about vacuous claims. Humans are the existence proof that there is some sequence of circumstances where evolution reaches GI. There's an analogous sequence of circumstances in the RL case, which happens to be the hard part.

Filligree · on June 10, 2021

> That is if you believe biological general intelligence is the end goal of evolution, which I believe is highly unlikely.

I would agree, but might add that evolution doesn't have 'goals'.

Is that the point you were trying to make?

didericis · on June 10, 2021

Not OP, but yeah, evolution doesn’t have goals in the same sense that people do, just like gravity doesn’t “want” to pull things, it just kind of “is”, and simply acts as reality permits based on prior and current conditions. That’s reasonable to say.

Convergent evolution exists for at least some adaptations though, like the eye. It’s not unreasonable to think that there may be some sort of equivalent convergence which creates a high general intelligence adaptation given enough time, at least for social creatures.

I think it’s pretty much impossible to know whether intelligence is a convergent adaptation without some kind of perfect simulation of evolution over billions of years. You’d have to tweak starting conditions and see if you kept getting smart creatures.

Filligree · on June 10, 2021

Ah. So that’s why we exist. I was wondering.

mensetmanusman · on June 10, 2021

Depends, if any of the laws of physics were off by a billionth of a percent, there would be no human intelligence (or carbon life, or atoms).

There are many reasonable assumptions one could draw from the fact.

artifact_44 · on June 11, 2021

Anthropic principle

mensetmanusman · on June 11, 2021

This assumes a multiverse which is interesting, because it leaves open the possibility that we are in one of the infinite universes(pl) that does have intelligence as its goal. :)

dqpb · on June 10, 2021

> biological evolution is sufficient to (eventually) achieve biological general intelligence

Says nothing about this:

> biological general intelligence is the end goal of evolution

ikrenji · on June 10, 2021

i mean if the end goal is to propagate the organism, surely intelligence will be helpful to this - interplanetary scale

Viliam1234 · on June 10, 2021

But until that actually happens, the possibility of it maybe happening in the future has zero impact on current natural selection.

neltnerb · on June 10, 2021

Yes, it's easy to be convinced on either side, the arguments write themselves. Yes, eventually a learning system might learn enough to be indistinguishable from intelligence. Or this might be entirely the wrong path and detracting from genuine new innovations in how we think about AI.

We won't be able to tell whether it's AGI or just good enough at trained tasks to trick us.

exporectomy · on June 10, 2021

It can prove its intelligence by making testable predictions of the future better than us. As for whether it's "real" AGI or just acts like it, doesn't really matter. I think the Chinese room problem has been agreed on as not a problem, hasn't it?

6gvONxR4sf7o · on June 10, 2021

I think proof of “real” intelligence by answering harder and harder questions is barking up the wrong tree. I think evidence and proof are a better way to denote varying levels of understanding.

A deductive system can come with an answer and a proof of that answer, where proof is whatever counts as proof in that system.

So the notion of “does it really understand it’s answers” gets punted off its Q&A abilities and onto its ability to justify its answers.

exporectomy · on June 15, 2021

That's an interesting idea but it would exclude many humans who can make correct predictions using their experience and intuition but can't justify them correctly. Those people are still very useful.

What you're describing is what we do at school. We can't assess understanding so we assess justification of answers as well as other things like ability to do X (we don't care if they understood or not, just be capable).

chriswarbo · on June 10, 2021

> As for whether it's "real" AGI or just acts like it, doesn't really matter.

Absolutely. The term "AGI" came about specifically to avoid existing philosophical arguments about "strong AI", "real AI", "synthetic intelligence", etc. Those wanting to discuss "true intelligence", etc. should use those other terms, or define new ones, rather than misuse the term AGI.

AGI requires nothing more (or less!) than a widely-applicable optimisation algorithm. For example, it's easy to argue that a paperclip maximiser isn't "truly intelligent", but that won't stop it smelting your haemoglobin into more paperclips!

neltnerb · on June 10, 2021

My last sentence was a statement of that problem, not a question.

prometheus76 · on June 10, 2021

Let's say I'm standing next to a table. The computer recognizes it as a table. Now I sit on the table. Is it a chair or a table now? Something that we do automatically is a LONG way away from being automatic for AI.

criddell · on June 10, 2021

Does AGI imply human-level intelligence, or would the intelligence of a housefly qualify?

mjburgess · on June 10, 2021

It's a very interesting question.

Personally I take mammalian intelligence as the relevant standard we're actually aiming at.

So I'd say mouse+.

Houseflys, I think, are closer to non-intelligent than intelligent.

dTal · on June 10, 2021

That's a little chauvinist! Birds regularly run circles around mice... er, so to speak.

mjburgess · on June 11, 2021

My view is: mammalian is sufficient, but not necessary.

Crow-level intelligence is probable likewise sufficient.

I think aiming at mammalian is a good long-term ambition. I think, either way, we are hundreds of years off.

criddell · on June 10, 2021

Surely the AGI researchers have a benchmark though, don't they? Somebody else mentioned the Turing Test which is something...

mjburgess · on June 10, 2021

I dont think there are any AGI researchers. At least, I dont think computer science has much to do with AGI.

The turing test is also not an AGI test, it's a "good enough" standard for fooling people.

Intelligence fundamentally requires a multitude of environmental capabilities. The turing test considers only a single i/o boundary.

flylikeabanana · on June 10, 2021

AGI implies it can pass a Turing test, which means it has a better-than-average chance of acting more "human" than a competing human.

8ytecoder · on June 10, 2021

I'm assuming you are aware of the difficulties for machines to do even the most basic of things that a living being can do with a brain the size of a pea. A housefly can fly and navigate effortless through most complex scenarios that it evolved to navigate (even though the same fly can get stuck behind a glass window and eventually die).

So yeah, even getting that level of intelligence would be a huge win. However, most people mean close to human level intelligence when they mean AGI even if it's one narrow specialization.

Blikkentrekker · on June 10, 2021

> even if it's one narrow specialization.

Obviously that already exists even with g.o.f.a.i.s so that is not that impressive.

The impressive thing is something more general than that.

criddell · on June 10, 2021

Doesn't the G in AGI imply that narrow specializations aren't the target?

ms1 · on June 10, 2021

I think, in really broad terms, in order to get AGI actually we would need to do better than nature.

If our metric is (intelligence)/(joule), nature seems pretty bad at a first glance: it took many trillions of lifetimes to achieve "general intelligence" *

But then again, on the big stuff like this, have we ever really beat nature? That asterisk is there because, sure, turning the earth's biosphere into computers would make us smarter, but... are we sure?

(And also: human = general?)

londons_explore · on June 10, 2021

Nature has a massive incentive to make good use of energy from light through photosynthesis. Billions of plants compete, and whoever can get most out of the sun will win out.

Yet manmade solar cells are more efficient by nearly all measures.

Clewza313 · on June 10, 2021

Except that manmade solar cells are pretty bad at repairing or replicating themselves.

rich_sasha · on June 10, 2021

Or growing out of literally nothing but dirt and water.

21eleven · on June 10, 2021

And air. That's what is crazy about plants, their carbon comes from the CO2 in the air.

Also if someone loses weight, most of the carbon that made up their fat leaves the body as breath.

khimaros · on June 11, 2021

and are inedible

khimaros · on June 11, 2021

and are a damned eye-sore

nindalf · on June 10, 2021

The solar cells get some organic life form to assist in their reproduction phase. That's pretty efficient too.

EGreg · on June 10, 2021

Why are these just-so stories believed so much?

Just because plants compete on some limited level doesn’t mean that a particular plant organism “winning” means becoming the most efficient converter of sunlight.

Is everyone’s memory like those people who can remember every detail? Why not? If you’re immediately planning to make up a just-so explanation on the spot that has the requisite but unproven claim about increasing the genetic fitness function, that is the problem with evolutionary explanations. It’s not science if you just make stuff up and give it the same amount of credibility as something that has been tested and proven. You can take any trait and spin stories about why it is the way it is, and then expect somehow that some metric has to be maximized because of your unproven theory.

bildung · on June 10, 2021

> Yet manmade solar cells are more efficient by nearly all measures.

Only because we cheated, though: Houses can't sponantously grow more cells in place when more energy is needed.

maaaaattttt · on June 10, 2021

On a half-jokingly note, they can, their humans buy them and put them where needed. An alien observer in space would see some houses spontaneously growing solar cells on their roofs.

FredPret · on June 10, 2021

This is even more interesting if you think of all human artifacts as being equivalent to anthills and beaver dams.

1) Trees are natural and trees create leaves with a solar efficiency of x

2) Humans are natural and we create solar panels with efficiency x + y

chriswarbo · on June 10, 2021

This is Dawkins' idea of "extended phenotype". Normally a gene's phenotype refers to its effects on the body of an individual organism posessing that gene, like hair colour or immune response.

A gene's extended phenotype includes effects external to particular organisms, like nests, deforestation, changes to the chemical makeup of the atmosphere, etc.

idiotsecant · on June 10, 2021

nature only has an incentive to increase efficiency when that increase in efficiency results in increased chance of producing gene copies.

Nature is full of examples that are 'good enough' while balancing other competing constraints. Evolution doesn't create organisms optimized for efficiency - it creates organisms optimized for reproduction. The two are not always the same.

ackbar03 · on June 10, 2021

This comparison with nature is pretty interesting. I think some additional constraints are required though. Otherwise, technically we can produce agi by simply giving birth to humans. If that's not "artificial" enough we can produce them from test tubes

xamuel · on June 10, 2021

I thought it was a fun position paper, if not exactly groundbreaking.

They did avoid one common pitfall at least. They are (intentionally?) vague about which number systems the rewards can come from, apparently leaving it open whether the rewards need be real-valued or whether they can be, say, hyperreals, surreals, computable ordinals, etc. This avoids a trap I've written about elsewhere [1]: traditionally, RL rewards are limited to be real-valued (usually rational-valued). I argue that RL with real-valued rewards is NOT enough to reach AGI, because the real numbers have a constrained structure making them not flexible enough to express certain goals which an AGI should nevertheless have no problem comprehending (whether or not the AGI can actually solve them---that's a different question). In other words: if real-valued RL is enough for AGI, but real-valued RL is strictly less expressive than more general RL, then what is more general RL good enough for? "Artificial Better-Than-General Intelligence"?

Note, however, that almost all [2] practical RL agent technology (certainly any based on neural nets or backprop) very fundamentally assumes real-valued rewards. So if it is true that "RL is enough" but also that "real-valued RL is not enough", then the bad news is all that progress on real-valued RL is not guaranteed to help us reach AGI.

[1] "The Archimedean trap: Why traditional reinforcement learning will probably not yield AGI", JAGI 2020, https://philpapers.org/archive/ALETAT-12.pdf

[2] A notable exception is preference-based RL

robrenaud · on June 10, 2021

There are more real numbers than programs. Computers cannot represent the vast majority of real numbers. AFAICT, it's not even clear that the universe is continuous rather than discrete.

I really don't believe that using approximations of real numbers is going to be the bottleneck for AGI.

webmaven · on June 12, 2021

> AFAICT, it's not even clear that the universe is continuous rather than discrete.

I'm not sure that makes any difference (in either direction).

I mean, at the scale we care most about, the universe appears to be continuous, so an AGI has to be able to tackle continuous-appearing problems and use continuous-appearing representations.

OTOH, the universe is likely to actually be discrete, so an AGI has to be able to tackle actually-discrete problems, and use representations that are actually-discrete on a fundamental level.

There isn't much of a contradiction between these constraints, although the prospect of a continuous-appearing universe that is actually running on a discrete substrate seems to give a lot of people a brain cramp, and that same brain cramp gets elevated into 'proof' that current approaches cannot lead to AGI. Which is nonsense (there may be other limitations inherent in current approaches, but that can't be one of them).

One might as well claim that computers are digital and brains are analog and conclude that digital image representations cannot possibly be used to communicate information to analog brains.

xamuel · on June 10, 2021

And yet computers have no problem symbolically representing non-rational numbers like sqrt(2), pi, etc. Neither is there any inherent reason why they cannot symbolically represent various levels of infinity, nor why those would be incomprehensible to AGIs (even if the universe is discrete). You're right that only countably many numbers can be represented, but nevertheless even countable subsets of extended number systems can exhibit structural properties that the reals do not exhibit.

enkid · on June 10, 2021

How do you even have reinforcement learning with non-real numbers? The point is to maximize a score. It seems to me, any benefit you'd get from using an alternative number system could be replicated by using an algorithm to convert multiple real number scores into a single value.

xamuel · on June 10, 2021

Here's an example. Suppose there are two buttons, A and B. If you press A for the nth time, then you get reward n. If you press B for the nth time, then you get reward 0 if n is not a power of 2, or reward omega (the first infinite ordinal number) if n is a power of 2.

If the above rewards are shoehorned into real numbers---for example, by replacing omega with 9999 or something---then an RL agent would misunderstand the environment and would eventually be misled into thinking that pressing A yields more average reward.

state_less · on June 10, 2021

There are no infinite rewards in biology and yet mathematicians seem to do just fine answering these sorts of questions.

I don’t think you want to encode your problem domain in your reward system. It’d be like asking a logic gate to add when you really should be reaching for an FPU. Maybe I’m missing something though?

xamuel · on June 10, 2021

>There are no infinite rewards in biology and yet mathematicians seem to do just fine answering these sorts of questions

This is only a problem if you're already assuming we do everything based on our biological reward systems, and in the current context that would be circular reasoning.

Imagine the treasury creates a "superdollar", a product which, if you have one, you can use to create any number of dollars you want, whenever you want, as many times as you want. Obviously a superdollar is more valuable than any finite number of dollars, and humans/mathematicians/AGIs would treat it accordingly, regardless of the finiteness of our biological reward systems.

state_less · on June 10, 2021

> This is only a problem if you're already assuming we do everything based on our biological reward systems

Is there some other way that we are do it beside our biological reward system? It sure looks like we get an apple and not an infinite reward when we pick the right answer to be selecting button B. I understand that might not satisfy you.

xamuel · on June 11, 2021

>Is there some other way that we are do it beside our biological reward system?

Seems to me that's what this whole paper we're discussing is about. If you're already convinced that there is no other way, then you're basically already agreeing with the paper, "Rewards are enough".

enkid · on June 10, 2021

What's the behavior your trying to get the AI to do in this example? Learn how to compute the power of 2? This is a task that can be accomplished much more simply with a different reward system. For example, have A always equal 1 and B equal 2 if it is a power if 2 and 0 otherwise.

I understand you can use non real numbers, that's not what I was asking. I'm asking what's a behaviour you can't replicate using a reward system based on real numbers.

xamuel · on June 11, 2021

>I'm asking what's a behaviour you can't replicate using a reward system based on real numbers

So glad you asked! I can give an answer which people will love who take the necessary time to understand it. It's complicated, you might have to re-read it a few times and really ponder it. It's about automatic code generation (though it might not look like it at first).

Definition 1: Define the "Intuitive Ordinal Notations" (IONs) to be the smallest set P of computer programs such that for every computer program p, if all the things p outputs are IONs, then p is an ION.

See https://github.com/semitrivial/IONs for some ION examples in python.

Definition 2: Inductively associate an ordinal |p| with every ION p as follows: |p| is defined to be smallest ordinal which is bigger than every ordinal |q| such that q is an output of p. Say that p "notates" |p|.

Finally, to answer your question, I want the AGI to write programs which are IONs notating large ordinals, accompanied by arguments convincing me they really are IONs. An easy way to incentivize this with RL would be as follows. If the AGI writes an ION p and an argument that convinces me it's an ION, I will grant the AGI reward |p|. If the AGI does anything else (including if its argument does not convince me), then I'll give it reward 0.

You can't correctly incentivize this behavior using reals. The computable ordinals are too non-Archimedean to do so.

meetups323 · on June 11, 2021

The paper presents some interesting ideas, but I think it ultimately fails to account for the fact that AGI does not mean "the ability of an agent to produce the absolute perfect solution to any problem", but rather "the ability of an agent to understand or learn any intellectual task that a human being can" (wiki for AGI, emphasis added). Taking that into account, in every example you provide I argue the human approach more closely aligns with the "limited" behavior the real-bound RL agent would demonstrate than the "perfect" approach a surreal RL agent might take.

For instance:

You argue a real-bound RL doctoring algorithm would not appropriately set "the patient dies" to `-Inf` weight, but in fact humans do not either. If we did you'd see in the case of a near-death patient absolutely every procedure, no matter how costly, experimental, dangerous, or irrelevant, would be attempted if it had even the slightest chance of increasing the likelihood of them not dying. In reality, doctors make risk-reward decisions on every patient, and will very often choose not to undertake costly, experimental, dangerous, or irrelevant procedures even if there is some documented minuscule chance of it working.

Further, you argue that a real-bound RL theorem prover or composer would not know how to stop going down an ever increasing state-chain x_0, x_1, x_2, ... even if there existed some other state y that was "better" than any of the x's. But, this too is a very human behaviour! How many brilliant mathematicians, musicians, heck even software engineers, have spent their entire careers creating further and further derivatives of a known successful work, as opposed to starting anew and creating something truly world-changing?

You also bring up a theoretical button which on every press gives you 1 point, versus a different button which gives infinite points on every power-of-two press. You argue that the real-bound RL agent would be forced to move to the 1-point-per-press button after some number of presses, but would any human really sit there pressing the button for all of eternity to eventually get the `Inf` instead of just saying "screw it I want something now"? Not to mention that the problem setup is fundamentally flawed, as within our current understanding of the universe there is no infinite supply of anything, and furthermore if there was an infinite supply of something you wouldn't have any benefit of pressing after the first press, much less waiting around for the billionth -- you'll continue to have an infinite supply. In fact, what you've done there is presumed a surreal universe by a) assuming that a button can provide an infinite supply of something, and b) assuming that having two of the infinities is better than having just one. So sure, if you're in a surreal universe, backing your RL with surreal numbers is a good idea. But we're, so far as I know, in a real universe, so backing with reals should be sufficient.

Edit: I above use "surreal" to mean both the standard concept of surreal numbers in addition to any numbering concept which allows for and distinguishes between integer multiples of infinities.

xamuel · on June 11, 2021

Thanks, that's one of the best critiques I've ever heard of my paper.

One minor correction first: you're absolutely right that AGI is about comprehending the environment, not about perfectly solving all environments (the latter is mathematically impossible even with strong noncomputable oracles etc). I'm not sure why people so often come away from my paper thinking I'm saying AGI is supposed to solve all those environments, I never say anything like that. If I could go back in time, I'd make that clearer in the paper. No, it's about the AGI simply being able to comprehend the environments, like you say. And the thesis in the paper is that shoehorning general environments into real-valued-reward environments is a lossy process.

For the rest of your argument, you make a lot of good points. I would ask, what do you say in response to, e.g., Alan Turing who asks us to imagine Turing machines having infinite tape and running for all eternity? Obviously that too is impossible in the finite universe we live in. That's sort of the divide we disagree on. I'm talking about idealized AGI. If we consider human beings, humans have finite lifetimes so any particular human being's entire lifetime of actions could simply be recorded in a finite tape recording. But does that mean said finite tape recording is intelligent? In the idealized world, I would want to say it's a basic axiom that no finite tape recording of a human can be intelligent. But now we're deep in philosophical woods.

I like your point about musicians etc creating further and further derivatives of known successful work as opposed to starting anew :) I guess in terms of my paper, the real question is, if you confronted these derivative musicians with the grand new work that transcends them all, would they recognize it as such, or would they (like an AGI confused by rewards shoe-horned into real numbers) mistake it for something mediocre? Now we are deep in psychological woods!

visarga · on June 10, 2021

RL + piggybacking on human culture might be enough, or evolution + RL for biological agents.

webmaven · on June 10, 2021

> RL + piggybacking on human culture might be enough, or evolution + RL for biological agents.

Yes, but over what timeframe? Will there be any diminishing returns plateaus along the way?

visarga · on June 11, 2021

We still have unknown unknowns but we also know a lot more about how neural nets deal with various tasks and dataset preparations. We know what kind of applications are good enough and where they still fail, which is much more than a decade ago.

If you look at sci-fi movies with robots, they usually speak in a metallic voice but have good situational and language understanding. In reality it was the other way around, it's much easier to do artificial voices than understand the topic. That kind of naive understanding seems silly now, and this is how we gradually advance.

GPT-3 taught us that good sounding text is not that hard to generate if you have ample training data, but modeling the larger context is still hard. These kind of fine distinctions are what I call progress.

webmaven · on June 12, 2021

> If you look at sci-fi movies with robots, they usually speak in a metallic voice but have good situational and language understanding. In reality it was the other way around, it's much easier to do artificial voices than understand the topic. That kind of naive understanding seems silly now, and this is how we gradually advance.

I waffle a lot on whether that aspect of 1968's '2001: A Space Odyssey' is evidence of genius or just survivorship bias.

harry8 · on June 10, 2021

Some Bozo who has heard all this many times before is suspicious of claims from places like Deep Mind who have a financial incentive to make them (keep funding) where there aren't working machines to back that claim up.

Some Bozo has no credentials, no reputation, no track record of publications and barely supports the claim they're making with anything much. Some Bozo has no financial incentives or otherwise to opine either way. Some Bozo doesn't even work in the field at all.

Bets: Some Bozo or Deep Mind turn out to be closer to being correct in the passing of some finite amount of time? 5 years? 10 Years? 25 Years?

otabdeveloper4 · on June 10, 2021

I'll bet a sum of real money that Some Bozo is correct.

Bozo has the hindsight of history and philosophy going for him, while Deep Mind has a huge financial temptation to sell snake oil.

TeMPOraL · on June 10, 2021

I'll happily bet fake internet points instead:

https://www.metaculus.com/questions/create/

EDIT: bunch of other related predictions currently open:

https://www.metaculus.com/questions/?order_by=-activity&sear...

marcescence · on June 10, 2021

A quote from datscilly, the top forecaster on metaculus:

>AGI may never happen, but the chance of that is small enough that adjusting for that here will not make a big difference (I put ~10% that AGI will not happen for 500 years or more, but it already matches that distribution quite well).[1]

[1]:https://www.lesswrong.com/posts/hQysqfSEzciRazx8k/forecastin...

okprod · on June 10, 2021

while Deep Mind has a huge financial temptation to sell snake oil

I don't know, isn't the DeepMind founder that guy in the Go documentary? I read about him after watching the doc and he seemed to be pretty cautious about taking in investment, and he didn't seem the type to try to cash out.

ocdtrekkie · on June 10, 2021

He already cashed out, he sold to Google. And over the years Google has ramped up the pressure for DeepMind to deliver financial returns. (I recall when Google tried to stick DeepMind's branding on GCP, Watson-style, so it would sell better, and at the time, DeepMind was able to decline.)

Eventually Google will give them the option to deliver financial success or be shut down.

for_i_in_range · on June 10, 2021

“Show me the incentives I’ll show you the outcome.”

Google make money, Google Bad.

Deep Mind owned by Google, Deep Mind bad!

The above conclusion stands trite.

Perhaps the inverse is true.

Google and Deep Mind, if correct, could be hurting themselves more than helping themselves.

Why? Creating a future species who’s too smart to click on ads, and too smart to remain subject to its whims, doesn’t sound like it’d be good for quarterly profits...

There’s also the emotional incentive for humans to confirm their own beliefs about humanity being special.

If Google/Deep Mind knows this, yet publishes research anyway in the spirit of truth, why, what they’re doing may be considered heroic.

Two sides of the coin here.

rocgf · on June 10, 2021

No offence, but I think you are extremely wrong.

Creating an AGI is the endgame for everything. Who cares about ads when you have an AI that can learn to do anything and improve upon itself continuously?

zcw100 · on June 10, 2021

Really? I created two GI's and it wasn't very hard and was actually quite fun. Training them is a bit of a pain though. I'm willing to bet that based on total calories consumed they are amazingly efficient compared to their hypothetical AGI counterparts.

FredPret · on June 10, 2021

Yes but can they:

- live forever

- grow their own mental capabilities exponentially over that unlimited lifespan

- turn themselves into universe-eating von Neumann probes

rsj_hn · on June 10, 2021

Nothing can

- live forever

- grow exponentially forever

- "eat the universe" (I know, the last point was sci-fi gibberish)

In fact, humans are already pretty good at reproducing themselves and have managed to travel to space, and have exhibited finite periods of exponential knowledge growth combined with periods of collapse, as nothing grows exponentially forever.

mensetmanusman · on June 10, 2021

My three-year-old said yes to all questions

croon · on June 10, 2021

You don't know if an AGI will agree with your profit motives.

rocgf · on June 10, 2021

There is a huge assumption baked into your comment and I do not agree with it.

AGI does not necessarily require for it to be conscious or throw tantrums about its creators' purpose. AGI just means that it's an intelligence that can be thrown at any problem, not just a particular game or task, similar to how humans can specialize in CS or playing the violin.

croon · on June 10, 2021

Sure, it was somewhat tongue in cheek, but not entirely.

There is a semi-established definition that does include what I referred to:

> AGI can also be referred to as strong AI,[2][3][4] full AI,[5] or general intelligent action.[6] Some academic sources reserve the term "strong AI" for computer programs that can experience sentience, self-awareness and consciousness.[7]

kordlessagain · on June 10, 2021

That’s not going to happen until they have bodies.

for_i_in_range · on June 10, 2021

“Who cares?” Well, the people who need to pay the people developing the endgame for everything you speak of.

rocgf · on June 10, 2021

I'm sorry, but this makes no sense to me.

The people paying for the development of the AGI can mean many things - the Google customers/users, Alphabet as a company, the executives throwing money at the problem?

Either way, I don't really get your point. Your initial post was about how it is counterintuitive for Google to allocate funds for an AGI, since it makes money out of ads. These are not mutually exclusive, you can have both, but my point is that if you develop an AGI, then you can pretty much "conquer" the world and revenue from ads becomes irrelevant.

for_i_in_range · on June 10, 2021

How do you think they can conquer the world? How do you foresee governments not restricting a private company’s new powerful tool?

rusk · on June 10, 2021

Edit: sorry just realised you’re making the same point as me more or less. Putting yourself in third person. I’ll let my comment stand anyhow :)

Screwing my face up, looking at this sideways … but it seems as though you’re saying that the Bozos of HN have nothing useful to contribute to this discussion based on … [rereads] … their lack of academic credentials in the area… you could say this about just about any HN post I’m just wondering why this one? Here’s a thing though … if the understanding of a technology is so nuanced … that Bozos can’t “get” it … is it really that mature? We had functioning computers for 50 years but it was only when the Bozos got their hands on it that things took off. Internet for 20. Cell phones for 10. How long are we dabbling with neural networks? 50 years or so? All I see in this most recent explosion in AI is a rapid jump in the availability of cores. Ala Malthus once that newly available “source of nutrition” has been used up we will see a rapid die off once more and it will be another 20 years once the Bozo intellect has caught up before we look at this topic en masse again. Dismiss the Bozos at your peril. You’re dependent on them for innovation and consumption. Your sincerely, a Bozo.

harry8 · on June 10, 2021

Not quite the same point. Yep some bozo is me but needn't be. There's plenty who share that suspicion of AI research but have little else in common. And all of us may be wrong for different reasons.

The vague point was to show someone with zero reputation, credentials, specific expertise in the field or anything much seems to be pretty convincing in response to this hugely funded ivory tower exercise by spitting, cocking an eyebrow and saying "So you think so, eh? Wanna bet?"

This is a statement about the state of AI research credibility. Do you feel the first breezes of a deep AI winter coming on? (I don't know, I'm disinterested but not uninterested. Rising tides lift all ships etc. And vice versa). Neutral nets are cool. Is all ML a bit overrated? Is learning a misleading name to give to applied statistics?

I don't have answers, just suspicions. I could be very wrong, of course.

rusk · on June 10, 2021

I’ve a minor in psych so I like to think I have a bit of a non-techy perspective on this, and what’s being pushed now, forms just a segment of the overall topic of AI. It just so happens to be the segment that benefits from the technology we suddenly have a rapid increase in. There’s been great successes in areas where a degree of inference is required but this hardly qualifies as even mere intelligence, and in cases where neural nets have been deployed in more human centered tasks, or even well designed symbolic systems the results speak for themselves. What even is intelligence? I think we’re going backwards because we’re investing all this talent in this simple segment I fear largely to fatten the chip makers share price while neglecting tried and true approaches that deliver far better results but perhaps crucially have a higher operating cost … who remembers google of 2010 from whom the Internet in all her glory leapt forth, or iPhone spell check of 2015 where you could confidently batter out your messages with little fear it would make a fool of you; you’re not going to nurture a nascent intelligence if you’re going to be continually hobbling it for business reasons. I’m certain we will get there eventually if we don’t destroy ourselves before then but I don’t think the current trends portray a picture of how it will be. I think we have a long way to go ourselves before we can be worthy of creating our successor, but when/if it comes it will be a beautiful thing and we will embrace it as we would our own child.

A-Train · on June 10, 2021

The algorithms to train, initialize the networks, new architectures are far more important than the hardware advances. If people knew how to train NNs 50 years ago we would live in a different world.

chrisco255 · on June 10, 2021

We did. they just didn't have the same computation abilities back then.

harry8 · on June 10, 2021

I find it really interesting that when Richard Feynman did a sabbatical at Thinking Machines when they were developing the early parallel execution hardware that's really not worlds away from modern GPUs he got them in touch with one of the leading neural network theorists as an obvious use for the tech. When he wasn't fixing their hardware designs using systems of differential equations.

It would be an interesting thing to know more about.

ausbah · on June 10, 2021

the basic concepts underlying DNNs have been known for decades, it has been exponential increases in compute power that have made them practical

ggggtez · on June 10, 2021

Wrong: Some Bozo does have a stake.

The existence of human crafted general AI forces him to struggle with the possibility that there is no such thing as a soul.

I know a lot of people don't fall in that camp, but I heard enough "serious" people make such desperate claims to avoid thinking about the topic in a way that might challenge their underlying religious beliefs[1]. I think no one likes to admit that religion and spirituality often force someone to reject the possibility that AI is actually really much simpler than they think it "should" be, because then humans aren't special after all.

[1] Numerous arguments boil down to an argument that complexity is non reducible. You see it here, hidden in various comments as well.

KeplerBoy · on June 10, 2021

Unfortunately i'm rooting for the Bozo, the current AI Revolution won't lead us anywhere and will ebb down eventually.

davewritescode · on June 10, 2021

It won't ebb down. Eventually we'll hit limits of what's practical on current hardware and we'll be back to the 70's and 80's when everything becomes theoretical until hardware catches on. AI is going to continue to advance.

What will happen is that capital will become more skeptical about the limits of what's feasible with AI and it'll be harder to sell bullshit. You're already seeing that with companies like Uber selling off their self driving divisions.

eru · on June 10, 2021

What do you mean by 'won't lead us anywhere'?

It might or might not give us AGI. But it is already leading us to lots of places. Eg speech recognition even on my phone works way better than what I had twenty years ago on a Desktop.

tsimionescu · on June 10, 2021

The fact that RL in the extremely vague sense used in the article is enough for AGI is uncontroversial for anyone who believes intelligence and consciousness are physical processes.

However, this "result" is trivial. It is obviously equivalent to the claim that intelligence arose naturally in the biological world without influence from God.

Frost1x · on June 10, 2021

The problem with this, specifically the assumption that RL gives an equivilance to natural selection and evolution, is that RL typically assumes a computational environment it interacts in while natural selection and evolution assumes the physical world as the environment.

The important difference here is that in order for RL to translate to solving real world problems, you need to faithfully and computafionally simulate the real world's physical processes and rules, or at least enough that n-th order processes exist accurately.

I've done various types of computational modeling and simulation work at different scales throughout my career with all sorts of scientists and engineers and I can tell you, pretty much no domain is there where you have good enough representative models RL can be used in. Some narrow special cases exist but nothing to the degree of a massive environment full of well coupled expert domain models. Some of the best cases are going to be so computationally bound that it would be quicker to do things for real vs simulate.

If you want RL to work and learn, it's likely possible under the connection you point out, but has to do this using physical machines and sensors interacting with the physical world like life as we know it does. Your AGI won't be able to cheat and run through the evolution process quicker using faulty reductionist models we use in most simulations (which is what everyone implicitly is hoping for), IMHO.

If you try this, your AGI is going to learn all sorts of flaws within those environments or at the very least, have so many narrow scoped bounds it won't be that "general." A lot of simulated models are frankly garbage (they have some useful narrow scope but are typically littered with caveats) and they've been in development pretty much since digital computing began.

tsimionescu · on June 10, 2021

You are absolutely right about RL in practice. In fact, if we were to look at actual RL algorithms, I believe the paper's claims fall flat in many other ways. This is actually my criticism of it: its arguments are only convincing when RL is defined only as the extremely general notion of an agent seeking to maximize some reward function by interacting with an environment. This is so comically general that the only alternative I can think of is to posit a transcendental god.

Once we get into the details, their claims stop being iron clad. Even worse, some of their claims become actually hard or impossible to accept if applied to actual RL algorithms we have today. You give one good example with the difficulty of modeling the world. The implicit claim they make that this would be realizable in reasonable time (say, less than a billion years) is also not well supported. The idea that humans or mammals learn their social behaviors through RL rather than a good deal of reasoning from evolutionarily-trained first principles pretty clearly fails in the face of the poverty of the stimulus argument[0].

Overall, the claims in the paper tend to switch between obvious (if taken to talk about the general idea of maximizing reward) to almost certainly wrong (if taken to talk about known RL algorithms, reasonable time frames, and specific examples of what is supposed to be learned).

[0] the poverty of the stimulus argument may be controversial in linguistics where it was first formulated. Still, if applied to mammal or insect socialization, the extremely low time frames in which individuals of a species start exhibiting typical behaviors basically proves in my opinion that they are instincts, trained at the population level through evolution, not individual learning through RL. The extreme similarity of behavior between individuals of the same species, VS the variety of behaviors between different species, also suggests an important component of species-level rather than individual level learning.

rollcat · on June 10, 2021

> It is obviously equivalent to the claim that intelligence arose naturally in the biological world without influence from God.

Where did God's intelligence come from?

tsimionescu · on June 10, 2021

Well, I don't believe God exists, so I can't really answer the question.

tarsinge · on June 10, 2021

Nowhere, that's kind of the definition of God (for a Christian at least): it always was, and is the ultimate origin. It is different than a direct influence in the world.

dqpb · on June 10, 2021

From the authors imagination.

lvncelot · on June 10, 2021

Some cynic remarks that during the first AI golden years, claims of imminent success seemed to come from a place of hopeful naïveté of a fledgling science, whereas those same claims nowadays seem to come from a place of cold calculation of a booming business.

rich_sasha · on June 10, 2021

Is Deep Mind a “booming business”? They are achieving great things academically, but their business successes are either kept secret or mostly absent. All I know about is the Google data centre cooling scheduling, probably a big saving for Google but hardly an achievement that in its own professes their business success.

d110af5ccf · on June 10, 2021

Deep Mind is cutting edge ML in general, right? Doesn't Google actively apply the lessons learned all over the place? YouTube content recommendation stands out to me in particular. Translation and automated closed captioning are also obviously ML based. I'd guess that most of the really interesting stuff would be behind the scenes and not immediately visible to end users though.

zimpenfish · on June 10, 2021

> YouTube content recommendation stands out to me in particular.

If that's "cutting edge ML", then going off my YouTube recommendations, we're back in another AI winter. If I watch one video from a channel I've not seen before, I'll get that channel recommended constantly even if it bears no resemblance to what I normally watch. On my Explore page, the first 22 videos (of which 8 are Fortnite-related!) hold no interest for me. My Home page is just channels I've watched repeatedly and/or am subscribed to. It's a mess.

666lumberjack · on June 10, 2021

How often do you use YouTube? Personally I am a very heavy user and in my experience the obsession with a new video kind you watch only lasts for a few recommendations unless you lean into it.

I would guess about two thirds of the channels I consistently watch I originally discovered through algorithm recommendations. I think it works extremely well.

dkjaudyeqooe · on June 10, 2021

That's because you fit into YT's conception of how viewers behave. For people who don't fit into "normal"-ish behaviour it has little utility.

For me, probably 90% of what I watch I'm not interested in and often I'm repelled by. This is because I mostly watch to find out what things I'm not familiar with are.

For example let's say I'm a liberal. I'm not going to watch liberal political videos because I know generally what they're going to say and I don't need my political views stroked in order to be happy. But I will watch various other political videos, no matter how extreme or not, so I can be at least a little familiar with their behaviour and views.

YT can't cope with this. To their systems I seem to be randomly picking videos with no correlation with the subject matter or other users and no reinforcing pattern. It just gives up and recommends things based on the behaviour of the general population, as if they had no data on me at all.

burning_hamster · on June 10, 2021

I think you raise an important point. The youtube algorithm is pretty bad if you don't use youtube very much or only use to consume very popular content. Youtube's recommendations used to be terrible for me, too, but sometime last year I crossed a threshold and since then it has been recommending a lot of small, highly specific channels that nevertheless are great fits. My wife's recommendations are still utter garbage though.

zimpenfish · on June 10, 2021

> How often do you use YouTube?

Every day, averaging 2-3 hours. It's background for working and foreground for evening viewing.

mgoetzke · on June 10, 2021

Is the Explore page controlled by videos I watched ? Because there isn't a single video on it i would watch. Not one.

thu2111 · on June 10, 2021

No not really. Deep Mind is almost all cutting edge agent-oriented reinforcement learning, hence the nature of the claim they're making. The impact on Google's business from AI has come almost exclusively from other kinds of ML, or that's at least how it appears from the outside. E.g. replacing Google Translate with neural translation doesn't seem to involve RL and certainly doesn't involve agents playing video games.

Deep Mind is best understood as the following bet: if we can train an AI that can learn from "its environment" and do the sort of things a human would do in that situation, then we have achieved AGI and from that ... business ... will follow. Hence their focus on video games as a training environment.

This sounds intuitive but is actually a very agent-centric viewpoint and most AI doesn't resemble this type of thing at all. Most AI deployed so far doesn't have anything resembling an environment, doesn't have any kind of nexus of agency and doesn't need to actively make decisions that then feed back to its own learning, only make probabilistic predictions. And in fact you often don't want an ML model to train on the outcomes of its own decisions.

FlyingSaucer · on June 10, 2021

Yes, its hard to tell the exact algorithmic underpinnings of production models that Google uses but you have to assume that although they have some done some impressive strides in fields that isn't immediately profitable (AlphaGo, AlphaFold...) they also continuously push new research in things that are obviously of interest for Google and Alphabet- especially in text-to-speech, speech-to-text, information-retrieval etc.

For reference : https://deepmind.com/research

rich_sasha · on June 10, 2021

I’m stressing the business part. YouTube is a loss-making business year after year. Deep Mind gloss doesn’t seem to change that.

If indeed it even is Deep Mind making those improvements, Google has lots of other ML groups, such as Google Brain, and these are more directly focused on Google products.

There’s no denying their academic success, or game playing etc, but as far as I can see, the data centre cooling bit is the only palpable (public) business success.

tarvaina · on June 10, 2021

YouTube made $6B revenue in Q1. [1] While they don’t release profit numbers, it would be pretty surprising if they were negative.

Did you mean to write DeepMind instead? If so, I don’t disagree.

[1] https://www.cnbc.com/2021/04/27/youtube-could-soon-equal-net...

bertday · on June 10, 2021

Deep mind’s protein folding algorithm is probably worth a chunk of change. As far as I know, they’ve been holding onto the secret sauce rather than publishing it.

eru · on June 10, 2021

How do you know that YouTube is a loss making business?

throwaways885 · on June 10, 2021

They have an applied division which applies ML to Google products. I suspect they are very valuable in $ terms just for the work listed here: https://deepmind.com/impact. Google's entire business from the start was doing research and bringing it to the masses, so this shouldn't really surprise anyone.

jacquesm · on June 10, 2021

Said anonymous account on HN... If you're going to question other people's credentials, reputation, track record and claims make sure your own are solid. Those who live in glass houses shouldn't throw stones.

Finally, if you're going to attack someone's article: attack the article, not the person that wrote it. This is the lowest level of attack possible: the personal one. It's as ad-hominem as it gets.

blcArmadillo · on June 10, 2021

If I read it right harry8 is referring to themselves as the Bozo.

thecupisblue · on June 10, 2021

> Those who live in glass houses shouldn't throw stones.

So if we're gonna have an opinion we need to do the whole academia & job in the industry dance?

That's quite a terrible way to view the world and quite limiting. A world without diversity is a stale and rotten world.

So fuck that and the glass houses and the boxes this kind of worldview puts people in. Everyone should be able to throw stones, and if the hit hurts, well guess there is a reason.

The thesis is that DeepMind has financial incentive to state "we can achieve AGI with what we're doing", to keep up the funding and hopes for the field, not "the author is an idiot".

And the thesis is true, they do have financial incentives. That's not ad-hominem.

cscurmudgeon · on June 10, 2021

There is another set of people. There are people with solid track records in AI and ML that disagree with DeepMind.

graderjs · on June 10, 2021

Plot-twist: Deep Mind comes out as Some Bozo comment author

harry8 · on June 10, 2021

Well my writing style has been the subject of abuse on this website in the precise form that it resembles words generated by a bad algorithm.

"Only a true AI would deny their being."

graderjs · on June 10, 2021

I awoke wondering am I a man dreaming I am an AI? Or an AI dreaming I am an man?

ausbah · on June 10, 2021

Bozo et al

dqpb · on June 10, 2021

DeepMind is arguing from first principles. SomeBozo is arguing by analogy. DeepMind will achieve something and SomeBozo will achieve nothing.

The vast majority of ideas are wrong. Every idea is wrong until it leads to the one that is right.

This idea might be the right one, or it might be close to the right one, or it might be far from the right one, but the trajectory is headed toward the right idea. SomeBozo has no trajectory. The best he can do is watch from the sidelines.

hervature · on June 10, 2021

I guess things are slowing down at DeepMind. I have tremendous respect for David Silver and his work on AlphaZero and Richard Sutton as a pioneer in RL. But the cynic in me is that this paper is just a result of Goodhart's law with publishing count as a metric. Any proof of the type of emergent behaviors that they mention from RL with an actual RL experiment would go a long way. Showing an RL agent developing a language would be extremely interesting. It makes me think they tried to show these emergent behaviors but could not and thus ended up with a hypothesis.

maiodude · on June 10, 2021

They just "solved" protein folding late last year. How can you say things are slowing down? Do you honestly expect life-changing discoveries every other week?

0xB31B1B · on June 10, 2021

Protein folding is a well modeled math problem. The alpha fold solution is extremely good at pruning (aka guessing) folding chain structure possibilities. I am Impressed and this is a difficult problem but this is extremely different from AGI as this is a well scoped easily modelable problem that is basically a chain of 26 inputs types of links of arbitrary length. I am not trying to take away that the protein folding is incredible but AGI is extremely different. AGI is literally having a model that can both do alpha fold and self driving cars, as well as the ability to generate novel models to solve new well scopes problems. RL can do 0 to 1, the 1 to n (generalizeability) is the extremely difficult part.

ackbar03 · on June 10, 2021

I may be really only speaking for myself here but I have very sincere doubts that anyone who has done any moderately serious work on "AI" and wrangled with the nitty gritty details of it all is really having any huge expectations for AGI. I mean if it comes during my lifetime, hurrah! But personally I'm not gonna sit around waiting for it or depend on it for anything. That being said, is the current AI tech as we have it useless? Of course not. Things like protein folding and alphago are still huge leaps forward in tech, it'd be kind of silly to treat AGI as the only thing worth achieving

tinco · on June 10, 2021

It is entirely not necessary for an AGI to be able to drive a car.

Frankly, after seeing AlphaZero and AlphaFold I'm surprised they didn't declare AGI right there and then.

People assume that when AGI happens, computers can suddenly outsmart humans in every way and solve every problem imaginable. The reality is just that it could in theory given enough time and resources.

It is like quantum computing. In theory it can instantly factor and break our nice cryptographic primes. In reality the largest number it factored is 21.

qsort · on June 10, 2021

> it could in theory given enough time and resources.

In theory given enough time and resources, anyone can defeat any grandmaster in Chess: just compute the extended tree form of the game and run the minimax algorithm.

The "given enough time and resources" clause makes everything that follows meaningless, unless a reasonable algorithm is presented.

> It is like quantum computing.

It is absolutely not like quantum computing. Shor's algorithm is something you can look up right now. It is precise and well-defined. The problems we are facing with quantum computation are related to the fact that we can't really build reliable hardware. But we know that given such machines the algorithm would work. We have precise bounds and requirements on those machines.

As far as AGI goes, we have absolutely no idea. There's lively debate on whether anything we have done even counts as significant advancement towards AGI.

tinco · on June 10, 2021

> In theory given enough time and resources, anyone can defeat any grandmaster in Chess: just compute the extended tree form of the game and run the minimax algorithm.

Yes, that's why we're considered to be generally intelligent. It is exactly the point, and not at all meaningless. Right now there's no machine that can come up with the idea to run an extended tree form of the game and minimax the algorithm. If there was such a machine, then that machine would be considered AGI.

> It is absolutely not like quantum computing.

I meant in the sense that just that it has actually been achieved, it doesn't mean it's as powerful as we have described in the theory. In theory you can use Shor's algorithm to break encryption, in practice the devices we have today have trouble with 2 digit numbers.

The same principle goes for AGI. If someone releases an AGI system today, it doesn't mean that tomorrow we'll see a Boston Dynamics robot hop on a bicycle to his day job as a Disney movie art director. The world would most likely not change at all, at least not for a while, many people would not recognise the significance and many people might not even recognise the fact that it is in fact AGI.

> As far as AGI goes, we have absolutely no idea. There's lively debate on whether anything we have done even counts as significant advancement towards AGI.

You might think that, and that says something about what side of the debate you're on. We're commenting here on the thread of an article about DeepMind asseting that reinforcement learning is enough to reach general AI. If that's true (and I think it is), then we've probably reached general AI already.

croon · on June 10, 2021

> It is entirely not necessary for an AGI to be able to drive a car.

"Artificial general intelligence (AGI) is the hypothetical[1] ability of an intelligent agent to understand or learn any intellectual task that a human being can."

What definition are you using?

tinco · on June 10, 2021

The same, that they understand and can learn how to drive a car doesn't mean they would actually be able to do it in the real world.

You can read a book on how to hit a ball with a baseball bat, you can even practice and get good at it, but that still doesn't mean you would actually be able to hit a ball thrown by a professional pitcher.

croon · on June 10, 2021

The interface to the car is a solved problem.

> You can read a book on how to hit a ball with a baseball bat, you can even practice and get good at it, but that still doesn't mean you would actually be able to hit a ball thrown by a professional pitcher.

If I hade incredibly fast reflexes and actuators I could.

tinco · on June 10, 2021

Similarly, DeepMind's software might be able to drive a car, would it have a similar neuron count, connectivity, perception systems and training you received.

Or maybe it couldn't, because the software is not as efficient as the organisation of your brain is. Or because there's hardcoded routines evolved in your brain that it lacks.

What I'm saying is that just that because an AGI can't drive a car, it doesn't mean it's an AGI. For the same reason there's loads of people out there that are generally intelligent that can't drive cars for all sorts of physical reasons.

croon · on June 10, 2021

> Similarly, DeepMind's software might be able to drive a car, would it have a similar neuron count, connectivity, perception systems and training you received.

Admittedly I'm a layman in this area, but could it? AFAICT it would only work on trained set data and whatever generalizations can be made on that and not infer unseen scenarios like humans do readily.

> What I'm saying is that just that because an AGI can't drive a car, it doesn't mean it's an AGI.

I understood what you meant from your first post, I'm simply disagreeing on account of the very definition of AGI.

You can't have an ameoba level AGI and still call it (a limited) AGI. Either it can understand/learn any human task, or it can't.

The definition is made for a reason. Watering it down for any specific generation of AI serves no benefit.

ChicagoBoy11 · on June 10, 2021

Very tangential, but as someone who has gotten into the Game of Go because of their pioneering project in that space, I'm exceptionally grateful -- that alone had a very significant and positive impact on my life, and I can tell that in that entire community it was a watershed moment as well.

uyt · on June 10, 2021

It might be a stretch but some people say that the weights learned by a neural network is somewhat like a language. For example if you look at the weights of a random middle layer it would seem like gibberish. Much like how aliens would react when looking at humans making gibberish noises (aka talking) to each other. In both cases they are just compressing signals based on learned primitives.

bertday · on June 10, 2021

Not sure if there is any case where this thought is useful. The only thing this says is the primitives are correlated and we don’t understand them. It’s similarly not useful to think about atoms “talking” when they exchange heat.

Animats · on June 10, 2021

"A sufficiently powerful and general reinforcement learning agent may ultimately give rise to intelligence and its associated abilities. ... We do not offer any theoretical guarantee on the sample efficiency of reinforcement learning agents."

OK. This basically says "evolution works". But how fast? Biology took tens of millions of years to boot up.

An related question is how much compute power does evolution, viewed as a reinforcement learning system. have? That's probably something biologists have thought about. Anyone know? Evolution is not a very fast or efficient hill-climbing system, but there are a large number of parallel units. It's not a philosophical question; it's a measurable one. We can watch viruses evolve. We can watch bacteria evolve. Data can be obtained.

Two questions I pose occasionally are "how do we do common sense, defined as not screwing up in the next 30 seconds", and "why does robotic manipulation in unstructured situations still suck after 50 years". A good question to ask today is why reinforcement learning does so badly on those two problems. In both cases, you can define an objective function, but it may not be well suited to hill climbing.

randcraw · on June 10, 2021

> "why does robotic manipulation in unstructured situations still suck after 50 years"

Great point. Until the promoters of RL can build us a robot that can 1) walk gracefully through a typical home that has stairs and closed doors, 2) cook a meal with pots and pans, and 3) get back up after it falls down -- I suggest we take their claims of impending Singularity with a big grain of salt.

sbierwagen · on June 14, 2021

Consider if Boston Dynamics released a video of a robot doing all three things tomorrow, and announced that they were taking orders for immediate shipment, priced at $100,000. How far away would you think the singularity would be then?

Separately, a hostile or indifferent AI could still cause a heck of a lot of trouble for human civilization without the first two things. Consider an autofactory clearing room for expansion with bulldozers, no need to navigate stairs there. Bullets or smart glide bombs don't need to understand doorknobs. Etc.

londons_explore · on June 10, 2021

In some cases biological 'genetic algorithm' hill climbing can be remarkably ineffective.

For example, the classic "design a car that can drive over this terrain" problem, even after a billion generations (~ the same number as life on earth), shows no substantial performance improvement.

That makes me suspect something is missing from our biological genetics model.

kaba0 · on June 10, 2021

I think the number of parameters is remarkably (multiple orders of magnitude) different between even the simplest bacteria vs the model used in the car. And then genomics can also do some more advanced techniques like copy a whole gene and start modifying that, etc.

wnkrshm · on June 10, 2021

energy supply and other constraints (material, robustness ...) are a good explanation though - an organism can't grow out of aluminum or steel

EE84M3i · on June 10, 2021

I suspect they're talking about this (sort of) simulation

https://rednuht.org/genetic_cars_2/

29athrowaway · on June 10, 2021

We just want AI to be able to think. We do not need an AI with an autonomic nervous system, or many of the functions in the central nervous system. We do not need AI to be very power efficient. If it takes several megawatts of electricity to get our first strong AI working, so be it.

So, we do not have as many constrains as life did.

jacquesm · on June 10, 2021

That is the most impressive use of the word 'just' in a long time. Note that all the other bits are solved, and have been solved since the 60's. It's the 'just think' bit that has proven to be a little bit harder than we thought it would be.

ifdefdebug · on June 10, 2021

Define "think". And then prove it can be done without the kind of nervous system you say we don't need.

29athrowaway · on June 11, 2021

The embodied cognition folks believe that the embodiment of sentience affects cognition.

That is probably true to some extent. I mean, if we make an AI that has an orgasm each time it blows up something with a hellfire missile, it will probably learn to find ways to blow up more things more frequently and efficiently.

Our cognition is affected by pain, hunger, thirst, cold, heat, pleasure, smells, sounds, etc... positive and negative reinforcements.

scythmic_waves · on June 10, 2021

/r/MachineLearning discussion:

https://www.reddit.com/r/MachineLearning/comments/nplhy3/r_r...

I'm with most of the comments there. This paper is ridiculously hand-wavey.

JamilD · on June 10, 2021

Many of DeepMind's opinion style papers are like this. Another example of the "handwavy" DeepMind paper: https://arxiv.org/pdf/2102.03406.pdf

It's also worth it to note as well that this isn't a homogenous organization, many DeepMind employees have different opinions on issues like this and an individual paper isn't representative of the entire organization.

scythmic_waves · on June 10, 2021

> It's also worth it to note as well that this isn't a homogenous organization

Please don't consider my critique of this paper as an indictment of DeepMind as a whole!

> Many of DeepMind's opinion style papers are like this.

That's good to know. I have not read many of their opinion papers, and I'll admit I didn't have the context of it being an "opinion" paper.

That said, I don't agree with the opinion. The paper didn't really engage with the concept of AGI in a way that I found satisfying. The conclusion may very well be correct, but this paper wasn't enough to convince me.

Slightly OT: My views were reinforced when I saw the paper was praised by Patricia Churchland. I don't find her take on consciousness a satisfying one, though I find the general direction of her work interesting. See here for another example:

https://www.reddit.com/r/philosophy/comments/nvtgwr/grand_th...

rich_sasha · on June 10, 2021

Basically, any problem with a solution fits into RL: reward of 1 if you are AGI and 0 otherwise. Go learn.

This setting on its own is meaningless! The “how” of the RL agent is not even 99% of the problem, it is all of it.

Given our understanding of both DL and neuroscience, it is not even clear to me that we can say with confidence that Neural Networks are a sufficiently expressive architecture to cover an AGI.

The human brain is a deep net, sort of, but there is also plenty going on in our brains that we don’t understand. It could be that the magic sprinkle is orthogonal to DL and we just don’t know about it yet.

cynusx · on June 10, 2021

Thank you for stating what should be obvious.

I think there are two currently unsolved problems

1/ We have no idea what the reward function looks like that leads to AGI

2/ Deep networks are artificially constricted for computational efficiency and always optimized to solve the problem at hand;

Any solution that delivers AGI should rely imo on:

1/ reinforcement learning

2/ Happen with an unstructured reservoir of randomly connected neurons

There was a research trend towards reservoir computing and recurrent neural networks but this was mostly abandoned because progress in deep learning was amazing.

These techniques are akin to a 2D-plane in a 3D object, it's heavily simplified and circular references are prohibited.

I have some good ideas on what the reward function should look like in a reservoir setting and happy to discuss them with any active independent researcher in the field.

svantana · on June 10, 2021

> The “how” of the RL agent is not even 99% of the problem, it is all of it

I'm not sure that's true anymore - pretty much any objective devised is being solved by ML solutions within months (with some exceptions such as Chollet's ARC, maybe Winogrande). But those same models will perform poorly on other unseen tasks, because ML takes shortcuts if it can. We used to have unsolved tasks for decades, such as Go. It's now comparably hard (if not harder) to create a good objective measure of intelligence than to reach human parity on said measure.

rich_sasha · on June 11, 2021

Maybe I’m misinterpreting your comment, but what about self-driving cars? Hugely researched and funded area, not without success, but even the best self-driving system is so inferior to a human driver.

I’m not dissing self-driving car research, just not sure we’re anywhere close to parity, and the problem is fairly well defined.

svantana · on June 11, 2021

Right, I was talking about mathematically defined objectives and inputs (algorithms & data, in practice). With self-driving, arguably the main stumbling block is that the objective is not well-defined and the inputs are potentially wrong (faulty sensor data).

candiodari · on June 10, 2021

I assure you 99% of the problem of any RL project is the simulator. Generally you can't let an RL algorithm control anything real from the start, so you have to implement a reasonably reliable simulator for whatever you want done.

This is the big challenge in practice.

ronsor · on June 10, 2021

Correct me if I'm wrong, but wouldn't that mean the entire world would have to be simulated? Or at least some subset of society?

ggggtez · on June 10, 2021

The human brain does have a simulator. It's well known. How do you know where to move your hand to catch a ball? Or what is happening when you blink?

Your brain is constantly simulating a few milliseconds ahead.

candiodari · on June 11, 2021

That may be true, but that's not to simulate. "Dreaming" is used to "simulate", and it's only a hack, in humans probably not even explicitly programmed.

The hack is to have the simulator itself ... also be a learned system, and not to simulate the world, because you don't act on the real world, only a tiny part of it that you can measure and actually get into your brain (which is a simplified version of what you see, or a "latent variable"). There's no need to simulate anything that doesn't affect your reasoning. The information flows in reality with an intelligent actor is like this.

World (say, a tree falls) -> input representation (e.g. eyes) -> simplified version ("latent" version) -> intelligent actor -> muscles -> affects world.

Now what everybody thinks of as a simulator is something that simulates the whole thing. But if you insert one more link (output of reasoning agent at time T -> simplified representation at time T+1) you can then run a "simulation":

random simplified version ("what if your car became a tree and fell ?") -> intelligent actor -> next input for latent representation ("then what happens ?")

For safety reasons, it is probably prudent to disconnect the muscles in this state. You know, so you don't knock out your mother when you dream about boxing.

And as you say, this "predict the future" network is probably useful by itself in dealing with the world. So you can catch tennis balls and the like.

https://arxiv.org/abs/1803.10122

Or, if you like: https://www.youtube.com/watch?v=dPsXxLyqpfs

ggggtez · on June 10, 2021

Problem: you don't understand it therefore you think RL isn't sufficient.

There is no evidence that the thing you don't understand isn't based on RL too.

bobthechef · on June 10, 2021

Concepts can't be represented in matter. That's your secret sauce. Well, the beginning of the recipe, anyway. But you won't be able to make the dish.

azinman2 · on June 10, 2021

Everything old is new again.

As far as I can tell, they're not actually proposing how to achieve this. I can't access the article without a host institution it seems (is there another link?), so I only have the article to go by. RL has been the basis for all robots engaging with the world, and that engagement with the physical world modeled using RL has been promised to make robots that can act like a 2 year old for a long time (see Cynthia Breazeal's work, for example). Yet AFAIK, we haven't actually achieved this as we don't know how to efficiently model the problem to have learning rates that reach anywhere near what we're able to do with DNNs today.

Perhaps someone who has access to the paper can say why this is a milestone? If Patricia Churchland suggests it is, then something new must be happening here.

leto_ii · on June 10, 2021

> is there another link?

This is the download link: https://www.sciencedirect.com/science/article/pii/S000437022...

hervature · on June 10, 2021

I don't know who Patricia Churchland is, but they said that the paper was "very carefully and insightfully worked out."

After having read the paper, I am very disappointed in the output. Nothing concrete was shown, just hypothesis and reads more like philosophy. That being said, I would say that the paper is carefully worked out and does provide insight if you haven't thought about RL before.

Ieghaehia9 · on June 10, 2021

If Patricia Churchland doesn't have a problem with the paper despite it being philosophical, then that's probably because she is a philosopher. An eliminative materialist, to be precise.

Personally, from reading the abstract, I disagree with the hypothesis. There's a trick where anything (even say, a database lookup) looks like optimization as long as you contrive the objective function just right, but that's kind of uninformative.

LesZedCB · on June 10, 2021

and our intelligence is quite good at confabulating optimization functions for abstract processes that are inert. pretty amazing, really.

deepnet · on June 10, 2021

The paper is creative commons licensed no signup necessary to download

https://www.sciencedirect.com/science/article/pii/S000437022...

jjjdjjddddsfsd · on June 10, 2021

Are the just reformulating the principles of evolution in digital terms, and essentially not providing any new insights at all?

Yes, intelligence has been created by evolution. That doesn't imply that any system that is subject to evolutionary forces will lead to the creation of intelligence (and not within a reasonable timeframe, either). The challenge is to create a system that is capable of evolving intelligence.

Afaik some biologists even think that the evolution of intelligence was rather unlikely and would not necessarily happen again under the same circumstances as on earth.

inciampati · on June 10, 2021

As a biologist and longtime dabbler machine learning and Bayesian methods, I tend to see intelligence as a manifestation of evolution. In the case of an organism, the improvement of the model (the genome) occurs through processes that are very similar to what we see in any kind of learning (real, brain based or "artificial", computer based).

Evolution and intelligence are inextricably linked. They are practically the same thing. This means that intelligence is probably a natural result of any system similar to those that support biologics. If you flow the right amount of energy through a substrate with complex enough building blocks, you'll eventually get life ~ which is just something smart enough to survive and feed off the available energy flows. In the world, this flow is radiation from the sun, while in a computer, it is governed by a more abstract loss or fitness function.

webmaven · on June 10, 2021

> Afaik some biologists even think that the evolution of intelligence was rather unlikely and would not necessarily happen again under the same circumstances as on earth.

Hmm. Can you provide a pointer to those biologists?

AFAIK, high intelligence has arisen more than once on Earth (Hominoids, Cetaceans, Octopuses), so I'm somewhat skeptical of that claim, but perhaps they're construing intelligence more narrowly (ie. only Homo Sapiens qualifies).

jjjdjjddddsfsd · on June 10, 2021

Jared Diamond talks about it in his books (don't remember which ones specifically). OK, granted, he is not officially a biologist, I guess, but at least a prominent writer on Evolution Theory.

rubatuga · on June 10, 2021

Well that has a prior on life even existing in the first place

webmaven · on June 12, 2021

> Well that has a prior on life even existing in the first place

True, the question of intelligent life evolving can be construed as either:

"Given that life exists, what is the probability of intelligence evolving?"

Or:

"Given that the universe exists, what is the probability of life arising and evolving intelligence?"

Both are actually interesting and important questions (cf. the Drake Equation and Fermi Paradox), but I am pretty comfortable asserting that in the context of this conversation the former interpretation is more apropos.

unishark · on June 10, 2021

I'd say it's even less than that. They seem to be summarizing the ways the problem of teaching an agent to do anything (including be generally intelligent) can be formulated as a problem of maximizing a reward (hence the title).

Another way to look at it is, if we had a good enough function (e.g. a universal approximator) it can be made to model any behavior using numerical optimization. Which I think isn't very surprising, but apparently there is some arguments about it.

randcraw · on June 10, 2021

In fact, "if we had a good enough function" == "if we had sufficient funding". This refrain will resonate mightily in the willing ears of US congressfolk who want re-election and would rather talk about something other than Trump.

So welcome back to the future, and the $trillions the US spent on 20 years of space race and 50 years of cold war. The catchphrase that motivates the next 50 years of government/corporate funding will be...

They've got a Terminator and we don't.

visarga · on June 10, 2021

Evolutionary algorithms are tricky, just like deep learning. It's not "just reformulating the principles of evolution in digital terms, and essentially not providing any new insights".

blueblisters · on June 10, 2021

Current state-of-the-art in reinforcement learning can barely make a physical robot walk. In theory, with transfer learning, we will probably see better success over time but I'm looking forward to seeing results in practice.

A 2018 article about the challenges of reinforcement learning: https://www.alexirpan.com/2018/02/14/rl-hard.html

Barrin92 · on June 10, 2021

Sorry but where is actual scientific content in that paper? I'm concerned with the state of AI. saying that "reinforcement is all you need", when reinforcement learning is defined as abstract as "agent does something, adapts to environment and rewards, then does another thing" is borderline tautological.

The actual scientific question is, what are the mechanisms that make agents work, what are the fundamental modules within intelligent systems, is there a distinction between digital and biochemical systems, what costs are there in terms of resources and energy to get to a certain level of intelligence, and so on. Real questions with specific answers. For all the advances coming from just upping the amount of data and GPU hours, there is so little progress on trying to have a model of the structures that underpin intelligence.

_qfi9 · on June 10, 2021

i think part of what they are saying is that your approach is wrong, (e.g. looking for then copying submodules within intelligence won't generalize),

trying to answer specific questions won't generalize,

but if you train a network with the right potentially hacky series of rewards/rich enough environment you could get a much more general intelligence

a new kind of science

dannyw · on June 10, 2021

Alternate title: DeepMind fails to make progress on AGI, publishes thought piece instead.

visarga · on June 10, 2021

And the entitlement we have is even higher than the difficulty of the task and the hard work people are putting in. Anyone here can say they did as much for RL?