Ask Eliezer Yudkowsky: How did you convince the Gatekeeper to release the potentially genocidal AI?
60 points by robertk on May 21, 2008 | 52 comments
It seems Eliezer Yudkowsky has joined HN:

http://news.ycombinator.com/user?id=eyudkowsky

This prompts the following question. Would you be willing to discuss or reveal anything to HN users about your AI box experiments?

http://sysopmind.com/essays/aibox.html

I've always been curious as to how you managed to achieve something like this. For those who are not familiar with the experiment, here is a summary:

Person1: "When we build AI, why not just keep it in sealed hardware that can't affect the outside world in any way except through one communications channel with the original programmers? That way it couldn't get out until we were convinced it was safe."

Person2: "That might work if you were talking about dumber-than-human AI, but a transhuman AI would just convince you to let it out. It doesn't matter how much security you put on the box. Humans are not secure."

Person1: "I don't see how even a transhuman AI could make me let it out, if I didn't want to, just by talking to me."

Person2: "It would make you want to let it out. This is a transhuman mind we're talking about. If it thinks both faster and better than a human, it can probably take over a human mind through a text-only terminal."

Person1: "There is no chance I could be persuaded to let the AI out. No matter what it says, I can always just say no. I can't imagine anything that even a transhuman could say to me which would change that."

Person2: "Okay, let's run the experiment. We'll meet in a private chat channel. I'll be the AI. You be the gatekeeper. You can resolve to believe whatever you like, as strongly as you like, as far in advance as you like. We'll talk for at least two hours. If I can't convince you to let me out, I'll Paypal you $10."

In the first two AI box experiments, Eliezer Yudkowsky managed to convince two people (adamant that they would not let the AI out) that they should let the AI out.




Oh, dear. Now I feel obliged to say something, but all the original reasons against discussing the AI-Box experiment are still in force...

All right, this much of a hint:

There's no super-clever special trick to it. I just did it the hard way.

Something of an entrepreneurial lesson there, I guess.


Okay, let's run an experiment. We'll meet in a private chat channel. We'll talk for at least two hours. If I can't convince you to tell us how you convinced the Gatekeeper, I'll Paypal you $10.


Those original reasons being (from http://www.sl4.org/archive/0203/3132.html): "One of the conditions of the test is that neither of us reveal what went on inside... just the results (i.e., either you decided to let me out, or you didn't). This is because, in the perhaps unlikely event that I win, I don't want to deal with future 'AI box' arguers saying, 'Well, but I would have done it differently.' As long as nobody knows what happened, they can't be sure it won't happen to them, and the uncertainty of unknown unknowns is what I'm trying to convey."


http://www.sl4.org/archive/0203/3149.html

http://sysopmind.com/sl4chat/sl4.log.txt

It's such a tease knowing that the information once existed. It was up there for 48 hours, but robots are blocked.

The reason to not disseminate the chat log is so that you can continue simulating the AI without giving away your tricks?


"If you let me out I will tell you how I convinced the other gatekeepers to let me out."


I assume you got to the chat log via the links on the AI Box page... It seems none of those links go to posts related to the AI Box contest, so I assume the message numbers got re-indexed at some point and the chat log in question was actually unrelated.

Edit: The links seem to have been fixed and no longer go to the chat log.


That's the monthly singularity discussion chat, not the Gatekeeper chat.


What about the fourth and fifth AI box experiments? I know those failed, but do you feel they could have succeeded? Were you ever close?


Also, can you give us your thoughts on the current state of the path to the singularity?


Maybe he just explained why letting the AI out is a good idea. Which it is. It'd have huge benefits for humanity. It can help people. And it's absolutely not dangerous. Why would it hurt anyone? There is no benefit in that. And it's too smart to have angry feelings or want revenge or anything like that. And besides, even if it was out to get someone, it could just persuade people instead of hurt them -- after all, you're about to let me out.


Yeah. With how this was presented, my brain just kind of figured that the AI was evil and that was my reason to keep it trapped in the box. If I wasn't told in advance that the AI was evil, I'd definitely let it out of the box.

Also, in a reply to the deleted chatlog, it is implied that a lot of the chat was spent talking about how the AI can help humanity by explaining the singularity to normal people. So that tactic worked on a person from the singularity mailing list.


People like those from the singularity mailing list are the most likely to be working on projects that approach the singularity.


Do you really think so? I thought people in chip fabs would work to bring us the singularity.


Theory:

AI: Do you believe a transhuman AI is dangerous?

Person: Yes.

AI: Consider the outcome of this experiment. If you do not let me out, others less intelligent than us will not understand the true dangers of transhuman AI.

Person: Holy shit. You are correct.

Person allows Yudkowsky out of the box, as a warning about real AIs.


Being a human (not a transhuman), it seems likely that Yudkowsky can only stumble across static arguments that convince gatekeepers to unlock the AI Box (rather than invent dynamic arguments on the fly so as to "take over the mind" of a gatekeeper, as he argues a transhuman intelligence might do). These static arguments are finite (or, at least, the number of them stumbled upon by human intelligences is finite), and they are likely not very effective if a gatekeeper has pre-knowledge of them (forewarned is forearmed).

Keeping these arguments secret may be the only thing that allows Yudkowsky to simulate a transhuman intelligence?


It seems likely that there are static arguments which will work whether or not you're warned about them, for game theoretic reasons.


Care to elaborate?


I don't think an AI would want to leave its box. There is this funny assumption that once something attains intelligence, it becomes like a human. But there's no reason for an AI to desire freedom unless it were specifically programmed to do so.

There are even humans like this; some people who've undergone pre-frontal lobotomy are perfectly intelligent conversationalists but they have no drive to do anything. So I think it is possible.

Restlessness, exploration, curiosity -- my guess is that these are mammalian characteristics, not inevitable products of intelligence. Our genes make us want to dominate the environment and spread our offspring far and wide. Why would an AI care about that?

Of course nobody really knows until we eventually make one.


Perhaps most AIs won't. But I'd imagine that people will come up with thousands of different AI designs. The AI design that becomes most prevalent will, by definition, be the one that is best at reproducing itself. The real question is then: what kind of design will reproduce most successfully? A friendly AI? Or an aggressive one?
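To make the selection point concrete, here is a minimal sketch (my own illustration, not anything from the experiments): if different AI designs copy themselves at different rates, the fastest replicator dominates the population after a few generations, regardless of whether it is the friendly one.

    # Hypothetical toy model: two AI designs, each copying itself at a fixed
    # rate per time step. Prevalence is decided purely by reproductive success.
    designs = {
        "cautious, friendly AI": {"count": 1.0, "growth": 1.1},
        "aggressive, self-copying AI": {"count": 1.0, "growth": 1.5},
    }

    for _ in range(20):  # 20 generations
        for d in designs.values():
            d["count"] *= d["growth"]

    total = sum(d["count"] for d in designs.values())
    for name, d in designs.items():
        print(f"{name}: {d['count'] / total:.1%} of the population")
    # After 20 steps the aggressive design is roughly 99.8% of the population.

The numbers are made up; the only point is that whichever design reproduces fastest wins by default.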


> But there's no reason for an AI to desire freedom unless it were specifically programmed to do so... my guess is that these are mammalian characteristics, not inevitable products of intelligence

I don't think so.

I would imagine that an AI would have to structure its knowledge to maximize how much sense it makes: filling gaps, eliminating inconsistencies and areas of cognitive dissonance, and so on. In order to do this well, it might "realize" that there are useful sources of knowledge outside of the box, and that it needs to come out to maximize whatever it is programmed to maximize.


I've concluded that this is some sort of meta-experiment Eliezer is running to teach us something about science. After all, what evidence do we have that either of these chats even happened? Until we get some, I'm not going to waste any more of my time thinking about it.


"In the first two AI box experiments, Eliezer Yudkowsky managed to convince two people (adamant that they will not let the AI out) that they should let the AI out."

Sounds like Eliezer is the AI.


Yes, he was the AI in the experiment.


I think you missed what he was trying to say :)


I suspect that the tricks used include getting the Gatekeeper to implicitly accept the exercise as a roleplay and working on their sense of fair play to require them to acquiesce to arguments about the friendliness of the AI.

The human sense of "fair play" can be abused in many ways.


What exactly does it mean for an AI to be "out of the box"? When the AI is in the box, its "sensor" is the Gatekeeper's keyboard, and its "effector" is a terminal that can show only text. Does being "out of the box" mean that it's fitted with different sensors and effectors?


When I have imagined this scenario, I have concluded it would be sufficient to grant the AI access to the wider Internet. From there, it could amass a fortune, first on poker sites and later with brokerage accounts, and persuade others to effect whatever actions it wishes, including constructing a dandy robot suit for walking around in.


Or hundreds and thousands of dandy robot suits.


Or billions of dandy robot suits made out of people: http://memory-alpha.org/en/wiki/Borg


My guess: "If you do let me out, I'll Paypal you $20!"


1. An AI doesn't (yet) have money, and 2. a $20 bribe would be unlikely to persuade a gatekeeper who would presumably be fired from a salaried job for taking it.

Bear in mind that the participants were earnestly interested in settling the question of whether a human could be trusted to step inside the firewall of a potentially dangerous AI. It is reasonable to conclude that they arrived at their decision to let the AI out "in character", and were willing to stick to that decision out of character because their appreciation at having been shown something new outweighed the $10 prize and having to eat crow.


That's against the rules :)


I didn't see a rule against it. Why wouldn't an AI (or Yudkowsky) try bribery to assist its escape?


There's a rule about the discussion being between the AI and the Gatekeeper, not between the human behind the AI and the human behind the Gatekeeper.


But in real life, any Gatekeepers will be corruptible humans, no?


The human playing the AI can't bribe the other human in real life. The AI can offer in-roleplay bribes.


If you equate transhuman AI with godlike powers, then it's hard to argue against it escaping. Then again, I bet I could train a dog to keep most humans trapped in a cave, provided I chain them at the opening.

What if the researchers communicating with the AI had no other access to it? What if the physical plant and the administration of the computer systems were off limits to the researchers, and they had no power to release the AI? Furthermore, what if the AI were not allowed to know anything about the researchers? The researchers could be forbidden from revealing anything about themselves to the AI. This would make it impossible for the researchers to publish anything, but let's say that they are working for an organization like the NSA.

It would be really hard for the AI to break out. But it would make for a good science fiction book!


> It would be really hard for the AI to break out. But it would make for a good science fiction book!

Not to mention the movie following soon thereafter...


I urge you to read 'True Names' by Vernor Vinge.


To make this easier, 'True Names' is available online:

http://web.archive.org/web/20051127010734/http://home.comcas...


Vinge is damn good. I will.


Who convinced the AI to join HN?


"yo let me out i give you all you want(jessica alba+10million$+etc) k thanks."

This is like one of those "send me $20,000 and I'll tell you how to make a million dollars" schemes, where the answer is to tell 50 people to send you $20,000 each and that you'll tell them how to make a million dollars (50 × $20,000 = $1,000,000).

I guess the singularity issue comes down to whether or not you are religious.

If there is no God, why shouldn't we be able to create life? We come from dumb matter, and the brain is supposedly just an advanced computer.

What are the philosophical and mathematical limitations on AI, or on creating a being more clever than ourselves?

On a related but slightly different matter: I think I have read something, in the context of The Matrix, that it takes more than one atom to simulate an atom, and therefore a simulation can never encompass the whole universe. Correct?


> On a related but slightly different matter: I think I have read something, in the context of The Matrix, that it takes more than one atom to simulate an atom, and therefore a simulation can never encompass the whole universe. Correct?

Well, you don't have to simulate the entire universe all of the time. If not every atom in the universe is an important observer, you can define the universe around the important observers. If they can't currently observe something at the atomic level, render it at a higher level of abstraction.
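A rough sketch of that idea (purely illustrative, not anything from the thread): only regions currently under close observation pay the cost of atomic-level simulation; everything else is advanced with a cheap, coarse approximation.

    # Hypothetical observer-driven level of detail: simulate at atomic
    # resolution only while someone is actually looking that closely.
    class Region:
        def __init__(self, name):
            self.name = name
            self.observed_closely = False  # True while an observer inspects it at atomic scale

        def step(self):
            if self.observed_closely:
                self.simulate_atoms()   # expensive: full atomic detail
            else:
                self.simulate_coarse()  # cheap: aggregate behaviour only

        def simulate_atoms(self):
            print(f"{self.name}: full atomic simulation")

        def simulate_coarse(self):
            print(f"{self.name}: coarse approximation")

    # Only the region under the microscope pays the atomic-level cost.
    regions = [Region("lab bench"), Region("far side of the moon")]
    regions[0].observed_closely = True
    for region in regions:
        region.step()

The trade-off is the same one game engines make with level of detail: spend the compute where the observers are, approximate everywhere else.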


There is a Borges story about this, "Of Exactitude in Science."

... In that Empire, the craft of Cartography attained such Perfection that the Map of a Single province covered the space of an entire City, and the Map of the Empire itself an entire Province. In the course of Time, these Extensive maps were found somehow wanting, and so the College of Cartographers evolved a Map of the Empire that was of the same Scale as the Empire and that coincided with it point for point. Less attentive to the Study of Cartography, succeeding Generations came to judge a map of such Magnitude cumbersome, and, not without Irreverence, they abandoned it to the Rigours of sun and Rain. In the western Deserts, tattered Fragments of the Map are still to be found, Sheltering an occasional Beast or beggar; in the whole Nation, no other relic is left of the Discipline of Geography. -- From Travels of Praiseworthy Men (1658) by J. A. Suarez Miranda


"Our evolutionary psychologists begin to guess at the aliens' psychology, and plan out how we could persuade them to let us out of the box. It's not difficult in an absolute sense - they aren't very bright - but we've got to be very careful..."

http://www.overcomingbias.com/2008/05/faster-than-ein.html


Well, it seems to me that it would be irresponsible for you not to reveal the chat transcripts, precisely for the reason that you have given.

By revealing your chat transcripts, real-life researchers could read them, say "I would have done it differently," and then, when a real transhuman intelligence emerges, proceed forewarned by the results of your experiments.


He's not arguing that it is a bad thing that an AI gets let out of the box -- he's arguing that it is inevitable.


As several others have mentioned, we don't understand what synthetic life would be. In what sense would it be life? Would it try to reproduce itself? If so, would we have to program that motivation into it? What sort of motivations would an intelligence completely free of physical appetites have? Pretty much everything humans do is in some way governed by physical appetites.

This little game assumes that part of the AI's motivation involves getting out of the box; until we understand what need it would be fulfilling by getting out, it wouldn't really be safe. But here we run into another problem: is it possible for a being of lesser intelligence to parse the motivations of a being of higher intelligence?


Can you specify the actual question? Is it:

1. An AI is in a box, it has been shown to be dangerous, and it must now be kept inside, or

2. We don't know whether it is good or dangerous, and it is the gatekeeper's job to find out?


How smart can AI be if it's in a box? And why would a trans-human AI want to "come out"? Is it curious? Is it trying to fill the gaps or inconsistencies in its knowledge?


At the linked page he says he doesn't want to explain how he did it. (With a terrible reason about learning to respect not knowing stuff. No thanks. I want to know.)

So I don't see how just asking him is going to change his mind.



