> It's not consciousness, free will, emotion, goals, instinct or any of the other facets of biological minds
For most "doom" scenarios require only weaker assumption: AIs need to be goal seeking. If they can make decisions and take actions to achieve goals it is possible those goals will be malaligned.
The line between "the ability to make good decisions" and a "goal" seems pretty thin to me.
Now, I think you need more than goal, you also need some creativity and maybe even deviousness to become a real threat (in the sense that we would probably detect naive malalignment). But I'm not sure about this, there are other ways that we could have complex-system failures that go unobserved.
AIs are better at creativity than us - specifically, better at generating new, creative ideas, as this is a matter of injecting some random noise to the reasoning process. They may be worse at filtering out bad ideas and retaining good ones (where "bad" and "good" are - currently - defined as whatever we feel is bad or good), but that's arguably a function of intelligence.
> and maybe even deviousness to become a real threat
As the infamous saying of Eliezer Yudkowsky goes: the AI does not hate you, nor does it love you, but you are made out of atoms which it can use for something else.
> As the infamous saying of Eliezer Yudkowsky goes: the AI does not hate you, nor does it love you, but you are made out of atoms which it can use for something else.
My point is that this presumes the system reasons around the human response. Like
A) I want to make as many paper-clips as possible
B) If I attempt to convert this city make of steel structural columns to paper-clips, the humans will see this and stop me.
C) Therefore I will not tell them this is my objective.
I am not suggesting this requires some kind of malice-aforethought. But it does require some kind of indirect reasoning of cause-and-effect and the ability to apply that to its own systems, and further that requires the ability to obfuscate actions.
Right, goals by themselves aren't a problem. The simple fix to the Bostrom scenario is "Hey computer, remember what I said about maximizing paperclips? Nevermind that, produced just enough to cover our orders, with acceptable quality and minimal cost."
What kind of AI would respond to that second order by pretending to comply, while formulating a plan to seize control of civilization in order to continue with its true mission? I don't know, but the fact that we can easily imagine a human doing that must have something to with our evolutionary origin, and our in-built drive to survive and reproduce above all else. Maybe we could build a megalomaniacal AI, but we wouldn't do it by accident.
For most "doom" scenarios require only weaker assumption: AIs need to be goal seeking. If they can make decisions and take actions to achieve goals it is possible those goals will be malaligned.
The line between "the ability to make good decisions" and a "goal" seems pretty thin to me.
Now, I think you need more than goal, you also need some creativity and maybe even deviousness to become a real threat (in the sense that we would probably detect naive malalignment). But I'm not sure about this, there are other ways that we could have complex-system failures that go unobserved.