How do you “pull the plug” on a datacenter, or on all of the cloud providers if the ASI has processes running everywhere? Given that anyone with a credit card can already achieve robust multi-region deployments, it doesn’t seem hard for an ASI to make itself very hard to “turn off”.
Alternatively an ASI can ally with a group of humans to whom it can credibly promise wealth and power. If you think there is a baby machine god in your basement that is on your side, you’ll fight to protect it.
Airgap it. Give it no connections to the outside world, just a single controlled interface with a human operator. It's reduced to an advisor at this point, not an agent, but it removes most potential harm short of tricking its operators to plug it into the Internet.
In this scenario, pulling the plug is a matter of turning off power to the data centers it runs in - or simply disabling the one mode of external communication it has.
I keep hearing this argument, and it is the worst one of all because it neglects human greed.
AI feeds on data. If it can make you a million dollars air gapped, you'll be able to make a billion with it plugged in the net with it manipulating data.
Historically, the world has not been great at universally coordinating responses to social problems - especially when it only takes one actor to break the “truce.”
Isn't agreeing to only run an AGI with whatever theoretical alignment controls we come up with, also a social agreement? Seems we will have to figure that out one way or another.
The airgap option is “everybody agrees to leave lots of money on the table in the name of safety”. The alignment option is “some people invest lots of money to invent technology that everyone can use to safely make lots of money”.
If you solve alignment, the regulatory nudge required for folks to use it is probably not that big; let’s be really pessimistic and say it’s 10x the compute to run all the supervision. It’s probably mostly criminals that want the full-unaligned models, and so most entities can agree to the spend.
In the air gap case, I’m pretty confident that it is many orders of magnitude more profitable to run the “unsafe” / non-airgapped version. The utility of baking AI into everything is massive. Maybe in the best case you can have your airgapped ASI build provably-safe non-AGI tool AIs and capture much of that value, but I’m pretty skeptical it’s safe to transport anything that complex across the airgap.
So the alignment research path is more sustainable since it requires paying a much lower “safety premium”.
But I’m all for pursuing both, I don’t think they interfere with each other, quite the opposite; taking either one seriously reinforces the need for both of them.
Another advantage of the alignment path is we can meaningfully make progress now. It’s really hard to get people to agree ahead of time that they will run their AGI in an airgap.
A sufficiently intelligent system that is un-aligned can likely subvert a human operator if it wants to.
But even if we ignore that, note that nobody is building their systems this way, and nor will they without extremely draconian laws requiring it. An airgapped system is substantially less valuable than one that is connected to the outside world.
I take the question “why can’t we turn it off” to refer to the actual real systems that we have built and will continue to build, not hypothetical systems we might build if we took safety risks very seriously.
Alternatively an ASI can ally with a group of humans to whom it can credibly promise wealth and power. If you think there is a baby machine god in your basement that is on your side, you’ll fight to protect it.