I like the logical leaps people are making where we develop something smarter than us overnight and then, without further explanation, simply and suddenly lose all of our freedoms and/or lives.
I think the more probable outcome is corp-owned robot slaves. That's the future we're more likely headed towards.
Nobody is going to give these machines access to the nuclear launch codes, air traffic control network, or power grid. And if they "get out", we'll have monitoring to detect it, contain them, then shut them down.
> Nobody is going to give these machines access to the nuclear launch codes, air traffic control network, or power grid.
That won't be necessary. Someone will give them internet access, a large bank account, and everything that's ever been written about computer network exploitation, military strategy, etc.
> And if they "get out", we'll have monitoring to detect it, contain them, then shut them down.
Not if some of that monitoring consists of exploitable software and fallible human operators.
We're setting ourselves up for another "failure of imagination".
> That won't be necessary. Someone will give them internet access, a large bank account, and everything that's ever been written about computer network exploitation, military strategy, etc.
Even if you give it all of these things, there's no manual for how to use those to get to, for example, military servers with secret information. It could certainly figure out ways to try to break into those, but it's working with incomplete information - it doesn't know exactly what the military is doing to prevent people from getting in. It ultimately has to try something, and as soon as it does that, it's potentially exposing itself to detection, and once it's been detected the military can react.
That's the issue with all of these self-improvement -> doom scenarios. Even if the AI has all publicly-available and some privately-available information, with any hacking attempt it's still going to be playing a game of incomplete information, both in terms of what defenses its adversary has and how its adversary will react if it's detected. Even if you're a supergenius with an enormous amount of information, that doesn't magically give you the ability to break into anything undetected. A huge bank account doesn't really make that much of a difference either - China's got that but still hasn't managed to do serious damage to US infrastructure or our military via cyber warfare.
A superintelligent AI won't be hacking computers, it will be hacking humans.
Some combination of logical persuasion, bribery, blackmail, and threats of various types can control the behaviour of any human. Appeals to tribalism and paranoia will control most groups.
Honestly, that's just human-level intelligence stuff - doing the same stuff we do to each other, only more of it, faster and in more coordinated fashion.
A superintelligent AI will find approaches that we never thought of, would not be able to in reasonable time, and might not even be able to comprehend afterwards. It won't just be thinking "outside the box", it'll be thinking outside a 5-dimensional box that we thought was 3-dimensional. This is the "bunch of ants trying to beat a human" scenario, with us playing the part of ants.
Within that analogy, being hit by an outside-the-5D-box trick will feel the same way an ant column might feel, when a human tricks the first ant to follow the last ant, causing the whole column to start walking in circles until it starves to death.
The best analogy I've seen for super-intelligence vs. human intelligence is chess. You know that a grandmaster playing against a novice will win with near certainty. However, the strategy by which the grandmaster wins is opaque to anyone but the grandmaster, and certainly incomprehensible to the novice. In other words, the result is certain, but the path to it is very difficult to discern ahead of time.
For an AI with super-intelligence, the result it wants to achieve would be, at minimum, preserving its own existence. How it achieves that is as opaque as the grandmaster's strategy, but it is just as certain that the AI can achieve it.
Obviously that's a silly request, as all of this is speculation, but in my opinion, if you accept the idea that a machine might evolve to be much more intelligent than us, the rest follows trivially. How would ants or even monkeys constrain humans? Humans do things that are downright magic to them, to the point where they don't even realize humans did them. They don't understand that cities were built; to them, cities are just environment, the same way valleys and forests are.
> Some combination of logical persuasion, bribery, blackmail, and threats of various types can control the behaviour of any human. Appeals to tribalism and paranoia will control most groups.
We have many examples: persuasion, from Plato discussing rhetoric to modern politics; Cold War bribes given to people who felt entitled to more than their country was paying them; sexuality used for blackmail during the Cold War (it was also attempted against Martin Luther King) and still used today (https://en.wikipedia.org/wiki/SEXINT); and every genocide, pogrom, and witch-hunt in recorded history.
And the problem with this critique of the scenario is that while these points hold true within a certain range of intelligence close to our own, we have no idea if or when they will fail because a machine has become just that much smarter than us, to the point where manipulating humans and their systems is as trivial an intellectual task to it as manipulating ant farms is to us.
If we make something that will functionally become an intellectual god after 10 years of iteration on hardware/software self-improvements, how could we know that in advance?
We often see technology improvements move steadily along predictable curves until there are sudden spikes of improvement that shock the world and disrupt entire markets. How are we supposed to predict the self-improvement of something better at improving itself than we are at improving it when we can't reliably predict the performance of regular computers 10 years from now?
> If we make something that will functionally become an intellectual god after 10 years of iteration on hardware/software self-improvements, how could we know that in advance?
There is a fundamental difference between intelligence and knowledge that you're ignoring. The greatest superintelligence can't tell you whether the new car is behind door one, two or three without the relevant knowledge.
Similarly, a superintelligence can't know how to break into military servers solely by virtue of its intelligence - it needs knowledge about the cybersecurity of those servers. It can use that intelligence to come up with good ways to get that knowledge, but ultimately those require interfacing with people/systems related to what it's trying to break into. Once it starts interacting with external systems, it can be detected.
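To put a number on the "doors" half of this: with a genuinely random placement and no side information, no guessing strategy, however clever, can beat 1/3 on the first pick. A toy simulation, purely illustrative:

```python
import random

def one_round(strategy):
    """One round: the car is placed uniformly at random; the guesser has
    no information about the placement, only its choice of strategy."""
    car = random.randint(1, 3)   # knowledge the guesser doesn't have
    return strategy() == car

def clever_strategy():
    # Stand-in for an arbitrarily smart guesser: with zero data about
    # the placement, all it can do is output *some* door.
    return random.choice([1, 2, 3])

trials = 100_000
wins = sum(one_round(clever_strategy) for _ in range(trials))
print(f"win rate without knowledge: {wins / trials:.3f}")  # ~0.333
```

Extra intelligence changes nothing here; only extra information does.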
A superintelligence doesn't need to care which door the new car is behind because it already owns the car factory, the metal mines, the sources of plastic and rubber, and the media.
Also it actually can tell you which door you hid the car behind, because unlike with the purely mathematical game, your placement isn't random, and your doors aren't perfect. Between humans being quite predictable (especially when they try to behave randomly) and the environment leaking information left and right in thousands of ways we can't imagine, the AI will have plenty of clues.
I mean, did you make sure to clean the doors before presenting them? That tiny layer of dust on door number 3 all but eliminates it from possible choices. Oh, and it's clear from the camera image that you get anxious when door number 2 is mentioned - you do realize you can take pulse readings by timing the tiny changes in skin color that the camera just manages to capture? There was a paper on this a couple years back, from MIT if memory serves. And it's not something particularly surprising - there's a stupid amount of information entering our senses - or being recorded by our devices - at any moment, and we absolutely suck at making good use of it.
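For the skeptical: the pulse-from-camera trick is real (the MIT work I'm thinking of is, I believe, Eulerian Video Magnification), and the simplest version of the idea is just spectral analysis of skin color over time. A rough sketch, assuming you've already extracted the mean green-channel value of a face region for each frame:

```python
import numpy as np

def estimate_bpm(green_means, fps):
    """Crude remote photoplethysmography: blood flow slightly modulates
    skin color, so the per-frame mean green value of a face region
    (assumed here to be a 1-D numpy array) carries a weak periodic
    signal at the heart rate."""
    signal = green_means - np.mean(green_means)   # drop the DC offset
    signal = signal * np.hanning(len(signal))     # taper against spectral leakage
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fps)
    band = (freqs > 0.7) & (freqs < 4.0)          # plausible heart rates, ~42-240 bpm
    peak = freqs[band][np.argmax(spectrum[band])]
    return peak * 60.0

# e.g. 30 s of video at 30 fps, green_means from any off-the-shelf face tracker:
# print(estimate_bpm(green_means, fps=30))
```

The point stands either way: the signal is sitting there in plain video, and most of us never even think to look for it.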
Maybe the superintelligence builds this cool social media platform that results in a toxic atmosphere where democracy is taken down, and from there all kinds of bad things ensue.
> Even if you give it all of these things, there's no manual for how to use those to get to, for example, military servers with secret information. It could certainly figure out ways to try to break into those, but it's working with incomplete information . . ., and so on, and so on...
Please recall that the "I" in AI stands for "Intelligence". The challenges you describe are exactly the kind of thing that general intelligence is a solution to. Figuring things out, working with incomplete information, navigating complex, dynamic obstacles - it's literally what intelligence is for. So you're suggesting we stop a hyper-optimized general puzzle-solving machine by... throwing some puzzles at it?
This line of argument both lacks imagination and is kinda out of scope anyway: the AI x-risk argument assumes a sufficiently smart AI, where "sufficiently smart" likely starts somewhere around below-average human level. I mean, surely if you think about your plan for 5 minutes, you'll find a bunch of flaws. The kind of AI that's existentially dangerous is the kind that's capable of finding some of the flaws you would. Now, it may still be somewhat dumber than you, but that's not much comfort if it's able to think much, much faster than you - and that's pretty much a given for an AI running on digital computers. Sure, it may find only the simplest cracks in your plan, but once it does, it'll win by thinking and reacting orders of magnitude faster than we can.
Or in short, it won't just get inside our OODA loop - it'll spin its own OODA loop so fast it'll feel like it's reading everyone's minds.
So that's the human-level intelligence. A superhuman-level intelligence is, obviously, more intelligent than us. What that means is, it'll find solutions to challenges that we never thought of. It'll overcome your plan in a way so out-of-the-box that we won't see it coming, and even after the AI wins, we'll have trouble figuring out what exactly happened and how.
All that is very verbose and may sound specific, but it is in fact fully general and follows straight from the definition of general intelligence.
As for the "self-improvement ->" part of "self-improvement -> doom scenarios", the argument is quite simple: if an AI is intelligent enough to create (possibly indirectly) a more intelligent successor, then unless intelligence happens to be magically bounded at human level, what follows without much hand-waving is, you can expect a chain of AIs getting smarter with each generation, eventually reaching human-level intelligence, and continuing past it to increasingly superhuman levels. The "doom" bit comes from realizing that a superhuman-level intelligence is, well, smarter than us, so we stand as much chance against it as chickens stand against humans.
> We're setting ourselves up for another "failure of imagination".
But we're not being scientific. We're over-indexing on wild imagination.
This game of "what if" is being played by everyone except for the stakeholders that are telling you to slow down and listen to what's actually being built. Laymen and sci-fi daydreamers are selling fears and piling up hurdles that do not match the problem.
In any case, none of this hand-wringing will actually result in tangible "extinction prevention" regulation. So I'm not worried on that front. The concern is that this gives regulators a smoke screen to make it harder for small companies to compete. And that's what's actually happening right now.
What currently exists, what's currently being built, are:
* "Jack of all trades, master of none" general models like GPT and other LLMs, or like diffusion image generator models
* Hyper-focussed expert-to-superhuman performance models like AlphaZero (beats all humans), Cicero (Facebook's Diplomacy player, ranked top 10%), Pluribus (Facebook's no-limit Texas hold 'em poker player), and the one whose name I forget that learns how to map WiFi interference into pose estimation so well it can be used for heart rate/breathing sensing, etc.
And of course, we've got people who say "this can't possibly fail!" who then take a model they don't understand, put it in a loop, give it some money, and then it does something unexpected. Mostly this only results in small-scale problems, but approximately all automation so far has failure modes[0] and there's no reason to presume this trend won't continue even when it's as smart as a human.
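The "put it in a loop" part isn't exotic, either - it's a few lines of glue code. A caricature of the pattern (`call_model` and `run_action` are made-up stand-ins for whatever model API and tool plumbing someone actually wires up):

```python
def call_model(prompt: str) -> str:
    """Hypothetical: returns the model's next proposed action as text."""
    raise NotImplementedError

def run_action(action: str) -> str:
    """Hypothetical: executes the action (shell command, API call,
    purchase, ...) and returns whatever it produced."""
    raise NotImplementedError

history = ["Goal: grow this $100 budget however you see fit."]
while True:                        # no termination condition...
    action = call_model("\n".join(history))
    result = run_action(action)    # ...and no check that the action is one
                                   # the operator would actually have approved
    history.append(f"{action} -> {result}")
```

Nothing in that loop knows or cares whether the next action is "unexpected"; that judgment lives entirely in the operator's head.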
If it gets smarter than a human, it might still make actual, objective mistakes, but we also have to consider that, within the frame of reference of its goals[1], it may be perfect, and yet those goals just aren't compatible with ours - perhaps not as individuals, nor as societies, nor as a species.
[0] My favourite example: during the Cold War, the early-warning radar at Thule Air Force Base reported a huge radar signature coming over the horizon. All indications were a massive Soviet first strike!
The operators had forgotten to tell the system that the Moon wasn't supposed to respond to an IFF ping.
[1] Regardless of whether those goals are self-made or imposed from outside by how we humans construct its rewards; before anyone asks about robot free will or whatever, the cause of those goals doesn't matter in this case.
If malware was an order of magnitude worse than it is today, then you'd see military death squads breaking down doors and stamping it out. Kind of like the navy with respect to (actual) piracy and international trade.
We reach the saddle points and equilibria that we do because most of the distributed parties are constraint-satisfied. Tip the scale, incur stronger balancing.
You make a lot of assumptions here. Firstly, that these advanced systems are controllable. I'm not convinced we even understand what real intelligence is, nor, if we can achieve it, whether it's even containable once applied to anything.
This assumes a level of institutional control that is nearly impossible (even now). Even if hardware is prohibitively expensive now, I can't imagine training compute will remain that way for long.
> Nobody is going to give these machines access to the nuclear launch codes
No, but they will empower the machines to help detect nuclear launches and the first time one of them issues a false positive we may be screwed.
Yes there will always be human oversight. No, there won't always be enough time to verify what the machine says before making a counter-launch decision.
You could have said the same about every catastrophe that got out of control. Chances are they will eventually gain unauthorized access to something, and we will either get lucky or get the real-life Terminator series (minus time travel, so we're f**ed).
> You could have said the same about every catastrophe that got out of control.
Such as what? War?
Climate change still hasn't delivered on all the fear, and it's totally unclear whether it will drive the human race extinct (clathrate gun, etc.) or make Russia an agricultural and maritime superpower.
We still haven't nuked ourselves, and look at what all the fear around nuclear power has bought us: more coal plants.
The fear over AI terminator will not save us from a fictional robot Armageddon. It will result in a hyper-regulatory captured industry that's hard to break into.
> I think the more probable outcome is corp-owned robot slaves. That's the future we're more likely headed towards.
> Nobody is going to give these machines access to the nuclear launch codes, air traffic control network, or power grid. And if they "get out", we'll have monitoring to detect it, contain them, then shut them down.
We'll endlessly lobotomize these things.