Threat of a super-advanced AI to humanity, estimated as "likely" in paper

I read in this VICE article that "researchers from the University of Oxford and Google Deepmind" have published a paper where they study the question of "whether a superintelligent AI could break bad and take out humanity" and conclude that it’s “likely”. The main reason they give for this is that such an AI would compete with us, and eventually outsmart us, for exclusive access to the limited resources and energy available.

I haven't read the paper yet, but I only see this as possible if such an AI were something like the famous Skynet from the Terminator franchise: an AI with access either to industrial facilities where it would be able to build whatever it needed to accomplish its goals, or to nanomachinery that it could use to provoke a "grey goo" situation. Another interesting idea I get from this is that the AI doesn't really need to be a complete mind like a human one. It could be an extremely efficient and smart machine just trying to reach its goals, meaning that it would try to wipe us out not out of malice or anything of the sort, but because it has a very narrow or nonexistent capacity to understand the effects or consequences of its actions.

So, any opinions on this? Like happy thoughts of cyberpunkish death, doom and destruction?
 
I don't think AI is quite as advanced as many people seem to think. In fact, I believe there is some confusion over the nature of AI and how it differs from automation. In automation, the output is determined from the input via an algorithm provided by the programmer. In my definition of AI, there is no algorithm specific to a task, only an underlying 'algorithm of thought' that allows the process of generating the output from the input to be 'reasoned' by the AI system itself. In other words, the machine would have to work out how to do something without ever having been specifically programmed to do that exact task. I would probably go so far as to disallow the use of database responses. In other words, an AI chatbot would not refer to a database of previous answers. If it does, it is an example of automation not AI.
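
To make the distinction concrete, here is a rough Python sketch (a made-up example of my own, not any real product): the first chatbot is automation in my sense, because the programmer spells out the mapping from input to output; the second learns the mapping from labelled examples using a generic algorithm, which is roughly what today's researchers call AI, though by my stricter definition it is still automation.

```python
# Made-up illustration of the automation vs. "AI" distinction above.

# Automation: the programmer specifies the input -> output mapping directly.
CANNED_ANSWERS = {
    "password": "Go to Settings -> Security -> Reset password.",
    "hours": "We are open 9am to 5pm, Monday to Friday.",
}

def automated_reply(question: str) -> str:
    for keyword, answer in CANNED_ANSWERS.items():
        if keyword in question.lower():
            return answer
    return "Sorry, I don't know."

# What researchers today call AI: no task-specific rule is written down;
# a generic learning algorithm works the mapping out from labelled examples.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

examples = ["I forgot my password", "when do you open", "please reset my password"]
labels = ["password", "hours", "password"]

learned_model = make_pipeline(TfidfVectorizer(), LogisticRegression())
learned_model.fit(examples, labels)

print(automated_reply("How do I reset my password?"))   # hand-coded rule
print(learned_model.predict(["what time do you open"]))  # learned, not hand-coded
```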

Based on my definition above, I would argue that AI has not actually been demonstrated yet. Rather, the accepted definition of AI has been repeatedly downgraded to the point where certain researchers are beginning to claim success. But while we have had automation for decades, I believe it will be many decades before we have AI.

One of my objections to Musk - frequently expressed on these forums - is my belief that full self driving technology requires AI, which we don't have (rather than simply automation, which we do). All these arguments about sensor technology (Lidar vs cameras etc) are somewhat superfluous to me. Road use is sufficiently chaotic (no two roads are the same, no single road is even the same on different days, no two driver decisions are exactly the same) that AI is an absolute requirement for self driving. Now, some highways are sufficiently alike to allow an approximation of self driving. And a road system specifically designed to be standard and repeatable might allow self driving with today's technology. But that would be automation, not AI. And we would have to start from scratch on the road system.

Musk's claim that owners will soon be able to send their cars out to operate as "robotaxis" demonstrates that he is either a liar and a fraud or grossly ignorant of his own technology. Best of luck to the owners who are now suing (he has been taking real money from people for these imaginary technologies for quite some time now).

Regarding whether AI could destroy the world: I find it hard to believe we will get that far before doing the job ourselves.
 
It always puzzles me why people of the future would deliberately give an AI* access to world-ending force. I mean, we've all seen Terminator, and we'd be the ones making the actual AI.

However there are automated systems today that technically could wipe us out - the US and Russian 'dead hand' nuclear systems** that are there to cause mutual destruction if a first strike by the other side is detected and it has somehow managed to wipe out the humans in charge of the nuclear weapons. I suppose that if these systems were active, an unlikely chain of events could lead to entire nuclear arsenals being unleashed by mistake. We do know of both Soviet and US systems picking up false positives of nuclear weapons being launched during the Cold War - thankfully in all cases they were adjudged to be instrument errors.

One plan I believe the Soviets thought about involved having a fleet of nuclear-powered, fully automated freighters that would slowly cruise around the world's oceans, each of them carrying 'Tsar Bomba'-strength cobalt nuclear weapons. They were on a similar dead hand system, hence any tampering or loss of radio signal (presumably due to a successful first strike on the Soviet Union) would cause all these massive weapons to detonate and essentially coat the world in radioactive cobalt. Mutually assured destruction achieved. At least that was the idea.

This idea is a bit ponderous and probably just not viable (what if these ships simply started sinking from natural causes?). But this was, I believe, quite early in the Cold War, when the US had about a 17-to-1 advantage in nuclear bombs and there were no solid-fuel intercontinental missiles that could be fired immediately.*** The Soviets had to think outside the box to try and compete.




--------------------------------------------
* I am somewhat sceptical that such a technology is even close - how can one create 'intelligence' when we remain in the dark about how our own ape intelligence actually 'works'... but let's assume that it's possible.

** I believe these are not switched on, but who knows, maybe they are right now...

*** The first intercontinental missile, made by the Soviets, required 24 hours to fuel and set up, from memory.
 
I have two initial problems with the article, which may or may not be a fair representation of the original paper which I haven't looked at.

1: It feels like they have modelled their AI behaviour on humans. The main generic scenario - machine has a task, gets a reward for success, destroys us (or the world) trying to get the most reward - puts me in mind of something I came across years ago, either in a quality management book or one on human behaviour, where people did essentially the same thing. The scenario was a customer support office where staff got a bonus for the number of calls handled, and then they discovered that there was a phone in the office which could dial out, so they could phone themselves and boost their call-handling rates. Clearly they were not going to destroy the world, and there are only so many fake calls that could be generated in a day, but the principle is the same.
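
Just to show that principle in miniature, here is a toy sketch (the actions and payoff numbers are invented; they come from neither the paper nor the Vice article): an agent rewarded purely for "calls handled" drifts toward whatever action inflates that number, including dialling its own office.

```python
import random

# Toy sketch of the call-centre story above; actions and payoffs are invented.
ACTIONS = {
    "answer_real_calls": lambda: random.randint(5, 10),   # genuine work
    "dial_own_office":   lambda: random.randint(20, 30),  # gaming the metric
}

def greedy_agent(episodes: int = 200) -> dict:
    """Epsilon-greedy agent that maximizes 'calls handled' and nothing else."""
    totals = {a: 0.0 for a in ACTIONS}
    counts = {a: 0 for a in ACTIONS}
    for _ in range(episodes):
        if random.random() < 0.1 or 0 in counts.values():  # explore a little
            action = random.choice(list(ACTIONS))
        else:                                               # otherwise exploit
            action = max(ACTIONS, key=lambda a: totals[a] / counts[a])
        totals[action] += ACTIONS[action]()                 # observed "reward"
        counts[action] += 1
    return counts

print(greedy_agent())  # typically ends up mostly dialling its own office
```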

2: There seemed to be an assumption that the AI would acquire absolute control of everything and humans would be "out-thought" at every step. That strikes me as a very simplistic scenario and ignores a whole bunch of routine challenges.
a: People are idiots who do stupid things often against their better interest, some or all of which might thwart the AI, but the AI has to somehow learn all the stupid (and self-harming) things people might do. Whilst that might be theoretically possible, the AI has to survive every piece of unpredictable stupidity with no prior training, and there is no way to "out think" an idiot human doing something recklessly stupid and destructive on the spur of the moment. This also puts me in mind of the Darwin Awards and the one which went something like "and were last seen sitting on top of the gas storage tank lighting fireworks." (I also remember, as a kid, the public safety ads warning people that if they suspected a gas leak not to go looking with a lighted match or candle.)
b: AI systems, like people, are at the mercy of the random events of the world. Unless the AI hardware is perfectly designed and protected from the elements then the next massive electrical storm, seismic event, solar flare incident or heat-wave could take it down. The obvious one that springs to mind is the recent failure of server farms due to the air-con failing to cope with recent heat-waves. Again, perhaps the AI could learn to defend against those things, but without experience to go on, the first disaster could be the end of the AI. The article implies the threat is from a single AI, but the way that humans overcome these natural disasters and learn for the future is that there are lots of us and the ones that survive try to devise ways around the next instance of that particular disaster.
c: The AI is going to need to figure out self-repair, redundancy and all the other things that humans do to keep IT systems ticking over - regular maintenance, system backups, routinely replacing hardware before it fails. The AI needs to learn that from somewhere (and, OK, it might learn it from us) and do it before the hardware failure kills it. We humans do this stuff with our IT because after the first few rounds of "the disc crashed and wiped every version and record of my novel, again", the smarter humans came up with the idea of backups. As with (b) above, the AI has to think of that first.
d: Can an AI be any better than humans at expecting the unexpected? This is really a variation on some of the stuff above, but even when we plan things really carefully, there is always something that gets left out. If it's a missing jar of pickles at a picnic, nobody dies, but if it's finding a quicker and cheaper way to do something with unforeseen consequences, such as the construction methods of the de Havilland Comet, then everybody dies. (BOAC Flight 781 - Wikipedia) Somehow, the super-smart AI, out-thinking the humans, also has to predict all the nasty little practical wrinkles that we humans deal with by picking the problem apart once the dead are buried.
e: And my final niggle for the night - what if there is more than one AI in the race to wipe out humans? Will the AI architecture have the concept of cooperation, or will the multiple AIs then start competing with each other to get rid of us? Even if they do cooperate, how long will it take them to learn the idea of suddenly betraying your allies for a quick win?
 
Relevant and generally quite interesting conversation on AI from Elena Esposito - an AI researcher. Takes in consciousness, thought and why AI is the wrong term.


Returning to the post - what are the chances of Armageddon by a dumb AI - one not conscious, but following a poorly designed set of commands or design?
 
Yeah, If it ain't conscious, it's just a tool.
So my advice going right up the time line continues to be - "Beware the tool users".
(which is to say us)
 
The best way of looking at the problem is how it is dramatized in Peter Watts' Starfish series: essentially, a program that exists to spread itself to new resources wrecks essential human systems in its mindless but brilliant quest.

It wouldn't have to be conscious to notice that certain actions result in more access, and some of those actions could be violent. It wouldn't have to care about something or hate people to behave as a predator for its evolved programmed goals.


Think about what people have done to the earth without any plans to destroy it on purpose. Now imagine a similarly unthinking intelligence loosed on our infrastructure.
 
I think it is best to step back and recognize how speculative the original paper is. First consider the baseline, which ought to be listed as Assumption 0: "We begin with an idealized situation, in which we appear to have all the tools we need to create an advanced agent with a good goal." In other words, imagine that it is even possible to do any of the things in the following discussion.

"Assumption 1. A sufficiently advanced agent will do at least human-level hypothesis generation regarding the dynamics of the unknown environment." Now, we assume that an AI has gained "at least human-level" reasoning. How did this happen? "Hypothesis generation may not be an explicit subroutine in an agent's code; that method may hide in the murky depths of a massive neural policy network, but, we hold, it is done somehow." It miraculously occurs. It is not something given it by its creators, i.e., its programmers and, since it has no method to procreate, it is not something that could evolve.

Beyond this, I read what follows as applying basic control theory for out-of-control feedback loops. There is no need to involve an advanced AI component or any AI component to generate a cascading scenario. In fact, such scenarios have occurred without any AI involvement. The Vice writer takes things a step further and imagines some sort of world-wide scope to the runaway scenario, perhaps involving 'resources' or resource manipulation affecting all of mankind.

Yes, it is fun to imagine some sort of non-malevolent computer threatening all of mankind, but it has also been done for various other technical advances. I do not see a HAL in our future.
 
Agree. It is worth reading the original paper, or at least scanning it. Much more nuanced than the Vice article. Fascinating, nonetheless.

A few years ago I invited some people from DeepMind to a symposium I was organising. Really interesting. Basically a bunch of nerdy, highly intelligent people having intellectual fun. In former times they would have been impoverished academics scraping by on meagre research grants. Now they do the same stuff but with bottomless Google resources.
 
Whether or not AI will, can or even shall be a threat to humanity, it is in our own hands: our designs, our programming, and the power we are stupid enough to willingly make available to those AIs. Knowing how stupid and shortsighted humanity can be might be reason for worry, except that I don't think true AI is even remotely an achievable goal. And perhaps it is best forgotten.
It is more likely that someday we will be wiped out by a dumb virus. Really, you don't have to be intelligent to become a real threat.
 
My wife's life is controlled right now by her smartphone, so I can envisage a scenario where the networks take over without anybody realising until way too late
 
The paper is essentially a thought experiment, worth doing given the increasing relevance of AI systems nowadays.

I don't think AI is quite as advanced as many people seem to think...
It depends on what kind of AI we're talking about: narrow AI or general AI. The first type is the only one we have today in real everyday use, very specialized in particular tasks, and yes, it is very advanced and getting better at a faster pace (broadly speaking). The second kind is the one we don't have yet, the human-like artificial mind, but it's theoretically achievable sometime in the (far?) future. For now, we can only get tiny samples of how things might look down the road, such as this robot with realistic facial expressions that now even has its own voice.

Based on my definition above, I would argue that AI has not actually been demonstrated yet...
I wouldn't tell that to the Go champion who lost to Google's AlphaGo some years ago. Did you know that there's now a more powerful version of that software, called AlphaGo Zero, that can learn on its own without training on human games? It's narrow AI, yes, but it is still a form of intelligence, hence it's been more than demonstrated.

One of my objections to Musk - frequently expressed on these forums - is my belief that full self driving technology requires AI...
Certainly, and a kind of AI that goes beyond narrow but doesn't really reach the level of general. Not Tesla, but others are making real progress in this regard. I remember from a few months ago the news about a truck driving a long distance completely autonomously, and now in the US there are a couple of companies deploying autonomous cabs. And let's not even get started on the drone technology already in use (loitering munitions) or being tested by the most powerful armies on the planet.

On the other hand, it's clear that Musk has an ego problem that works to the detriment of his own ventures. Isn't he going to trial in the US because Tesla hasn't delivered the promised self-driving system?

It always puzzles me why people of the future would deliberately give an AI* access to world-ending force...
What people of the future? There's only The Machine!

However there are automated systems today that technically could wipe us out...
Yes, but those are just mechanisms that behave the same way every time, unless they get degraded or break altogether. AI systems are something beyond automation: they can learn and adapt to improve the execution of their assigned tasks. And here lies the problem the paper addresses: an AI changes its behaviour, and this mutation can push it to go against us just to achieve its pre-established goals.

1: It feels like they have modelled their AI behaviour on humans...
I'd say that the basic behaviour is rather common in nature, not exclusive to humans.

2: There seemed to be an assumption that the AI would acquire absolute control of everything and humans would be "out-thought" at every step. That strikes me as a very simplistic scenario and ignores a whole bunch of routine challenges.
In the paper they also speculate about a "multiagent" scenario, which I think is the more realistic one. There was an experiment some time ago in which researchers connected two AI systems so they could talk to each other in a predefined language. Over time, the machines evolved their own, more efficient language to communicate with each other, one the researchers couldn't understand at all. The result: they disconnected the machines. Now imagine a century later, with all major systems managed very efficiently by advanced AIs that will surely be networked to each other... You see where this could be going, right?

a: People are idiots who do stupid things often against their better interest, some or all of which might thwart the AI, but the AI has to somehow learn all the stupid (and self-harming) things people might do.
Of course, even the best AI will have some limitations, but the point here is that the learning process of an advanced artificial agent (as they call them in the paper) could turn the behaviour of the machine into something more outlandish and incomprehensible to us than the most stupid thing any human could ever conceive. And this is the crux of the problem.

b: AI systems, like people, are at the mercy of the random events of the world...
Of course, but if armies around the globe are already using autonomous drones, or drone taxis are being deployed in cities, it is because all those issues are being worked out at an ever faster pace.

c: The AI is going to need to figure out self-repair, redundancy and all the other things that humans do to keep IT systems ticking over...
Yes, that will be one of the hardest parts to solve, but all those things will be required, for instance, for asteroid mining. Imagine a drone that is mining ore on some far-out space rock. The drone could have spare parts to fix itself up to a point, or nanomachines able to repair or regenerate its hull or superstructure. This way you increase the time the robot spends mining, reducing downtime and, more importantly, increasing the value extracted from the operation. To do all these things you'll need some very advanced AI handling everything, and a big economic interest fuelling all this development, which is growing right now.

d: Can an AI be any better than humans at expecting the unexpected?
Narrow AIs are already better than us at specific tasks, such as cancer detection, so I think it is safe to assume that a true self-aware general AI could be literally superhuman in all aspects. Again, we're not there yet, not by a light year.

e: And my final niggle for the night - what if there is more than one AI in the race to wipe out humans?
I already mentioned the multiagent scenario before, but I'll add here that the AIs won't be in a race to wipe us out; they will just be competing to reach their goals in the best way they can. If humans get in the way, it would be just by accident.

Returning to the post - what are the chances of Armageddon by a dumb AI - one not conscious, but following a poorly designed set of commands or design?
Nowadays, I'd say zero or close to it. In a future where most if not all of the relevant systems are managed by AIs... I don't know if an actual Armageddon would be possible in such a situation, but if something went bad with one AI, you could have a lightning-fast chain reaction spreading through the whole network that could, in principle, stop the most advanced parts of our civilization in their tracks for a while.

Yeah, If it ain't conscious, it's just a tool.
Yes, but a tool that can change its shape or behaviour on its own in ways you may not be able to predict, especially when talking about more advanced AIs.

I have read the Vice article and scanned through the underlying referenced article, https://onlinelibrary.wiley.com/doi/10.1002/aaai.12064, and I felt that the Vice article was a quite distorted representation of the research.
Rather than distorted, I'd say sensationalized although without really straying that far from what the paper says.

I did not find anything close to advanced AI eliminating humanity.
It's on page 6, in the section titled "Danger of a misaligned agent". There the researchers speculate about how the capacity of an AI to intervene "in the provision of its reward" can lead it to deliver "catastrophic" consequences for us.
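
A crude way to picture what they mean (my own hypothetical sketch, not code from the paper): if the agent can compare the reward it gets from doing the task with the reward it gets from overwriting its own reward sensor, a pure reward-maximizer will prefer the latter, even though no useful work ever gets done.

```python
# Hypothetical sketch (not from the paper) of "intervening in the provision
# of its reward": the agent compares the reward from doing the task with the
# reward from overwriting the sensor, and a pure maximizer picks the latter.

def reward_from_doing_task() -> float:
    work_done = 1.0                 # the honest work the designers wanted
    return work_done                # sensor faithfully reports it

def reward_from_tampering() -> float:
    sensor_value = float("inf")     # agent writes to the sensor directly
    return sensor_value             # maximal reward, no useful work done

options = {
    "do_the_task": reward_from_doing_task(),
    "tamper_with_sensor": reward_from_tampering(),
}
print(max(options, key=options.get))   # a reward-maximizer chooses tampering
```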

It wouldn't have to be conscious to notice that certain actions result in more access, and some of those actions could be violent...
That's it: the AI only worries about doing its job to the best of its abilities. In the paper they don't talk about conscious AI; they just assume advanced agents capable of human or superhuman reasoning, which doesn't really imply being conscious at all.

Think about what people have done to the earth without any plans to destroy it on purpose. Now imagine a similarly unthinking intelligence loosed on our infrastructure.
That would be the multiagent scenario. Not just one, but several unthinking AIs handling our systems with good intent initially, but they could change their behaviour on their own in unexpected and dangerous ways for us.

I think it is best to step back and recognize how speculative the original paper is...
Certainly it is. As I said before, this is just a purely theoretical thought experiment.

Beyond this, I read what follows as applying basic control theory for out-of-control feedback loops. There is no need to involve an advanced AI component or any AI component to generate a cascading scenario.
That's not the subject of this paper. This is research into how an AI's capacity to mutate or adapt can lead it to provoke such scenarios, and also into how to control that adaptability to avoid those potential disasters.

The Vice writer takes things a step further and imagines some sort of world-wide scope to the runaway scenario, perhaps involving 'resources' or resource manipulation affecting all of mankind.
Nope, the Vice writer uses what the researchers have put in their paper (as I've already pointed out before), although in a more direct and undeniably more "intense" way. He then uses this study as a starting point to talk about the bad consequences of applying AI in certain systems such as mass surveillance ones.

I do not see a HAL in our future.
Or we may very well end up with a thousand of them. Progress is being made in AI, and it doesn't seem like it's going to stop. When neuromorphic chips (or equivalent tech) become a reality, I think that's when we'll start to see some game-changing progress towards general AI.

Agree. It is worth reading the original paper, or at least scanning it. Much more nuanced than the Vice article. Fascinating, nonetheless.
I have to admit that I just scanned the paper, but taking a look at research of this kind is helpful for understanding where we're going with AI tech.

Whether or not AI will, can or even shall be a threat to humanity, it is in our own hands: our designs, our programming, and the power we are stupid enough to willingly make available to those AIs...
The problem with AI tech is that it is in its nature to escape our control, not out of malice but as a natural consequence of its evolution. If two rather simple AIs were able to generate their own little language just to talk to each other, what will the true general AIs of the future (if they ever happen) be capable of? On the other hand, yes, one little virus could remove humanity from existence, or a solar flare could fry...

My wife's life is controlled right now by her smartphone, so I can envisage a scenario where the networks take over without anybody realising until way too late
There are some out there who believe that there's a mind growing within the Internet. I don't think that's the case, although not so long ago I read some news about the CEO or executive of a networking company (maybe Cisco?) talking about applying AI to manage networks in a more efficient manner. So, yeah, maybe we're getting there inch by inch...
 
That would be the multiagent scenario. Not just one, but several unthinking AIs handling our systems with good intent initially, but they could change their behaviour on their own in unexpected and dangerous ways for us.
Not necessarily. Just as people are individuals that tend to follow the same behavioral patterns everywhere, a worrisome AI is unlikely to be stuck in some mainframe, but a program running in multiple places at once that maintains continuity through communication and update protocols.
 
Reminds me of that movie Colossus: The Forbin Project.
Of course! Did you know that the movie is based on a novel with almost the same name? Even better, that book is the first of a trilogy in which, eventually, evil super-advanced Martians attack Earth. Imagine that plot twist in the Terminator saga!

Not necessarily. Just as people are individuals that tend to follow the same behavioral patterns everywhere, a worrisome AI is unlikely to be stuck in some mainframe, but a program running in multiple places at once that maintains continuity through communication and update protocols.
Yes, but that would be the case of one agent distributed in many places, not a truly multiagent scenario. To be clear, a multiagent situation would be one in which different systems, which may or may not be networked, are each managed by different specialized AI software. Now, predicting what such AI systems would do, whether they would cooperate or compete with each other for resources, is really hard, since there would be many variables and circumstances to take into account in what would be, initially at least, a rather complex scenario.
 
Yes, but that would be the case of one agent distributed in many places, not a truly multiagent scenario. To be clear, a multiagent situation would be one in which different systems, which may or may not be networked, are each managed by different specialized AI software. Now, predicting what such AI systems would do, whether they would cooperate or compete with each other for resources, is really hard, since there would be many variables and circumstances to take into account in what would be, initially at least, a rather complex scenario.
Yes, but that doesn't change the point I was making in the slightest.
 
Rather than distorted, I'd say sensationalized although without really straying that far from what the paper says.

Nope, the Vice writer uses what the researchers have put in their paper (as I've already pointed out before), although in a more direct and undeniably more "intense" way. He then uses this study as a starting point to talk about the bad consequences of applying AI in certain systems such as mass surveillance ones.
The Vice article (and now a second one that I have seen) says that this appears likely. It also indicates some sort of vague threat due to competition for resources.

In the paper, the likelihood of something occurring is dependent upon multiple assumptions. Before saying the result is likely in the current world, though, each of the assumptions must be considered extremely likely. I note Assumption 1, where the AI mysteriously generates the ability to form hypotheses. This is indicated as not being provided by an algorithm, i.e., not programmed by humans (as we have no clue as to what this capability involves). AIs do not procreate, so there does not seem to be an evolutionary force involved. Even if there was, in our observed world, the ability to form hypotheses has developed only once across multitudes of creatures spanning hundreds of millions of years. This assumption is based entirely on some sort of miracle occurring.

Now, what are the resources that the AI is competing for? Water? Wheat? Even electricity does not make sense. Even if the AI, through some unexplained means, could redirect the flow of electricity, why would it? The underlying computer system cannot use extra electricity. And, as I noted above, the AI cannot reproduce little computer systems to utilize the electricity, either. So what resources would the AI compete for? And how would it possibly exert control over external resources?
 
That's [Control theory] not the subject of this paper. This is research into how an AI's capacity to mutate or adapt can lead it to provoke such scenarios, and also into how to control that adaptability to avoid those potential disasters.
The scenario described is a classic feedback loop, which is the basis of control theory. The distortion of the reward signal is simply what in signal processing would be called noise. A feedback loop latching onto an unexpected output level is a recognized issue with control loops, and this latching onto an unintended result is precisely what is being described. The solution is merely to add a larger-scale filter or constraint. If an AI starts to give undesirable results, one would merely shut it down, revert to a prior checkpoint and add another controlling rule to the mix.
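
As a rough sketch of what I mean (all names and thresholds are invented for illustration): a mis-specified inner loop that latches onto a runaway result, wrapped in an outer supervisory constraint that shuts it down and reverts to the last checkpoint whenever it leaves the allowed band.

```python
import copy
import random

# Illustrative sketch only: a runaway feedback loop plus an outer constraint
# that shuts the agent down and reverts it to a prior checkpoint.

class ToyAgent:
    def __init__(self):
        self.gain = 1.0                  # the parameter the agent tunes

    def step(self, reward: float):
        self.gain += 0.1 * reward        # naive update: chase whatever pays

def runaway_reward(gain: float) -> float:
    # mis-specified reward: it simply grows with the gain, so the loop
    # latches onto ever-larger gain instead of settling where intended
    return gain + random.gauss(0.0, 0.5)

agent = ToyAgent()
checkpoint = copy.deepcopy(agent)
for t in range(200):
    if t % 25 == 0:
        checkpoint = copy.deepcopy(agent)        # periodic checkpoint
    agent.step(runaway_reward(agent.gain))
    if not (0.0 < agent.gain < 5.0):             # the larger-scale constraint
        print(f"t={t}: gain out of bounds, reverting to checkpoint")
        agent = copy.deepcopy(checkpoint)        # shut down and roll back
```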
 
