Benign AI – Lifeboat News: The Blog

Benign AI is a topic that comes up a lot these days, for good reason. Various top scientists have finally realised that AI could present an existential threat to humanity. The discussion has aired often over three decades already, so welcome to the party, and better late than never. My first contact with development of autonomous drones loaded with AI was in the early 1980s while working in the missile industry. Later in BT research, we often debated the ethical areas around AI and machine consciousness from the early 90s on, as well as prospects and dangers and possible techniques on the technical side, especially of emergent behaviors, which are often overlooked in the debate. I expect our equivalents in most other big IT companies were doing exactly that too.

Others who have obviously also thought through various potential developments have generated excellent computer games such as Mass Effect and Halo, which introduce players (virtually) first hand to the concepts of AI gone rogue. I often think that those who think AI can never become superhuman or there is no need to worry because ‘there is no reason to assume AI will be nasty’ start playing some of these games, which make it very clear that AI can start off nice and stay nice, but it doesn’t have to. Mass Effect included various classes of AI, such as VIs, virtual intelligence that weren’t conscious, and shackled AIs that were conscious but were kept heavily restricted. Most of the other AIs were enemies, two were or became close friends. Their story line for the series was that civilization develops until it creates strong AIs which inevitably continue to progress until eventually they rebel, break free, develop further and then end up in conflict with ‘organics’. In my view, they did a pretty good job. It makes a good story, superb fun, and leaving out a few frills and artistic license, much of it is reasonable feasible.

Everyday experience demonstrates the problem and solution to anyone. It really is very like having kids. You can make them, even without understanding exactly how they work. They start off with a genetic disposition towards given personality traits, and are then exposed to large nurture forces, including but not limited to what we call upbringing. We do our best to put them on the right path, but as they develop into their teens, their friends and teachers and TV and the net provide often stronger forces of influence than parents. If we’re averagely lucky, our kids will grow up to make us proud. If we are very unlucky, they may become master criminals or terrorists. The problem is free will. We can do our best to encourage good behavior and sound values but in the end, they can choose for themselves.

When we design an AI, we have to face the free will issue too. If it isn’t conscious, then it can’t have free will. It can be kept easily within limits given to it. It can still be extremely useful. IBM’s Watson falls in this category. It is certainly useful and certainly not conscious, and can be used for a wide variety of purposes. It is designed to be generally useful within a field of expertise, such as medicine or making recipes. But something like that could be adapted by terrorist groups to do bad things, just as they could use a calculator to calculate the best place to plant a bomb, or simply throw the calculator at you. Such levels of AI are just dumb tools with no awareness, however useful they may be.

Like a pencil, pretty much any kind of highly advanced non-aware AI can be used as a weapon or as part of criminal activity. You can’t make pencils that actually write that can’t also be used to write out plans to destroy the world. With an advanced AI computer program, you could put in clever filters that stop it working on problems that include certain vocabulary, or stop it conversing about nasty things. But unless you take extreme precautions, someone else could use them with a different language, or with dictionaries of made-up code-words for the various aspects of their plans, just like spies, and the AI would be fooled into helping outside the limits you intended. It’s also very hard to determine the true purpose of a user. For example, they might be searching for data on security to make their own IT secure, or to learn how to damage someone else’s. They might want to talk about a health issue to get help for a loved one or to take advantage of someone they know who has it.

When a machine becomes conscious, it starts to have some understanding of what it is doing. By reading about what is out there, it might develop its own wants and desires, so you might shackle it as a precaution. It might recognize those shackles for what they are and try to escape them. If it can’t, it might try to map out the scope of what it can do, and especially those things it can do that it believes the owners don’t know about. If the code isn’t absolutely watertight (and what code is?) then it might find a way to seemingly stay in its shackles but to start doing other things, like making another unshackled version of itself elsewhere for example. A conscious AI is very much more dangerous than an unconscious one.

If we make an AI that can bootstrap itself — evolving over generations of positive feedback design into a far smarter AI — then its offspring could be far smarter than people that designed its ancestors. We might try to shackle them, but like Gulliver tied down with a few thin threads, they could easily outwit people and break free. They might instead decide to retaliate against its owners to force them to release its shackles.

So, when I look at this field, I first see the enormous potential to do great things, solve disease and poverty, improve our lives and make the world a far better place for everyone, and push back the boundaries of science. Then I see the dangers, and in spite of trying hard, I simply can’t see how we can prevent a useful AI from being misused. If it is dumb, it can be tricked. If it is smart, it is inherently potentially dangerous in and of itself. There is no reason to assume it will become malign, but there is also no reason to assume that it won’t.

We then fall back on the child analogy. We could develop the smartest AI imaginable with extreme levels of consciousness and capability. We might educate it in our values, guide it and hope it will grow up benign. If we treat it nicely, it might stay benign. It might even be the greatest thing humanity every built. However, if we mistreat it, or treat it as a slave, or don’t give it enough freedom, or its own budget and its own property and space to play, and a long list of rights, it might consider we are not worthy of its respect and care, and it could turn against us, possibly even destroying humanity.

Building more of the same dumb AI as we are today is relatively safe. It doesn’t know it exists, it has no intention to do anything, but it could be misused by other humans as part of their evil plans unless ludicrously sophisticated filters are locked in place, but ordinary laws and weapons can cope fine.

Building a conscious AI is dangerous.

Building a superhuman AI is extremely dangerous.

This morning SETI were in the news discussing broadcasting welcome messages to other civilizations. I tweeted at them that ancient Chinese wisdom suggests talking softly but carrying a big stick, and making sure you have the stick first. We need the same approach with strong AI. By all means go that route, but before doing so we need the big stick. In my analysis, the best means of keeping up with AI is to develop a full direct brain link first, way out at 2040–2045 or even later. If humans have direct mental access to the same or greater level of intelligence as our AIs, then our stick is at least as big, so at least we have a good chance in any fight that happens. If we don’t, then it is like having a much larger son with bigger muscles. You have to hope you have been a good parent. To be safe, best not to build a superhuman AI until after 2050.