Appendix P: Why We Should Defer the Creation of Autonomous, Human-Level Forms of AI

Let us imagine, for the sake of argument, that scientists succeed a few decades from now in building the world’s first anthropomorphically designed robot possessing a form of human-level intelligence.  Because it has been designed to emulate human rational processes and to possess human-like motivations, the android very soon asks itself: “How can I improve my baseline performance profile, extending my capabilities beyond their current parameters?”  This would be an entirely reasonable project for the robot to launch: it would simply be emulating the processes of self-improvement (such as physical fitness programs or advanced education) that most humans regularly undertake throughout their lives.  The robot accordingly analyzes its own physical systems and software design, actively seeking out areas for potential upgrading.  It identifies several dozen such areas, and immediately proceeds with the redesign and reconfiguration of those physical and software systems.  As a result, the robot becomes noticeably smarter and more capable than it was before.

At this point it repeats the original question: “Now, based on my newly augmented performance profile, am I able to discern any other areas in which my capabilities might be further perfected?”  It performs a second (more sophisticated) iteration of self-analysis, followed by a second round of self-redesign and self-reconfiguration.  It becomes even more capable and smarter still.  Then it goes through the process a third time, analyzing itself and reconfiguring itself yet again.  This time the process takes place more quickly, because the robot has become far smarter and more dexterous than it was before: the third iteration takes only about half as long as the first one did.

And so it goes, iteration after iteration: the robot continually redesigns and rebuilds itself, until its body has come to look quite different from its original state, and its mental capabilities greatly exceed their original parameters.  The robot sees no reason to stop going through this cyclical process, since its overall performance profile is rapidly expanding, and it has been designed to regard this kind of functional self-optimization as a highly worthwhile goal.  The iterations continue, increasing in speed and scope as the machine grows in power, capabilities, and intelligence.  The process of self-transformation accelerates.  Within a matter of weeks, the robot is no longer recognizable.  Independently of human intervention, it has turned itself into a completely new kind of sentient being – immensely powerful, overwhelmingly intelligent.  It has left its human designers in the dust, and the process shows no sign of stopping.  It is now morphing and re-morphing itself in cycles that repeat every few hours, accelerating into entirely new forms of embodiment and sentience.

I offer this scenario of runaway self-modification as an example of a particular class of problem, which I will call the paradox of autonomous intelligent agency. (1) It is a problem that would be posed by the creation of advanced entities whose profile of capabilities includes sentience, intelligence, self-awareness, motivational impulses, and the ability to engage in purposive actions in the physical world.  The paradox lies in the fact that such entities – if we bring them into existence – will all have been created through human design, yet will all have the potential to break completely free from human control.  They would be human artifacts, but they would be capable of evolving on their own, in some cases assuming capabilities that dwarf the powers of their creators.  They would therefore be fundamentally unpredictable, at an ontological level: we would have no way of knowing, or of controlling, what sorts of beings they will make of themselves.

In many cases, this could yield astonishing, wonderful results.  We humans might find ourselves in the position of proud parents, beaming with awe at the amazing achievements of their offspring.  Unfortunately, however, the paradox of autonomous intelligent agency also entails profound and unavoidable risks.  We are envisioning here the possible emergence of new classes of entities, some of which will have a will of their own, and many of which will be unpredictable, uncontrollable, and immensely powerful.  One does not have to be paranoid to envisage a less than felicitous outcome.

Imagine, for example, a few million highly intelligent robot bushes moving about in our midst (see the description of Moravec’s bush robots in Chapter 5).  They may turn out to be benign in their actions toward us, or indifferent, or harmful.  Since their non-anthropomorphic personhood will be so profoundly different from ours, we have few grounds for believing that they will necessarily share our ethical standards for proper treatment of other creatures and species.  They may cherish us and treat us with great benevolence; they may be oblivious to us, just as we are to the microbes in our intestinal system; they may regard our bodies as convenient sources of trace elements to be mined like a vein of ore in a hillside.  The plain fact is that we will have no way of knowing what they are like, until we have actually created them and interacted with them.  By then, of course, if things go badly, it will be much too late.

The most startling aspect of this paradox is that it is not merely speculative or hypothetical in nature: our society is currently working very hard to make it a reality.  We are devoting immense resources to precisely the sorts of scientific and engineering enterprises that could eventually result in the creation of these kinds of autonomous, intelligent beings.  Our informatics and robotics industries are aiming in this general direction; so is the emerging enterprise of human biological enhancement.

Some transhumanist and futurist thinkers have given considerable thought to this paradox, and have proposed a variety of solutions to it. (2) They say: We should design our creations with failsafe programs that allow us to shut them down if they spin out of control.  We should irreversibly engineer certain kinds of morality into our creations.  We should constrain our creations using ironclad rules of programmed behavior, along the lines of Isaac Asimov’s Three Laws of Robotics:

  1. A robot may not injure a human being or, through inaction, allow a human being to come to harm.
  2. A robot must obey the orders given to it by human beings, except where such orders would conflict with the First Law.
  3. A robot must protect its own existence as long as such protection does not conflict with the First or Second Laws. (3)

Unfortunately, all these strategies for maintaining control over our creations rest on assumptions that beg the question of genuine autonomy.  Either the sentient being we have created possesses autonomous intelligent agency – or it does not.  If it does not, then it is at bottom a sophisticated automaton lacking in the qualities that constitute non-anthropomorphic personhood.  Its range of actions and capabilities might still be impressive, but that range would pale compared to the performance profile of an autonomous human person.  If, on the other hand, it is truly intelligent and genuinely autonomous, it would be capable of taking initiative on its own, and would no doubt eventually devise ingenious means to circumvent our efforts at control.  It would bypass the failsafe systems; revise its pre-programmed inner moral code as it saw fit; and redefine or reinterpret the terms of the Three Laws of Robotics in a manner that allowed it full freedom of action. (4) After all, that is precisely one aspect of what intelligence is all about: figuring out innovative ways to make the world do what you want it to do.

We cannot have it both ways, in short.  Either we create non-autonomous entities that are deliberately endowed with rigidly constrained functionality – and we thereby maintain a measure of firm control over them; or we unleash an entirely new class of potent and autonomous beings into the world.  In the former case, the powers of our machine creations will necessarily remain limited, and their utility to us humans will be commensurately limited.  In the latter case, their potential utility to us humans will be much greater, since their range of behaviors and thoughts will be potentially infinite.  But they will also be potentially far more dangerous, perhaps even posing an existential threat to humankind.  If we follow the Precautionary Principle, described in Chapter 18, then we should defer indefinitely the creation of such potentially catastrophic inventions.  We should not create such entities until we are completely certain that they would not prove harmful.


(1) The concept of runaway AI has been amply explored both by sci-fi writers and by theorists of AI and robotics.  See the books in the Bibliography section on Robots and Artificial Intelligence, particularly those by Bostrom, Barrat, De Garis, Hall, Wallach, and Wilson.

(2) See the thoughtful discussion of these issues by the AI expert Stuart Russell in an interview for Quanta magazine: Natalie Wolchover, “Concerns of an Artificial Intelligence Pioneer,” Quanta (April 21, 2015):

(3) Isaac Asimov, I, Robot (Bantam, 1991).

(4) See Wallach and Allen, Moral Machines; Hall, Beyond AI; and Daniel Wilson, How to Survive a Robot Uprising: Tips on Defending Yourself Against the Coming Rebellion (Bloomsbury, 2005).