Penguin Press
National pub date: February 19, 2019
Title: DEEP THINKING
Subtitle: Twenty-Five Ways of Looking at AI
By: John Brockman
Length: 90,000 words

Headline: Science world luminary John Brockman assembles twenty-five of the most important scientific minds, people who have been thinking about the field of artificial intelligence for most of their careers, for an unparalleled round-table examination of mind, thinking, intelligence, and what it means to be human.

Description: "Artificial intelligence is today's story—the story behind all other stories. It is the Second Coming and the Apocalypse at the same time: Good AI versus evil AI." —John Brockman

More than sixty years ago, mathematician-philosopher Norbert Wiener published a book on the place of machines in society that ended with a warning: "we shall never receive the right answers to our questions unless we ask the right questions.... The hour is very late, and the choice of good and evil knocks at our door." In the wake of advances in unsupervised, self-improving machine learning, a small but influential community of thinkers is considering Wiener's words again. In Deep Thinking, John Brockman gathers their disparate visions of where AI might be taking us.

The fruit of the long history of Brockman's profound engagement with the most important scientific minds who have been thinking about AI—from Alison Gopnik and David Deutsch to Frank Wilczek and Stephen Wolfram—Deep Thinking is an ideal introduction to the landscape of crucial issues AI presents. The collision between opposing perspectives is salutary and exhilarating; some of these figures, such as computer scientist Stuart Russell, Skype co-founder Jaan Tallinn, and physicist Max Tegmark, are deeply concerned with the threat of AI, including the existential one, while others, notably robotics entrepreneur Rodney Brooks, philosopher Daniel Dennett, and bestselling author Steven Pinker, have a very different view.
Serious, searching, and authoritative, Deep Thinking lays out the intellectual landscape of one of the most important topics of our time.
Participants in The Deep Thinking Project

Chris Anderson is an entrepreneur; a roboticist; former editor-in-chief of Wired; co-founder and CEO of 3DR; and author of The Long Tail, Free, and Makers.

Rodney Brooks is a computer scientist; Panasonic Professor of Robotics, emeritus, MIT; former director, MIT Computer Science Lab; and founder, chairman, and CTO of Rethink Robotics. He is the author of Flesh and Machines.

George M. Church is Robert Winthrop Professor of Genetics at Harvard Medical School; Professor of Health Sciences and Technology, Harvard-MIT; and co-author (with Ed Regis) of Regenesis: How Synthetic Biology Will Reinvent Nature and Ourselves.

Daniel C. Dennett is University Professor and Austin B. Fletcher Professor of Philosophy and director of the Center for Cognitive Studies at Tufts University. He is the author of a dozen books, including Consciousness Explained and, most recently, From Bacteria to Bach and Back: The Evolution of Minds.

David Deutsch is a quantum physicist and a member of the Centre for Quantum Computation at the Clarendon Laboratory, Oxford University. He is the author of The Fabric of Reality and The Beginning of Infinity.

Anca Dragan is an assistant professor in the Department of Electrical Engineering and Computer Sciences at UC Berkeley. She co-founded and serves on the steering committee for the Berkeley AI Research (BAIR) Lab and is a co-principal investigator in Berkeley's Center for Human-Compatible AI.

George Dyson is a historian of science and technology and the author of Baidarka: The Kayak, Darwin Among the Machines, Project Orion, and Turing's Cathedral.

Peter Galison is a science historian, Joseph Pellegrino University Professor and co-founder of the Black Hole Initiative at Harvard University, and the author of Einstein's Clocks and Poincaré's Maps: Empires of Time.

Neil Gershenfeld is a physicist and director of MIT's Center for Bits and Atoms.
He is the author of Fab, co-author (with Alan Gershenfeld & Joel Cutcher-Gershenfeld) of Designing Reality, and founder of the global fab lab network.

Alison Gopnik is a developmental psychologist at UC Berkeley; her books include The Philosophical Baby and, most recently, The Gardener and the Carpenter: What the New Science of Child Development Tells Us About the Relationship Between Parents and Children.
Tom Griffiths is Henry R. Luce Professor of Information, Technology, Consciousness, and Culture at Princeton University. He is co-author (with Brian Christian) of Algorithms to Live By.

W. Daniel "Danny" Hillis is an inventor, entrepreneur, and computer scientist; Judge Widney Professor of Engineering and Medicine at USC; and author of The Pattern on the Stone: The Simple Ideas That Make Computers Work.

Caroline A. Jones is a professor of art history in the Department of Architecture at MIT and author of Eyesight Alone: Clement Greenberg's Modernism and the Bureaucratization of the Senses; Machine in the Studio: Constructing the Postwar American Artist; and The Global Work of Art.

David Kaiser is Germeshausen Professor of the History of Science and professor of physics at MIT, and head of its Program in Science, Technology & Society. He is the author of How the Hippies Saved Physics: Science, Counterculture, and the Quantum Revival and American Physics and the Cold War Bubble (forthcoming).

Seth Lloyd is a theoretical physicist at MIT, Nam P. Suh Professor in the Department of Mechanical Engineering, and an external professor at the Santa Fe Institute. He is the author of Programming the Universe: A Quantum Computer Scientist Takes on the Cosmos.

Hans Ulrich Obrist is artistic director of the Serpentine Gallery, London, and the author of Ways of Curating and Lives of the Artists, Lives of the Architects.

Judea Pearl is professor of computer science and director of the Cognitive Systems Laboratory at UCLA. His most recent book, co-authored with Dana Mackenzie, is The Book of Why: The New Science of Cause and Effect.

Alex "Sandy" Pentland is Toshiba Professor and professor of media arts and sciences, MIT; director of the Human Dynamics and Connection Science labs and the Media Lab Entrepreneurship Program; and the author of Social Physics.
Steven Pinker, Johnstone Family Professor in the Department of Psychology at Harvard University, is an experimental psychologist who conducts research in visual cognition, psycholinguistics, and social relations. He is the author of eleven books, including The Blank Slate, The Better Angels of Our Nature, and, most recently, Enlightenment Now: The Case for Reason, Science, Humanism, and Progress.

Venki Ramakrishnan is a scientist at the Medical Research Council Laboratory of Molecular Biology, Cambridge; recipient of the Nobel Prize in Chemistry (2009); current president of the Royal Society; and the author of Gene Machine: The Race to Discover the Secrets of the Ribosome.
Stuart Russell is a professor of computer science and Smith-Zadeh Professor in Engineering at UC Berkeley. He is the co-author (with Peter Norvig) of Artificial Intelligence: A Modern Approach.

Jaan Tallinn, a computer programmer, theoretical physicist, and investor, is a co-developer of Skype and Kazaa.

Max Tegmark is an MIT physicist and AI researcher; president of the Future of Life Institute; scientific director of the Foundational Questions Institute; and the author of Our Mathematical Universe and Life 3.0: Being Human in the Age of Artificial Intelligence.

Frank Wilczek is Herman Feshbach Professor of Physics at MIT, recipient of the 2004 Nobel Prize in Physics, and the author of A Beautiful Question: Finding Nature's Deep Design.

Stephen Wolfram is a scientist, inventor, and the founder and CEO of Wolfram Research. He is the creator of the symbolic computation program Mathematica and its programming language, Wolfram Language, as well as the knowledge engine Wolfram|Alpha. He is also the author of A New Kind of Science.
Deep Thinking
Twenty-Five Ways of Looking at AI
Edited by John Brockman
Penguin Press — February 19, 2019
Table of Contents

Acknowledgments

Introduction: On the Promise and Peril of AI, by John Brockman

Seth Lloyd: Wrong, but More Relevant Than Ever
It is exactly in the extension of the cybernetic idea to human beings that Wiener's conceptions missed their target.

Judea Pearl: The Limitations of Opaque Learning Machines
Deep learning has its own dynamics, it does its own repair and its own optimization, and it gives you the right results most of the time. But when it doesn't, you don't have a clue about what went wrong and what should be fixed.

Stuart Russell: The Purpose Put Into the Machine
We may face the prospect of superintelligent machines—their actions by definition unpredictable by us and their imperfectly specified objectives conflicting with our own—whose motivation to preserve their existence in order to achieve those objectives may be insuperable.

George Dyson: The Third Law
Any system simple enough to be understandable will not be complicated enough to behave intelligently, while any system complicated enough to behave intelligently will be too complicated to understand.

Daniel C. Dennett: What Can We Do?
We don't need artificial conscious agents. We need intelligent tools.

Rodney Brooks: The Inhuman Mess Our Machines Have Gotten Us Into
We are in a much more complex situation today than Wiener foresaw, and I am worried that it is much more pernicious than even his worst imagined fears.

Frank Wilczek: The Unity of Intelligence
The advantages of artificial over natural intelligence appear permanent, while the advantages of natural over artificial intelligence, though substantial at present, appear transient.

Max Tegmark: Let's Aspire to More Than Making Ourselves Obsolete
We should analyze what could go wrong with AI to ensure that it goes right.

Jaan Tallinn: Dissident Messages
Continued progress in AI can precipitate a change of cosmic proportions—a runaway process that will likely kill everyone.
Steven Pinker: Tech Prophecy and the Underappreciated Causal Power of Ideas
There is no law of complex systems that says that intelligent agents must turn into ruthless megalomaniacs.

David Deutsch: Beyond Reward and Punishment
Misconceptions about human thinking and human origins are causing corresponding misconceptions about AGI and how it might be created.

Tom Griffiths: The Artificial Use of Human Beings
Automated intelligent systems that will make good inferences about what people want must have good generative models for human behavior.

Anca Dragan: Putting the Human into the AI Equation
In the real world, an AI must interact with people and reason about them. People will have to formally enter the AI problem definition somewhere.

Chris Anderson: Gradient Descent
Just because AI systems sometimes end up in local minima, don't conclude that this makes them any less like life. Humans—indeed, probably all life-forms—are often stuck in local minima.

David Kaiser: "Information" for Wiener, for Shannon, and for Us
Many of the central arguments in The Human Use of Human Beings seem closer to the 19th century than the 21st. Wiener seems not to have fully embraced Shannon's notion of information as consisting of irreducible, meaning-free bits.

Neil Gershenfeld: Scaling
Although machine making and machine thinking might appear to be unrelated trends, they lie in each other's futures.

W. Daniel Hillis: The First Machine Intelligences
Hybrid superintelligences such as nation-states and corporations have their own emergent goals, and their actions are not always aligned with the interests of the people who created them.

Venki Ramakrishnan: Will Computers Become Our Overlords?
Our fears about AI reflect the belief that our intelligence is what makes us special.
Alex "Sandy" Pentland: The Human Strategy
How can we make a good human-artificial ecosystem, something that's not a machine society but a cyberculture in which we can all live as humans—a culture with a human feel to it?
Hans Ulrich Obrist: Making the Invisible Visible: Art Meets AI
Many contemporary artists are articulating various doubts about the promises of AI and reminding us not to associate the term "artificial intelligence" solely with positive outcomes.

Alison Gopnik: AIs versus Four-Year-Olds
Looking at what children do may give programmers useful hints about directions for computer learning.

Peter Galison: Algorists Dream of Objectivity
By now, the legal, ethical, formal, and economic dimensions of algorithms are all quasi-infinite.

George M. Church: The Rights of Machines
Probably we should be less concerned about us-versus-them and more concerned about the rights of all sentients in the face of an emerging unprecedented diversity of minds.

Caroline A. Jones: The Artistic Use of Cybernetic Beings
The work of cybernetically inclined artists concerns the emergent behaviors of life that elude AI in its current condition.

Stephen Wolfram: Artificial Intelligence and the Future of Civilization
The most dramatic discontinuity will surely be when we achieve effective human immortality. Whether this will be achieved biologically or digitally isn't clear, but inevitably it will be achieved.
Introduction: On the Promise and Peril of AI
John Brockman

Artificial intelligence is today's story—the story behind all other stories. It is the Second Coming and the Apocalypse at the same time: Good AI versus evil AI.

This book comes out of an ongoing conversation with a number of important thinkers, both in the world of AI and beyond it, about what AI is and what it means. Called the Deep Thinking Project, this conversation began in earnest in September 2016, at a meeting at the Mayflower Grace Hotel in Washington, Connecticut, with some of the book's contributors. What quickly emerged from that first meeting is that the excitement and fear in the wider culture surrounding AI now has an analogue in the way Norbert Wiener's ideas regarding "cybernetics" worked their way through the culture, particularly in the 1960s, as artists began to incorporate thinking about new technologies into their work. I witnessed the impact of those ideas at close hand; indeed, it's not too much to say they set me off on my life's path. With the advent of the digital era beginning in the early 1970s, people stopped talking about Wiener, but today his Cybernetic Idea has been so widely adopted that it's internalized to the point where it no longer needs a name. It's everywhere, it's in the air, and it's a fitting place to begin.

New Technologies = New Perceptions

Before AI, there was Cybernetics—the idea of automatic, self-regulating control, laid out in Norbert Wiener's foundational text of 1948. I can date my own serious exposure to it to 1966, when the composer John Cage invited me and four or five other young arts people to join him for a series of dinners—an ongoing seminar about media, communications, art, music, and philosophy that focused on Cage's interest in the ideas of Wiener, Claude Shannon, and Marshall McLuhan, all of whom had currency in the New York art circles in which I was then moving.
In particular, Cage had picked up on McLuhan's idea that by inventing electronic technologies we had externalized our central nervous system—that is, our minds—and that we now had to presume that "there's only one mind, the one we all share." Ideas of this nature were beginning to be of great interest to the artists I was working with in New York at the Film-Makers' Cinémathèque, where I was program manager for a series of multimedia productions called the New Cinema I (also known as the Expanded Cinema Festival), under the auspices of avant-garde filmmaker and impresario Jonas Mekas. They included visual artists Claes Oldenburg, Robert Rauschenberg, Andy Warhol, and Robert Whitman; kinetic artists Charlotte Moorman and Nam June Paik; happenings artists Allan Kaprow and Carolee Schneemann; dancer Trisha Brown; filmmakers Jack Smith, Stan Vanderbeek, Ed Emshwiller, and the Kuchar brothers; avant-garde dramatist Ken Dewey; poet Gerd Stern and the USCO group; minimalist musicians La Monte Young and Terry Riley; and, through Warhol, the music group The Velvet Underground. Many of these people were reading Wiener, and cybernetics was in the air. It was at one of these dinners that Cage reached into his briefcase, took out a copy of Cybernetics, and handed it to me, saying, "This is for you."
During the Festival, I received an unexpected phone call from Wiener's colleague Arthur K. Solomon, head of Harvard's graduate program in biophysics. Wiener had died the year before, and Solomon and Wiener's other close colleagues at MIT and Harvard had been reading about the Expanded Cinema Festival in the New York Times and were intrigued by the connection to Wiener's work. Solomon invited me to bring some of the artists up to Cambridge to meet with him and a group that included MIT sensory-communications researcher Walter Rosenblith, Harvard applied mathematician Anthony Oettinger, and MIT engineer Harold "Doc" Edgerton, inventor of the strobe light.

Like many other "art meets science" situations I've been involved in since, the two-day event was an informed failure: ships passing in the night. But I took it all onboard, and the event was consequential in some interesting ways—one of which came from the fact that they took us to see "the" computer. Computers were a rarity back then; at least, none of us on the visit had ever seen one. We were ushered into a large space on the MIT campus, in the middle of which there was a "cold room" raised off the floor and enclosed in glass, in which technicians wearing white lab coats, scarves, and gloves were busy collating punch cards coming through an enormous machine. When I approached, the steam from my breath fogged up the window into the cold room. Wiping it off, I saw "the" computer. I fell in love.

Later, in the fall of 1967, I went to Menlo Park to spend time with Stewart Brand, whom I had met in New York in 1965 when he was a satellite member of the USCO group of artists. Now, with his wife Lois, a mathematician, he was preparing the first edition of The Whole Earth Catalog for publication.
While Lois and the team did the heavy lifting on the final mechanicals for WEC, Stewart and I sat together in a corner for two days, reading, underlining, and annotating the same paperback copy of Cybernetics that Cage had handed to me the year before, and debating Wiener's ideas.

Inspired by this set of ideas, I began to develop a theme, a mantra of sorts, that has informed my endeavors since: "new technologies = new perceptions." Inspired by communications theorist Marshall McLuhan, architect-designer Buckminster Fuller, futurist John McHale, and cultural anthropologists Edward T. (Ned) Hall and Edmund Carpenter, I started reading avidly in the fields of information theory, cybernetics, and systems theory. McLuhan suggested I read biologist J. Z. Young's Doubt and Certainty in Science, in which he said that we create tools and we mold ourselves through our use of them. The other text he recommended was Warren Weaver and Claude Shannon's 1949 paper "Recent Contributions to the Mathematical Theory of Communication," which begins: "The word communication will be used here in a very broad sense to include all of the procedures by which one mind may affect another. This, of course, involves not only written and oral speech, but also music, the pictorial arts, the theater, the ballet, and in fact all human behavior."

Who knew that within two decades of that moment we would begin to recognize the brain as a computer? And in the next two decades, as we built our computers into the Internet, that we would begin to realize that the brain is not a computer, but a network of computers? Certainly not Wiener, a specialist in analogue feedback circuits designed to control machines, nor the artists, nor, least of all, myself.

"We must cease to kiss the whip that lashes us."
Two years after Cybernetics, in 1950, Norbert Wiener published The Human Use of Human Beings—a deeper story, in which he expressed his concerns about the runaway commercial exploitation and other unforeseen consequences of the new technologies of control. I didn't read The Human Use of Human Beings until the spring of 2016, when I picked up my copy, a first edition, which was sitting in my library next to Cybernetics. What shocked me was the realization of just how prescient Wiener was in 1950 about what's going on today. Although the first edition was a major bestseller—and, indeed, jump-started an important conversation—under pressure from his peers Wiener brought out a revised and milder edition in 1954, from which the original concluding chapter, "Voices of Rigidity," is conspicuously absent. Science historian George Dyson points out that in this long-forgotten first edition, Wiener predicted the possibility of a "threatening new Fascism dependent on the machine à gouverner":

No elite escaped his criticism, from the Marxists and the Jesuits ("all of Catholicism is indeed essentially a totalitarian religion") to the FBI ("our great merchant princes have looked upon the propaganda technique of the Russians, and have found that it is good") and the financiers lending their support "to make American capitalism and the fifth freedom of the businessman supreme throughout the world." Scientists... received the same scrutiny given the Church: "Indeed, the heads of great laboratories are very much like Bishops, with their association with the powerful in all walks of life, and the dangers they incur of the carnal sins of pride and of lust for power."

This jeremiad did not go well for Wiener. As Dyson puts it:

These alarms were discounted at the time, not because Wiener was wrong about digital computing but because larger threats were looming as he completed his manuscript in the fall of 1949.
Wiener had nothing against digital computing, but he was strongly opposed to nuclear weapons and refused to join those who were building digital computers to move forward on the thousand-times-more-powerful hydrogen bomb. Since the original edition of The Human Use of Human Beings is now out of print, lost to us is Wiener's cri de coeur, more relevant today than when he wrote it, sixty-eight years ago: "We must cease to kiss the whip that lashes us."

Mind, Thinking, Intelligence

Among the reasons we don't hear much about "Cybernetics" today, two are central: First, although The Human Use of Human Beings was considered an important book in its time, it ran counter to the aspirations of many of Wiener's colleagues, including John von Neumann and Claude Shannon, who were interested in the commercialization of the new technologies. Second, computer pioneer John McCarthy disliked Wiener and refused to use Wiener's term "Cybernetics." McCarthy, in turn, coined the term "artificial intelligence" and became a founding father of that field.
As Judea Pearl, who in the 1980s introduced a new approach to artificial intelligence called Bayesian networks, explained to me:

"What Wiener created was excitement to believe that one day we are going to make an intelligent machine. He wasn't a computer scientist. He talked feedback, he talked communication, he talked analog. His working metaphor was a feedback circuit, which he was an expert in. By the time the digital age began in the early 1960s, people wanted to talk programming, talk codes, talk about computational functions, talk about short-term memory, long-term memory—meaningful computer metaphors. Wiener wasn't part of that, and he didn't reach the new generation that germinated with his ideas. His metaphors were too old, passé. There were new means already available that were ready to capture the human imagination."

By 1970, people were no longer talking about Wiener. One critical factor missing in Wiener's vision was the cognitive element: mind, thinking, intelligence. As early as 1942, at the first of a series of foundational interdisciplinary meetings about the control of complex systems that would come to be known as the Macy conferences, leading researchers were arguing for the inclusion of the cognitive element in the conversation. While von Neumann, Shannon, and Wiener were concerned with systems of control and communication of observed systems, Warren McCulloch wanted to include mind. He turned to cultural anthropologists Gregory Bateson and Margaret Mead to make the connection to the social sciences. Bateson in particular was increasingly talking about patterns and processes, or "the pattern that connects." He called for a new kind of systems ecology in which organisms and the environment in which they live are one and the same, and should be considered as a single circuit.
By the early 1970s, the Cybernetics of observed systems—first-order Cybernetics—moved to the Cybernetics of observing systems—second-order Cybernetics, or "the Cybernetics of Cybernetics," as coined by Heinz von Foerster, who joined the Macy conferences in the mid-1950s and spearheaded the new movement. Cybernetics, rather than disappearing, was becoming metabolized into everything, so we no longer saw it as a separate, distinct new discipline. And there it remains, hiding in plain sight.

"The Shtick of the Steins"

My own writing about these issues at the time was on the radar screen of the second-order Cybernetics crowd, including Heinz von Foerster as well as John Lilly and Alan Watts, who were the co-organizers of something called "The AUM Conference," shorthand for "The American University of Masters," which took place in Big Sur in 1973: a gathering of philosophers, psychologists, and scientists, each of whom was asked to lecture on his own work in terms of its relationship to the ideas of the British mathematician G. Spencer Brown presented in his book Laws of Form. I was a bit puzzled when I received an invitation—a very late invitation indeed—which they explained was based on their interest in the ideas I presented in a book called Afterwords, which were very much on their wavelength. I jumped at the opportunity, the main reason being that the keynote speaker was none other than Richard Feynman. I love
to spend time with physicists, the reason being that they think about the universe, i.e., everything. And no physicist was reputed to be as articulate as Feynman. I couldn't wait to meet him. I accepted.

That said, I am not a scientist, and I had never entertained the idea of getting on a stage and delivering a "lecture" of any kind, least of all a commentary on an obscure mathematical theory in front of a group identified as the world's most interesting thinkers. Only upon my arrival in Big Sur did I find out the reason for my very late invitation. "When is Feynman's talk?" I asked at the desk. "Oh, didn't Alan Watts tell you? Richard is ill and has been hospitalized. You're his replacement. And, by the way, what's the title of your keynote lecture?"

I tried to make myself invisible for several days. Alan Watts, realizing that I was avoiding the podium, woke me up one night with a 3 a.m. knock on the door of my room. I opened the door to find him standing in front of me wearing a monk's robe with a hood covering much of his face. His arms extended, he held a lantern in one hand and a magnum of scotch in the other. "John," he said in a deep voice with a rich aristocratic British accent, "you are a phony." "And, John," he continued, "I am a phony. But, John, I am a real phony!"

The next day I gave my lecture, entitled "Einstein, Gertrude Stein, Wittgenstein, and Frankenstein." Einstein: the revolution in 20th-century physics. Gertrude Stein: the first writer who made integral to her work the idea of an indeterminate and discontinuous universe (words represented neither character nor activity: a rose is a rose is a rose, and a universe is a universe is a universe). Wittgenstein: the world as the limits of language ("The limits of my language mean the limits of my world"); the end of the distinction between observer and observed. Frankenstein: Cybernetics, AI, robotics, all the essayists in this volume. The lecture had unanticipated consequences.
Among the participants at the AUM Conference were several authors of #1 New York Times bestsellers, yet no one there had a literary agent. And I realized that all were engaged in writing a genre of book both unnamed and unrecognized by New York publishers. Since I had an MBA from Columbia Business School and a series of relative successes in business, I was dragooned into becoming an agent, initially for Gregory Bateson and John Lilly, whose books I sold quickly, and for sums that caught my attention, thus kick-starting my career as a literary agent. I never did meet Richard Feynman.

The Long AI Winters

This new career put me in close touch with most of the AI pioneers, and over the decades I rode with them on waves of enthusiasm and into valleys of disappointment. In the early '80s, the Japanese government mounted a national effort to advance AI. They called it the Fifth Generation; their goal was to change the architecture of computation by breaking "the von Neumann bottleneck," by creating a massively parallel computer. In so doing, they hoped to jumpstart their economy and become a dominant world power in the field. In 1983, the leader of the Japanese Fifth Generation consortium came to New York for a meeting organized by Heinz Pagels, the president of the New York Academy of Sciences. I had a seat at the table alongside the leaders of the first generation, Marvin Minsky and John McCarthy; the second generation, Edward Feigenbaum
and Roger Schank; and Joseph Traub, head of the National Supercomputer Consortium.

In 1981, with Heinz's help, I had founded "The Reality Club" (the precursor to the nonprofit Edge.org), whose initial interdisciplinary meetings took place in the Board Room at the NYAS. Heinz was working on his book Dreams of Reason: The Rise of the Science of Complexity, which he considered to be a research agenda for science in the 1990s. Through the Reality Club meetings, I got to know two young researchers who were about to play key roles in revolutionizing computer science. At MIT in the late seventies, Danny Hillis developed the algorithms that made possible the massively parallel computer. In 1983, his company, Thinking Machines, built the world's fastest supercomputer by utilizing parallel architecture. His "connection machine" closely reflected the workings of the human mind. Seth Lloyd at Rockefeller University was undertaking seminal work in the fields of quantum computation and quantum communications, including proposing the first technologically feasible design for a quantum computer.

And the Japanese? Their foray into artificial intelligence failed and was followed by twenty years of anemic economic growth. But the leading US scientists took this program very seriously. Feigenbaum, who was the cutting-edge computer scientist of the day, teamed up with McCorduck to write a book on these developments: The Fifth Generation: Artificial Intelligence and Japan's Computer Challenge to the World, published in 1983. We had a code name for the project: "It's coming, it's coming!" But it didn't come; it went. From that point on I've worked with researchers in nearly every variety of AI and complexity, including Rodney Brooks, Hans Moravec, John Archibald Wheeler, Benoit Mandelbrot, John Henry Holland, Danny Hillis, Freeman Dyson, Chris Langton, Doyne Farmer, Geoffrey West, Stuart Russell, and Judea Pearl.
An Ongoing Dynamical Emergent System

From the initial meeting in Washington, CT, to the present, I arranged a number of dinners and discussions in London and Cambridge, Massachusetts, as well as a public event at London's City Hall. Among the attendees were distinguished scientists, science historians, and communications theorists, all of whom have been thinking seriously about AI issues for their entire careers. I commissioned essays from a wide range of contributors, with or without references to Wiener (leaving it up to each participant). In the end, twenty-five people wrote essays, all individuals concerned about what is happening today in the age of AI. Deep Thinking is not my book; rather, it is our book: Seth Lloyd, Judea Pearl, Stuart Russell, George Dyson, Daniel C. Dennett, Rodney Brooks, Frank Wilczek, Max Tegmark, Jaan Tallinn, Steven Pinker, David Deutsch, Tom Griffiths, Anca Dragan, Chris Anderson, David Kaiser, Neil Gershenfeld, W. Daniel Hillis, Venki Ramakrishnan, Alex "Sandy" Pentland, Hans Ulrich Obrist, Alison Gopnik, Peter Galison, George M. Church, Caroline A. Jones, and Stephen Wolfram.

I see The Deep Thinking Project as an ongoing dynamical emergent system, a presentation of the ideas of a community of sophisticated thinkers who are bringing their experience and erudition to bear in challenging the prevailing digital AI narrative as they
communicate their thoughts to one another. The aim is to present a mosaic of views that will help make sense of this rapidly emerging field. I asked the essayists to consider: (a) The Zen-like poem “Thirteen Ways of Looking at a Blackbird,” by Wallace Stevens, which he insisted was “not meant to be a collection of epigrams or of ideas, but of sensations.” It is an exercise in “perspectivism,” consisting of short, separate sections, each of which mentions blackbirds in some way. The poem is about his own imagination; it concerns what he attends to. (b) The parable of the blind men and an elephant. Like the elephant, AI is too big a topic for any one perspective, never mind the fact that no two people seem to see things the same way. What do we want the book to do? Stewart Brand has noted that “revisiting pioneer thinking is perpetually useful. And it gives a long perspective that invites thinking in decades and centuries about the subject. All contemporary discussion is bound to age badly and immediately without the longer perspective.” Danny Hillis wants people in AI to realize how they’ve been programmed by Wiener’s book. “You’re executing its road map,” he says, “and you just don’t realize it.” Dan Dennett would like to “let Wiener emerge as the ghost at the banquet. Think of it as a source of hybrid vigor, a source of unsettling ideas to shake up the established mindset.” Neil Gershenfeld argues that “stealth remedial education for the people running the ‘Big Five’ would be a great output from the book.” Freeman Dyson, one of the few people alive who knew Wiener, notes that “The Human Use of Human Beings is one of the best books ever written. Wiener got almost everything right. I will be interested to see what your bunch of wizards will do with it.”

The Evolving AI Narrative

Things have changed—and they remain the same. Now AI is everywhere. We have the Internet. We have our smartphones.
The founders of the dominant companies—the companies that hold “the whip that lashes us”—have net worths of $65 billion, $90 billion, $130 billion. High-profile individuals such as Elon Musk, Nick Bostrom, Martin Rees, Eliezer Yudkowsky, and the late Stephen Hawking have issued dire warnings about AI, resulting in the ascendancy of well-funded institutes tasked with promoting “Nice AI.” But will we, as a species, be able to control a fully realized, unsupervised, self-improving AI? Wiener’s warnings and admonitions in The Human Use of Human Beings are now very real, and they need to be looked at anew by researchers at the forefront of the AI revolution. Here is Dyson again: Wiener became increasingly disenchanted with the “gadget worshipers” whose corporate selfishness brought “motives to automatization that go beyond a legitimate curiosity and are sinful in themselves.” He knew the danger was not machines becoming more like humans but humans being treated like machines. “The world of the future will be an ever more demanding struggle against the limitations of our intelligence,” he warned in God & Golem, Inc., published in
1964, the year of his death, “not a comfortable hammock in which we can lie down to be waited upon by our robot slaves.” It’s time to examine the evolving AI narrative by identifying the leading members of that mainstream community along with the dissidents, and presenting their counternarratives in their own voices. The essays that follow thus constitute a much-needed update from the field.

John Brockman
New York, 2019
I met Seth Lloyd in the late 1980s, when new ways of thinking were everywhere: the importance of biological organizing principles, the computational view of mathematics and physical processes, the emphasis on parallel networks, the importance of nonlinear dynamics, the new understanding of chaos, connectionist ideas, neural networks, and parallel distributed processing. The advances in computation during that period provided us with a new way of thinking about knowledge. Seth likes to refer to himself as a quantum mechanic. He is internationally known for his work in the field of quantum computation, which attempts to harness the exotic properties of quantum theory, like superposition and entanglement, to solve problems that would take several lifetimes to solve on classical computers. In the essay that follows, he traces the history of information theory from Norbert Wiener’s prophetic insights to the predictions of a technological “singularity” that some would have us believe will supplant the human species. His takeaway on the recent programming method known as deep learning is to call for a more modest set of expectations; he notes that despite AI’s enormous advances, robots “still can’t tie their own shoes.” It’s difficult for me to talk about Seth without referencing his relationship with his friend and professor, the late theoretical physicist Heinz Pagels of Rockefeller University. The graduate student and the professor each had a profound effect on the other’s ideas. In the summer of 1988, I visited Heinz and Seth at the Aspen Center for Physics. Their joint work on the subject of complexity was featured in the then-current issue of Scientific American; they were ebullient. That was just two weeks before Heinz’s tragic death in a hiking accident while descending Pyramid Peak with Seth. They were talking about quantum computing.
WRONG, BUT MORE RELEVANT THAN EVER
Seth Lloyd

Seth Lloyd is a theoretical physicist at MIT, Nam P. Suh Professor in the Department of Mechanical Engineering, and an external professor at the Santa Fe Institute.

The Human Use of Human Beings, Norbert Wiener’s 1950 popularization of his highly influential book Cybernetics: Control and Communication in the Animal and the Machine (1948), investigates the interplay between human beings and machines in a world in which machines are becoming ever more computationally capable and powerful. It is a remarkably prescient book, and remarkably wrong. Written at the height of the Cold War, it contains a chilling reminder of the dangers of totalitarian organizations and societies, and of the danger to democracy when it tries to combat totalitarianism with totalitarianism’s own weapons. Wiener’s Cybernetics looked in close scientific detail at the process of control via feedback. (“Cybernetics,” from the ancient Greek for “helmsman,” is the etymological basis of our word “governor,” which is what James Watt called his pathbreaking feedback control device that transformed the use of steam engines.) Because he was immersed in problems of control, Wiener saw the world as a set of complex, interlocking feedback loops, in which sensors, signals, and actuators such as engines interact via an intricate exchange of signals and information. The engineering applications of cybernetics were tremendously influential and effective, giving rise to rockets, robots, automated assembly lines, and a host of precision-engineering techniques—in other words, to the basis of contemporary industrial society. Wiener had greater ambitions for cybernetic concepts, however, and in The Human Use of Human Beings he spells out his thoughts on their application to topics as diverse as Maxwell’s Demon, human language, the brain, insect metabolism, the legal system, the role of technological innovation in government, and religion.
These broader applications of cybernetics were an almost unequivocal failure. Vigorously hyped from the late 1940s to the early 1960s—to a degree similar to the hype of computer and communication technology that led to the dotcom crash of 2000–2001—cybernetics delivered satellites and telephone switching systems but generated few if any useful developments in social organization and society at large. Nearly seventy years later, however, The Human Use of Human Beings has more to teach us humans than it did the first time around. Perhaps the most remarkable feature of the book is that it introduces a large number of topics concerning human/machine interactions that are still of considerable relevance. Dark in tone, the book makes several predictions about disasters to come in the second half of the 20th century, many of which are almost identical to predictions made today about the second half of the 21st. For example, Wiener foresaw a moment in the near future of 1950 in which humans would cede control of society to a cybernetic artificial intelligence, which would then proceed to wreak havoc on humankind. The automation of manufacturing, Wiener predicted, would both create large advances in productivity and displace many workers from their jobs—a sequence of events that did indeed come to pass in the ensuing decades. Unless society could find productive occupations for these displaced workers, Wiener warned, revolt would ensue.
But Wiener failed to foresee crucial technological developments. Like pretty much all technologists of the 1950s, he failed to predict the computer revolution. Computers, he thought, would eventually fall in price from hundreds of thousands of (1950s) dollars to tens of thousands; neither he nor his compeers anticipated the tremendous explosion of computer power that would follow the development of the transistor and the integrated circuit. Finally, because of his emphasis on control, Wiener could not foresee a technological world in which innovation and self-organization bubble up from the bottom rather than being imposed from the top. Focusing on the evils of totalitarianism (political, scientific, and religious), Wiener saw the world in a deeply pessimistic light. His book warned of the catastrophe that awaited us if we didn’t mend our ways, fast. The current world of human beings and machines, more than a half century after its publication, is much more complex and rich, and contains a much wider variety of political, social, and scientific systems, than he was able to envisage. The warnings of what will happen if we get it wrong, however—for example, control of the entire Internet by a global totalitarian regime—remain as relevant and pressing today as they were in 1950.

What Wiener Got Right

Wiener’s most famous mathematical works focused on problems of signal analysis and the effects of noise. During World War II, he developed techniques for aiming anti-aircraft fire by making models that could predict the future trajectory of an airplane by extrapolating from its past behavior. In Cybernetics and in The Human Use of Human Beings, Wiener notes that this past behavior includes the quirks and habits of the human pilot, and thus that a mechanized device can predict the behavior of humans.
Like Alan Turing, whose Turing Test suggested that computing machines could give responses to questions indistinguishable from human responses, Wiener was fascinated by the notion of capturing human behavior in mathematical description. In the 1940s, he applied his knowledge of control and feedback loops to neuromuscular feedback in living systems, and was responsible for bringing Warren McCulloch and Walter Pitts to MIT, where they did their pioneering work on artificial neural networks. Wiener’s central insight was that the world should be understood in terms of information. Complex systems, such as organisms, brains, and human societies, consist of interlocking feedback loops in which signals exchanged between subsystems result in complex but stable behavior. When feedback loops break down, the system goes unstable. He constructed a compelling picture of how complex biological systems function, a picture that is by and large universally accepted today. Wiener’s vision of information as the central quantity governing the behavior of complex systems was remarkable at the time. Nowadays, when cars and refrigerators are jammed with microprocessors and much of human society revolves around computers and cell phones connected by the Internet, it seems prosaic to emphasize the centrality of information, computation, and communication. In Wiener’s time, however, the first digital computers had only just come into existence, and the Internet was not even a twinkle in the technologist’s eye. Wiener’s powerful conception of not just engineered complex systems but all complex systems as revolving around cycles of signals and computation led to tremendous contributions to the development of complex human-made systems. The
methods he and others developed for the control of missiles, for example, were later put to work in building the Saturn V moon rocket, one of the crowning engineering achievements of the 20th century. In particular, Wiener’s applications of cybernetic concepts to the brain and to computerized perception are the direct precursors of today’s neural-network-based deep-learning circuits, and of artificial intelligence itself. But current developments in these fields have diverged from his vision, and their future development may well affect the human uses both of human beings and of machines.

What Wiener Got Wrong

It is exactly in the extension of the cybernetic idea to human beings that Wiener’s conceptions missed their target. Setting aside his ruminations on language, law, and human society for the moment, look at a humbler but potentially useful innovation that he thought was imminent in 1950. Wiener notes that prosthetic limbs would be much more effective if their wearers could communicate directly with their prosthetics via their own neural signals, receiving information about pressure and position from the limb and directing its subsequent motion. This turned out to be a much harder problem than Wiener envisaged: Seventy years down the road, prosthetic limbs that incorporate neural feedback are still in the very early stages. Wiener’s concept was an excellent one—it’s just that the problem of interfacing neural signals with mechanical-electrical devices is hard. More significantly, Wiener (along with pretty much everyone else in 1950) greatly underappreciated the potential of digital computation. As noted, Wiener’s mathematical contributions were to the analysis of signals and noise, and his analytic methods apply to continuously varying, or analog, signals.
Although he participated in the wartime development of digital computation, he never foresaw the exponential explosion of computing power brought on by the introduction and progressive miniaturization of semiconductor circuits. This is hardly Wiener’s fault: The transistor hadn’t been invented yet, and the vacuum-tube technology of the digital computers he was familiar with was clunky, unreliable, and unscalable to ever larger devices. In an appendix to the 1948 edition of Cybernetics, he anticipates chess-playing computers and predicts that they’ll be able to look two or three moves ahead. He might have been surprised to learn that within half a century a computer would beat the human world champion at chess.

Technological Overestimation and the Existential Risks of the Singularity

When Wiener wrote his books, a significant example of technological overestimation was about to occur. The 1950s saw the first efforts at developing artificial intelligence, by researchers such as Herbert Simon, John McCarthy, and Marvin Minsky, who began to program computers to perform simple tasks and to construct rudimentary robots. The success of these initial efforts inspired Simon to declare that “machines will be capable, within twenty years, of doing any work a man can do.” Such predictions turned out to be spectacularly wrong. As they became more powerful, computers got better and better at playing chess because they could systematically generate and evaluate a vast selection of possible future moves. But the majority of predictions of AI, e.g., robotic maids, turned out to be illusory. When Deep Blue beat Garry Kasparov at chess in 1997, the most
powerful room-cleaning robot was a Roomba, which moved around vacuuming at random and squeaked when it got caught under the couch. Technological prediction is particularly chancy, given that technologies progress by a series of refinements, halted by obstacles and overcome by innovation. Many obstacles and some innovations can be anticipated, but more cannot. In my own work with experimentalists on building quantum computers, I typically find that some of the technological steps I expect to be easy turn out to be impossible, whereas some of the tasks I imagine to be impossible turn out to be easy. You don’t know until you try. In the 1950s, partly inspired by conversations with Wiener, John von Neumann introduced the notion of the “technological singularity.” Technologies tend to improve exponentially, doubling in power or sensitivity over some interval of time. (For example, since 1950, computer technologies have been doubling in power roughly every two years, an observation enshrined as Moore’s Law.) Von Neumann extrapolated from the observed exponential rate of technological improvement to predict that “technological progress will become incomprehensibly rapid and complicated,” outstripping human capabilities in the not too distant future. Indeed, if one extrapolates the growth of raw computing power—expressed in terms of bits and bit flips—into the future at its current rate, computers should match human brains sometime in the next two to four decades (depending on how one estimates the information-processing power of human brains). The failure of the initial overly optimistic predictions of AI dampened talk about the technological singularity for a few decades, but since the 2005 publication of Ray Kurzweil’s The Singularity Is Near, the idea of technological advance leading to superintelligence is back in force.
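The arithmetic behind this extrapolation is simple enough to write out. A minimal sketch, assuming a two-year doubling time and purely illustrative figures for current machine throughput and for the brain (real estimates of both vary by orders of magnitude):

```python
import math

def years_to_parity(current_ops, brain_ops, doubling_time=2.0):
    """Years for exponentially growing capacity to close the gap."""
    return doubling_time * math.log2(brain_ops / current_ops)

# Illustrative figures only: a machine doing 1e13 ops/s today, and
# estimates of the brain's raw rate spanning 1e16 to 1e18 ops/s.
low = years_to_parity(1e13, 1e16)   # about 20 years
high = years_to_parity(1e13, 1e18)  # about 33 years
print(round(low), round(high))  # -> 20 33
```

With these invented numbers the sketch reproduces the two-to-four-decade range; note that varying the brain estimate by a factor of a hundred shifts the answer by only about thirteen years, which is why such extrapolations are relatively insensitive to how the brain is measured.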
Some believers, Kurzweil included, regard this singularity as an opportunity: Humans can merge their brains with the superintelligence and thereby live forever. Others, such as Stephen Hawking and Elon Musk, worried that this superintelligence would prove malign, regarding it as the greatest existential threat to human civilization. Still others, including some of the contributors to the present volume, think such talk is overblown. Wiener’s life work and his failure to predict its consequences are intimately bound up in the idea of an impending technological singularity. His work on neuroscience and his initial support of McCulloch and Pitts adumbrated the startlingly effective deep-learning methods of the present day. Over the past decade, and particularly in the last five years, such deep-learning techniques have finally exhibited what Wiener liked to call Gestalt—for example, the ability to recognize that a circle is a circle even when, slanted sideways, it looks like an ellipse. His work on control, combined with his work on neuromuscular feedback, was significant for the development of robotics and is the inspiration for neural-based human/machine interfaces. His lapses in technological prediction, however, suggest that we should take the notion of a technological singularity with a grain of salt. The general difficulties of technological prediction and the problems specific to the development of a superintelligence should warn us against overestimating both the power and the efficacy of information processing.

The Arguments for Singularity Skepticism

No exponential increase lasts forever. An atomic explosion grows exponentially, but
only until it runs out of fuel. Similarly, the exponential advances of Moore’s Law are starting to run into limits imposed by basic physics. The clock speed of computers maxed out at a few gigahertz a decade and a half ago, simply because the chips were starting to melt. The miniaturization of transistors is already running into quantum-mechanical problems due to tunneling and leakage currents. Eventually, the various exponential improvements in memory and processing driven by Moore’s Law will grind to a halt. A few more decades, however, will probably be time enough for the raw information-processing power of computers to match that of brains—at least by the crude measures of number of bits and number of bit flips per second. Human brains are intricately constructed, the product of millions of years of natural selection. In Wiener’s time, our understanding of the architecture of the brain was rudimentary and simplistic. Since then, increasingly sensitive instrumentation and imaging techniques have shown our brains to be far more varied in structure and complex in function than Wiener could have imagined. I recently asked Tomaso Poggio, one of the pioneers of modern neuroscience, whether he was worried that computers, with their rapidly increasing processing power, would soon emulate the functioning of the human brain. “Not a chance,” he replied. The recent advances in deep learning and neuromorphic computation are very good at reproducing a particular aspect of human intelligence focused on the operation of the brain’s cortex, where patterns are processed and recognized. These advances have enabled a computer to beat the world champion not just of chess but of Go, an impressive feat, but they are far short of enabling a computerized robot to tidy a room.
(In fact, robots with anything approaching human capability in a broad range of flexible movements are still far away—search “robots falling down.” Robots are good at making precision welds on assembly lines, but they still can’t tie their own shoes.) Raw information-processing power does not mean sophisticated information-processing power. While computer power has advanced exponentially, the programs by which computers operate have often failed to advance at all. One of the primary responses of software companies to increased processing power is to add “useful” features that often make the software harder to use. Microsoft Word reached its apex in 1995 and has been slowly sinking under the weight of added features ever since. Once Moore’s Law starts slowing down, software developers will be confronted with hard choices between efficiency, speed, and functionality. A major fear of the singulariteers is that as computers become more involved in designing their own software, they’ll rapidly bootstrap themselves into achieving superhuman computational ability. But the evidence of machine learning points in the opposite direction. As machines become more powerful and capable of learning, they learn more and more as human beings do—from multiple examples, often under the supervision of human and machine teachers. Education is as hard and slow for computers as it is for teenagers. Consequently, systems based on deep learning are becoming more rather than less human. The skills they bring to learning are not “better than” but “complementary to” human learning: Computer learning systems can identify patterns that humans cannot—and vice versa. The world’s best chess players are neither computers nor humans but humans working together with computers. Cyberspace is indeed inhabited by harmful programs, but these primarily take the form of malware—viruses notable for their malign mindlessness, not for their
superintelligence.

Whither Wiener

Wiener noted that exponential technological progress is a relatively modern phenomenon and that not all of it is good. He regarded atomic weapons and the development of missiles with nuclear warheads as a recipe for the suicide of the human species. He compared the headlong exploitation of the planet’s resources to the Mad Tea Party of Alice in Wonderland: Having laid waste to one local environment, we make progress simply by moving on to lay waste to the next. Wiener’s optimism about the development of computers and neuromechanical systems was tempered by his pessimism about their exploitation by authoritarian governments, such as the Soviet Union, and the tendency of democracies, such as the United States, to become more authoritarian themselves in confronting the threat of authoritarianism. What would Wiener think of the current human use of human beings? He would be amazed by the power of computers and the Internet. He would be happy that the early neural nets in which he played a role have spawned powerful deep-learning systems that exhibit the perceptual ability he demanded of them—although he might not be impressed that one of the most prominent examples of such computerized Gestalt is the ability to recognize photos of kittens on the World Wide Web. Rather than regarding machine intelligence as a threat, I suspect he would regard it as a phenomenon in its own right, different from and co-evolving with our own human intelligence. Unsurprised by global warming—the Mad Tea Party of our era—Wiener would applaud the exponential improvement in alternative-energy technologies and would apply his cybernetic expertise to developing the intricate set of feedback loops needed to incorporate such technologies into the coming smart electrical grid.
Nonetheless, recognizing that the solution to the problem of climate change is at least as much political as technological, he would undoubtedly be pessimistic about our chances of solving this civilization-threatening problem in time. Wiener hated hucksters—political hucksters most of all—but he acknowledged that hucksters would always be with us. It’s easy to forget just how scary Wiener’s world was. The United States and the Soviet Union were in a full-out arms race, building hydrogen bombs mounted in the warheads of intercontinental ballistic missiles guided by navigation systems to which Wiener himself—to his dismay—had contributed. I was four years old when Wiener died. In 1964, my nursery school class was practicing duck-and-cover under our desks to prepare for a nuclear attack. Given the human use of human beings in his own day, if he could see our current state, Wiener’s first response would be relief that we are still alive.
In the 1980s, Judea Pearl introduced a new approach to artificial intelligence called Bayesian networks. This probability-based model of machine reasoning enabled machines to function—in a complex and uncertain world—as “evidence engines,” continuously revising their beliefs in light of new evidence. Within a few years, Judea’s Bayesian networks had completely overshadowed the previous rule-based approaches to artificial intelligence. The advent of deep learning—in which computers, in effect, teach themselves to be smarter by observing tons of data—has given him pause, because the method lacks transparency. While recognizing the impressive achievements in deep learning by colleagues such as Michael Jordan and Geoffrey Hinton, he feels uncomfortable with this kind of opacity. He set out to understand the theoretical limitations of deep-learning systems, and he points out that basic barriers exist that will prevent them from achieving a human kind of intelligence, no matter what we do. Leveraging the computational benefits of Bayesian networks, Judea realized that the combination of simple graphical models and data could also be used to represent and infer cause-effect relationships. The significance of this discovery far transcends its roots in artificial intelligence. His latest book explains causal thinking to the general public; you might say it is a primer on how to think, even for humans. Judea’s principled, mathematical approach to causality is a profound contribution to the realm of ideas. It has already benefited virtually every field of inquiry, especially the data-intensive health and social sciences.
THE LIMITATIONS OF OPAQUE LEARNING MACHINES
Judea Pearl

Judea Pearl is a professor of computer science and director of the Cognitive Systems Laboratory at UCLA. His most recent book, co-authored with Dana Mackenzie, is The Book of Why: The New Science of Cause and Effect.

As a former physicist, I was extremely interested in cybernetics. Though it did not utilize the full power of Turing machines, it was highly transparent, perhaps because it was founded on classical control theory and information theory. We are losing this transparency now, with the deep-learning style of machine learning. It is fundamentally a curve-fitting exercise that adjusts weights in intermediate layers of a long input-output chain. I find many users who say that it “works well and we don’t know why.” Once you unleash it on large data, deep learning has its own dynamics; it does its own repair and its own optimization, and it gives you the right results most of the time. But when it doesn’t, you don’t have a clue about what went wrong and what should be fixed. In particular, you do not know whether the fault is in the program, in the method, or in the fact that things have changed in the environment. We should be aiming at a different kind of transparency. Some argue that transparency is not really needed. We don’t understand the neural architecture of the human brain, yet it runs well, so we forgive our meager understanding and use human helpers to great advantage. In the same way, they argue, why not unleash deep-learning systems and create intelligence without understanding how they work? I buy this argument to some extent. I personally don’t like opacity, so I won’t spend my time on deep learning, but I know that it has a place in the makeup of intelligence. I know that non-transparent systems can do marvelous jobs, and our brain is proof of that marvel. But this argument has its limitations.
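The “curve fitting” in question can be made concrete with a toy sketch (the data points and learning rate here are invented for illustration): a single weight adjusted by gradient descent on squared error, which is the same kind of adjustment deep learning performs over millions of weights in many layers.

```python
# Toy fitting-by-weight-adjustment: learn y ~ w*x by gradient
# descent on mean squared error. Invented data near y = 2x.
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]

w = 0.0          # the single weight being adjusted
lr = 0.02        # learning rate, chosen for illustration
for _ in range(2000):
    # gradient of mean squared error with respect to w
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    w -= lr * grad

print(round(w, 3))  # settles at the least-squares value, ~2.036
```

The procedure lands wherever the data pull it; nothing in it records why the relation holds, which is the opacity at issue.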
The reason we can forgive our meager understanding of how human brains work is that our brains work the same way, and that enables us to communicate with other humans, learn from them, instruct them, and motivate them in our own native language. If our robots are all as opaque as AlphaGo, we won’t be able to hold a meaningful conversation with them, and that would be unfortunate. We will need to retrain them whenever we make a slight change in the task or in the operating environment. So, rather than experimenting with opaque learning machines, I am trying to understand their theoretical limitations and examine how these limitations can be overcome. I do this in the context of causal-reasoning tasks, which govern much of how scientists think about the world and, at the same time, are rich in intuition and toy examples, so we can monitor the progress of our analysis. In this context, we’ve discovered that some basic barriers exist, and that unless they are breached we won’t get a real human kind of intelligence no matter what we do. I believe that charting these barriers may be no less important than banging our heads against them. Current machine-learning systems operate almost exclusively in a statistical, or model-blind, mode, which is analogous in many ways to fitting a function to a cloud of data points. Such systems cannot reason about “What if?” questions and, therefore,
cannot serve as the basis for strong AI—that is, artificial intelligence that emulates human-level reasoning and competence. To achieve human-level intelligence, learning machines need the guidance of a blueprint of reality, a model—similar to a road map that guides us in driving through an unfamiliar city. To be more specific, current learning machines improve their performance by optimizing parameters over a stream of sensory inputs received from the environment. It is a slow process, analogous to the natural-selection process that drives Darwinian evolution. It explains how species like eagles and snakes have developed superb vision systems over millions of years. It cannot explain, however, the super-evolutionary process that enabled humans to build eyeglasses and telescopes over barely a thousand years. What humans had that other species lacked was a mental representation of their environment—a representation they could manipulate at will to imagine alternative hypothetical environments for planning and learning. Historians of Homo sapiens such as Yuval Noah Harari and Steven Mithen are in general agreement that the decisive ingredient that gave our ancestors the ability to achieve global dominion about forty thousand years ago was their ability to create and store a mental representation of their environment, interrogate that representation, distort it by mental acts of imagination, and finally answer “What if?” kinds of questions. Examples are interventional questions (“What if I do such-and-such?”) and retrospective or counterfactual questions (“What if I had acted differently?”). No learning machine in operation today can answer such questions. Moreover, most learning machines do not possess a representation from which the answers to such questions can be derived.
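What such a representation buys can be shown with a toy structural causal model; the equation and the numbers below are invented purely for illustration. Given a single observed case, the standard abduction-action-prediction recipe answers a counterfactual question that no amount of curve fitting over the same observation could:

```python
# Toy structural causal model (SCM): X -> Y, with Y = 2*X + U_y,
# where U_y is unobserved background noise. Equation and numbers
# are invented for illustration.

def abduct(x_obs, y_obs):
    # Abduction: recover the noise term implied by the observed case.
    return y_obs - 2 * x_obs

def do(x_new, u_y=0.0):
    # Interventional question: do(X = x_new) sets X regardless of its causes.
    return 2 * x_new + u_y

def counterfactual(x_obs, y_obs, x_new):
    # Counterfactual question: "what would Y have been, in this very
    # case, had X been x_new?" Abduct the noise, then act and predict.
    return do(x_new, abduct(x_obs, y_obs))

# Observed case: X = 3, Y = 6.5, so the noise term must be 0.5.
print(counterfactual(3.0, 6.5, x_new=5.0))  # -> 10.5
```

The abduction step is what requires the structural equation: a learner that stored only the observed association between X and Y would have no noise term to recover, and so no way to reason about this particular case under a different X.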
With regard to causal reasoning, we find that you can do very little with any form of model-blind curve fitting, or any statistical inference, no matter how sophisticated the fitting process is. We have also found a theoretical framework for organizing such limitations, which forms a hierarchy. On the first level, you have statistical reasoning, which can tell you only how seeing one event would change your belief about another. For example, what can a symptom tell you about a disease? Then you have a second level, which entails the first but not vice versa. It deals with actions. “What will happen if we raise prices?” “What if you make me laugh?” That second level of the hierarchy requires information about interventions that is not available on the first. This information can be encoded in a graphical model, which merely tells us which variable responds to another. The third level of the hierarchy is the counterfactual. This is the language used by scientists. “What if the object were twice as heavy?” “What if I were to do things differently?” “Was it the aspirin that cured my headache, or the nap I took?” Counterfactuals are at the top of the hierarchy in the sense that they cannot be derived even if we could predict the effects of all actions. They need an extra ingredient, in the form of equations, to tell us how variables respond to changes in other variables. One of the crowning achievements of causal-inference research has been the algorithmization of both interventions and counterfactuals, the top two layers of the hierarchy. In other words, once we encode our scientific knowledge in a model (which may be qualitative), algorithms exist that examine the model and determine whether a given query, be it about an intervention or about a counterfactual, can be estimated from the available data—and, if so, how. This capability has dramatically transformed the way
scientists are doing science, especially in such data-intensive sciences as sociology and epidemiology, for which causal models have become a second language. These disciplines view their linguistic transformation as the Causal Revolution. As Harvard social scientist Gary King puts it, “More has been learned about causal inference in the last few decades than the sum total of everything that had been learned about it in all prior recorded history.” As I contemplate the success of machine learning and try to extrapolate it to the future of AI, I ask myself, “Are we aware of the basic limitations that were discovered in the causal-inference arena? Are we prepared to circumvent the theoretical impediments that prevent us from going from one level of the hierarchy to another?” I view machine learning as a tool to get us from data to probabilities. But then we still have to take two extra steps to go from probabilities to real understanding—two big steps. One is to predict the effect of actions, and the second is counterfactual imagination. We cannot claim to understand reality unless we take those last two steps. In his insightful book Foresight and Understanding (1961), the philosopher Stephen Toulmin identified the transparency-versus-opacity contrast as the key to understanding the ancient rivalry between Greek and Babylonian sciences. According to Toulmin, the Babylonian astronomers were masters of black-box prediction, far surpassing their Greek rivals in the accuracy and consistency of celestial observations. Yet science favored the creative-speculative strategy of the Greek astronomers, which was wild with metaphorical imagery: circular tubes full of fire, small holes through which celestial fire was visible as stars, and a hemispherical Earth riding on turtleback.
It was this wild modeling strategy, not Babylonian extrapolation, that jolted Eratosthenes (276-194 BC) to perform one of the most creative experiments in the ancient world and calculate the circumference of the Earth. Such an experiment would never have occurred to a Babylonian data-fitter. Model-blind approaches impose intrinsic limitations on the cognitive tasks that Strong AI can perform. My general conclusion is that human-level AI cannot emerge solely from model-blind learning machines; it requires the symbiotic collaboration of data and models. Data science is a science only to the extent that it facilitates the interpretation of data—a two-body problem, connecting data to reality. Data alone are hardly a science, no matter how “big” they get and how skillfully they are manipulated. Opaque learning systems may get us to Babylon, but not to Athens.
Computer scientist Stuart Russell, along with Elon Musk, Stephen Hawking, Max Tegmark, and numerous others, has insisted that attention be paid to the potential dangers in creating an intelligence on the superhuman (or even the human) level—an AGI, or artificial general intelligence, whose programmed purposes may not necessarily align with our own. His early work was on understanding the notion of “bounded optimality” as a formal definition of intelligence that you can work on. He developed the technique of rational meta-reasoning, “which is, roughly speaking, that you do the computations that you expect to improve the quality of your ultimate decision as quickly as possible.” He has also worked on the unification of probability theory and first-order logic—resulting in a new and far more effective monitoring system for the Comprehensive Nuclear-Test-Ban Treaty—and on the problem of decision making over long timescales (his presentations on the latter topic are usually titled “Life: Play and Win in 20 trillion moves”). He is deeply concerned about the continuing development of autonomous weapons, such as lethal micro-drones, which are potentially scalable into weapons of mass destruction. He drafted the letter from forty of the world’s leading AI researchers to President Obama, which resulted in high-level national-security meetings. His current work centers on the creation of what he calls “provably beneficial” AI. He wants to ensure AI safety by “imbuing systems with explicit uncertainty” about the objectives of their human programmers, an approach that would amount to a fairly radical reordering of current AI research. Stuart is also on the radar of anyone who has taken a course in computer science in the last twenty-odd years. He is co-author of “the” definitive AI textbook, with an estimated 5-million-plus English-language readers.
THE PURPOSE PUT INTO THE MACHINE

Stuart Russell

Stuart Russell is a professor of computer science and Smith-Zadeh Professor in Engineering at UC Berkeley. He is the coauthor (with Peter Norvig) of Artificial Intelligence: A Modern Approach.

Among the many issues raised in Norbert Wiener’s The Human Use of Human Beings (1950) that are currently relevant, the most significant to the AI researcher is the possibility that humanity may cede control over its destiny to machines. Wiener considered the machines of the near future as far too limited to exert global control, imagining instead that machines and machine-like control systems would be wielded by human elites to reduce the great mass of humanity to the status of “cogs and levers and rods.” Looking further ahead, he pointed to the difficulty of correctly specifying objectives for highly capable machines, noting

a few of the simpler and more obvious truths of life, such as that when a djinnee is found in a bottle, it had better be left there; that the fisherman who craves a boon from heaven too many times on behalf of his wife will end up exactly where he started; that if you are given three wishes, you must be very careful what you wish for.

The dangers are clear enough:

Woe to us if we let [the machine] decide our conduct, unless we have previously examined the laws of its action, and know fully that its conduct will be carried out on principles acceptable to us! On the other hand, the machine like the djinnee, which can learn and can make decisions on the basis of its learning, will in no way be obliged to make such decisions as we should have made, or will be acceptable to us.

Ten years later, after seeing Arthur Samuel’s checker-playing program learn to play checkers far better than its creator, Wiener published “Some Moral and Technical Consequences of Automation” in Science.
In this paper, the message is even clearer:

If we use, to achieve our purposes, a mechanical agency with whose operation we cannot efficiently interfere ... we had better be quite sure that the purpose put into the machine is the purpose which we really desire. . . .

In my view, this is the source of the existential risk from superintelligent AI cited in recent years by such observers as Elon Musk, Bill Gates, Stephen Hawking, and Nick Bostrom.

Putting Purposes Into Machines

The goal of AI research has been to understand the principles underlying intelligent behavior and to build those principles into machines that can then exhibit such behavior. In the 1960s and 1970s, the prevailing theoretical notion of intelligence was the capacity for logical reasoning, including the ability to derive plans of action guaranteed to achieve a specified goal. More recently, a consensus has emerged around the idea of a rational
agent that perceives, and acts in order to maximize, its expected utility. Subfields such as logical planning, robotics, and natural-language understanding are special cases of the general paradigm. AI has incorporated probability theory to handle uncertainty, utility theory to define objectives, and statistical learning to allow machines to adapt to new circumstances. These developments have created strong connections to other disciplines that build on similar concepts, including control theory, economics, operations research, and statistics. In both the logical-planning and rational-agent views of AI, the machine’s objective—whether in the form of a goal, a utility function, or a reward function (as in reinforcement learning)—is specified exogenously. In Wiener’s words, this is “the purpose put into the machine.” Indeed, it has been one of the tenets of the field that AI systems should be general-purpose—i.e., capable of accepting a purpose as input and then achieving it—rather than special-purpose, with their goal implicit in their design. For example, a self-driving car should accept a destination as input instead of having one fixed destination. However, some aspects of the car’s “driving purpose” are fixed, such as that it shouldn’t hit pedestrians. This is built directly into the car’s steering algorithms rather than being explicit: No self-driving car in existence today “knows” that pedestrians prefer not to be run over. Putting a purpose into a machine that optimizes its behavior according to clearly defined algorithms seems an admirable approach to ensuring that the machine’s “conduct will be carried out on principles acceptable to us!” But, as Wiener warns, we need to put in the right purpose. We might call this the King Midas problem: Midas got exactly what he asked for—namely, that everything he touched would turn to gold—but too late he discovered the drawbacks of drinking liquid gold and eating solid gold.
The technical term for putting in the right purpose is value alignment. When it fails, we may inadvertently imbue machines with objectives counter to our own. Tasked with finding a cure for cancer as fast as possible, an AI system might elect to use the entire human population as guinea pigs for its experiments. Asked to de-acidify the oceans, it might use up all the oxygen in the atmosphere as a side effect. This is a common characteristic of systems that optimize: Variables not included in the objective may be set to extreme values to help optimize that objective. Unfortunately, neither AI nor other disciplines (economics, statistics, control theory, operations research) built around the optimization of objectives have much to say about how to identify the purposes “we really desire.” Instead, they assume that objectives are simply implanted into the machine. AI research, in its present form, studies the ability to achieve objectives, not the design of those objectives. Steve Omohundro has pointed to a further difficulty, observing that intelligent entities must act to preserve their own existence. This tendency has nothing to do with a self-preservation instinct or any other biological notion; it’s just that an entity cannot achieve its objectives if it’s dead. According to Omohundro’s argument, a superintelligent machine that has an off-switch—which some, including Alan Turing himself, in a 1951 talk on BBC Radio 3, have seen as our potential salvation—will take steps to disable the switch in some way.[1] Thus we may face the prospect of superintelligent machines—their actions by definition unpredictable by us and their

[1] Omohundro, “The Basic AI Drives,” in Proc. First AGI Conf., 171: Artificial General Intelligence, eds. P. Wang, B. Goertzel, and S. Franklin (IOS Press, 2008).
imperfectly specified objectives conflicting with our own—whose motivation to preserve their existence in order to achieve those objectives may be insuperable.

1001 Reasons to Pay No Attention

Objections have been raised to these arguments, primarily by researchers within the AI community. The objections reflect a natural defensive reaction, coupled perhaps with a lack of imagination about what a superintelligent machine could do. None hold water on closer examination. Here are some of the more common ones:

• Don’t worry, we can just switch it off.[2] This is often the first thing that pops into a layperson’s head when considering risks from superintelligent AI—as if a superintelligent entity would never think of that. This is rather like saying that the risk of losing to Deep Blue or AlphaGo is negligible—all one has to do is make the right moves.

• Human-level or superhuman AI is impossible.[3] This is an unusual claim for AI researchers to make, given that, from Turing onward, they have been fending off such claims from philosophers and mathematicians. The claim, which is backed by no evidence, appears to concede that if superintelligent AI were possible, it would be a significant risk. It’s as if a bus driver, with all of humanity as passengers, said, “Yes, I am driving toward a cliff—in fact, I’m pressing the pedal to the metal! But trust me, we’ll run out of gas before we get there!” The claim represents a foolhardy bet against human ingenuity. We have made such bets before and lost. On September 11, 1933, renowned physicist Ernest Rutherford stated, with utter confidence, “Anyone who expects a source of power from the transformation of these atoms is talking moonshine.” On September 12, 1933, Leo Szilard invented the neutron-induced nuclear chain reaction. A few years later he demonstrated such a reaction in his laboratory at Columbia University. As he recalled in a memoir: “We switched everything off and went home.
That night, there was very little doubt in my mind that the world was headed for grief.”

• It’s too soon to worry about it. The right time to worry about a potentially serious problem for humanity depends not just on when the problem will occur but also on how much time is needed to devise and implement a solution that avoids the risk. For example, if we were to detect a large asteroid predicted to collide with the Earth in 2067, would we say, “It’s too soon to worry”? And if we consider the global catastrophic risks from climate change predicted to occur later in this century, is it too soon to take action to prevent them? On the contrary, it may be too late. The relevant timescale for human-level AI is less predictable, but, like nuclear fission, it might arrive considerably sooner than expected. One variation on this argument is Andrew Ng’s statement that it’s “like worrying about overpopulation on Mars.” This appeals to a convenient analogy: Not only is the

[2] AI researcher Jeff Hawkins, for example, writes, “Some intelligent machines will be virtual, meaning they will exist and act solely within computer networks. . . . It is always possible to turn off a computer network, even if painful.” https://www.recode.net/2015/3/2/11559576/.
[3] The AI100 report (Peter Stone et al.), sponsored by Stanford University, includes the following: “Unlike in the movies, there is no race of superhuman robots on the horizon or probably even possible.” https://ai100.stanford.edu/2016-report.
risk easily managed and far in the future, but also it’s extremely unlikely that we’d even try to move billions of humans to Mars in the first place. The analogy is a false one, however. We are already devoting huge scientific and technical resources to creating ever-more-capable AI systems. A more apt analogy would be a plan to move the human race to Mars with no consideration for what we might breathe, drink, or eat once we’d arrived.

• Human-level AI isn’t really imminent, in any case. The AI100 report, for example, assures us, “Contrary to the more fantastic predictions for AI in the popular press, the Study Panel found no cause for concern that AI is an imminent threat to humankind.” This argument simply misstates the reasons for concern, which are not predicated on imminence. In his 2014 book, Superintelligence: Paths, Dangers, Strategies, Nick Bostrom, for one, writes, “It is no part of the argument in this book that we are on the threshold of a big breakthrough in artificial intelligence, or that we can predict with any precision when such a development might occur.”

• You’re just a Luddite. It’s an odd definition of Luddite that includes Turing, Wiener, Minsky, Musk, and Gates, who rank among the most prominent contributors to technological progress in the 20th and 21st centuries.[4] Furthermore, the epithet represents a complete misunderstanding of the nature of the concerns raised and the purpose for raising them. It is as if one were to accuse nuclear engineers of Luddism if they pointed out the need for control of the fission reaction. Some objectors also use the term “anti-AI,” which is rather like calling nuclear engineers “anti-physics.” The purpose of understanding and preventing the risks of AI is to ensure that we can realize the benefits.
Bostrom, for example, writes that success in controlling AI will result in “a civilizational trajectory that leads to a compassionate and jubilant use of humanity’s cosmic endowment”—hardly a pessimistic prediction.

• Any machine intelligent enough to cause trouble will be intelligent enough to have appropriate and altruistic objectives.[5] (Often, the argument adds the premise that people of greater intelligence tend to have more altruistic objectives, a view that may be related to the self-conception of those making the argument.) This argument is related to Hume’s is-ought problem and G. E. Moore’s naturalistic fallacy, suggesting that somehow the machine, as a result of its intelligence, will simply perceive what is right, given its experience of the world. This is implausible; for example, one cannot perceive, in the design of a chessboard and chess pieces, the goal of checkmate; the same chessboard and pieces can be used for suicide chess, or indeed many other games still to be invented. Put another way: Where Bostrom imagines humans driven extinct by a putative robot that turns the planet into a sea of paper clips, we humans see this outcome as tragic,

[4] Elon Musk, Stephen Hawking, and others (including, apparently, the author) received the 2015 Luddite of the Year Award from the Information Technology and Innovation Foundation: https://itif.org/publications/2016/01/19/artificial-intelligence-alarmists-win-itif%E2%80%99s-annual-luddite-award.
[5] Rodney Brooks, for example, asserts that it’s impossible for a program to be “smart enough that it would be able to invent ways to subvert human society to achieve goals set for it by humans, without understanding the ways in which it was causing problems for those same humans.” http://rodneybrooks.com/the-seven-deadly-sins-of-predicting-the-future-of-ai/.
whereas the iron-eating bacterium Thiobacillus ferrooxidans is thrilled. Who’s to say the bacterium is wrong? The fact that a machine has been given a fixed objective by humans doesn’t mean that it will automatically recognize the importance to humans of things that aren’t part of the objective. Maximizing the objective may well cause problems for humans, but, by definition, the machine will not recognize those problems as problematic.

• Intelligence is multidimensional, “so ‘smarter than humans’ is a meaningless concept.”[6] It is a staple of modern psychology that IQ doesn’t do justice to the full range of cognitive skills that humans possess to varying degrees. IQ is indeed a crude measure of human intelligence, but it is utterly meaningless for current AI systems, because their capabilities across different areas are uncorrelated. How do we compare the IQ of Google’s search engine, which cannot play chess, with that of Deep Blue, which cannot answer search queries? None of this supports the argument that because intelligence is multifaceted, we can ignore the risk from superintelligent machines. If “smarter than humans” is a meaningless concept, then “smarter than gorillas” is also meaningless, and gorillas therefore have nothing to fear from humans; clearly, that argument doesn’t hold water. Not only is it logically possible for one entity to be more capable than another across all the relevant dimensions of intelligence, it is also possible for one species to represent an existential threat to another even if the former lacks an appreciation for music and literature.

Solutions

Can we tackle Wiener’s warning head-on? Can we design AI systems whose purposes don’t conflict with ours, so that we’re sure to be happy with how they behave? On the face of it, this seems hopeless, because it will doubtless prove infeasible to write down our purposes correctly or imagine all the counterintuitive ways a superintelligent entity might fulfill them.
If we treat superintelligent AI systems as if they were black boxes from outer space, then indeed we have no hope. Instead, the approach we seem obliged to take, if we are to have any confidence in the outcome, is to define some formal problem F and design AI systems to be F-solvers, such that no matter how perfectly a system solves F, we’re guaranteed to be happy with the solution. If we can work out an appropriate F that has this property, we’ll be able to create provably beneficial AI. Here’s an example of how not to do it: Let a reward be a scalar value provided periodically by a human to the machine, corresponding to how well the machine has behaved during each period, and let F be the problem of maximizing the expected sum of rewards obtained by the machine. The optimal solution to this problem is not, as one might hope, to behave well, but instead to take control of the human and force him or her to provide a stream of maximal rewards. This is known as the wireheading problem, based on observations that humans themselves are susceptible to the same problem if given a means to electronically stimulate their own pleasure centers. There is, I believe, an approach that may work. Humans can reasonably be described as having (mostly implicit) preferences over their future lives—that is, given

[6] Kevin Kelly, “The Myth of a Superhuman AI,” Wired, Apr. 25, 2017.
enough time and unlimited visual aids, a human could express a preference (or indifference) when offered a choice between two future lives laid out before him or her in all their aspects. (This idealization ignores the possibility that our minds are composed of subsystems with incompatible preferences; if true, that would limit a machine’s ability to optimally satisfy our preferences, but it doesn’t seem to prevent us from designing machines that avoid catastrophic outcomes.) The formal problem F to be solved by the machine in this case is to maximize human future-life preferences subject to its initial uncertainty as to what they are. Furthermore, although the future-life preferences are hidden variables, they’re grounded in a voluminous source of evidence—namely, all of the human choices ever made. This formulation sidesteps Wiener’s problem: The machine may learn more about human preferences as it goes along, of course, but it will never achieve complete certainty. A more precise definition is given by the framework of cooperative inverse-reinforcement learning, or CIRL. A CIRL problem involves two agents, one human and the other a robot. Because there are two agents, the problem is what economists call a game. It is a game of partial information, because while the human knows the reward function, the robot doesn’t—even though the robot’s job is to maximize it. A simple example: Suppose that Harriet, the human, likes to collect paper clips and staples and her reward function depends on how many of each she has. More precisely, if she has p paper clips and s staples, her degree of happiness is θp + (1 − θ)s, where θ is essentially an exchange rate between paper clips and staples. If θ is 1, she likes only paper clips; if θ is 0, she likes only staples; if θ is 0.5, she is indifferent between them; and so on. It’s the job of Robby, the robot, to produce the paper clips and staples.
The point of the game is that Robby wants to make Harriet happy, but he doesn’t know the value of θ, so he isn’t sure how many of each to produce. Here’s how the game works. Let the true value of θ be 0.49—that is, Harriet has a slight preference for staples over paper clips. And let’s assume that Robby has a uniform prior belief about θ—that is, he believes θ is equally likely to be any value between 0 and 1. Harriet now gets to do a small demonstration, producing either two paper clips or two staples or one of each. After that, the robot can produce either ninety paper clips, or ninety staples, or fifty of each. You might think that Harriet, who prefers staples to paper clips, should produce two staples. But in that case, Robby’s rational response would be to produce ninety staples (with a total value to Harriet of 45.9), which is a less desirable outcome for Harriet than fifty of each (total value 50.0). The optimal solution of this particular game is that Harriet produces one of each, so then Robby makes fifty of each. Thus, the way the game is defined encourages Harriet to “teach” Robby—as long as she knows that Robby is watching carefully. Within the CIRL framework, one can formulate and solve the off-switch problem—that is, the problem of how to prevent a robot from disabling its off-switch. (Turing may rest easier.) A robot that’s uncertain about human preferences actually benefits from being switched off, because it understands that the human will press the off-switch to prevent the robot from doing something counter to those preferences. Thus the robot is incentivized to preserve the off-switch, and this incentive derives directly from its uncertainty about human preferences.[7] The off-switch example suggests some templates for controllable-agent

[7] See Hadfield-Menell et al., “The Off-Switch Game,” https://arxiv.org/pdf/1611.08219.pdf.
designs and provides at least one case of a provably beneficial system in the sense introduced above. The overall approach resembles mechanism-design problems in economics, wherein one incentivizes other agents to behave in ways beneficial to the designer. The key difference here is that we are building one of the agents in order to benefit the other. There are reasons to think this approach may work in practice. First, there is abundant written and filmed information about humans doing things (and other humans reacting). Technology to build models of human preferences from this storehouse will presumably be available long before superintelligent AI systems are created. Second, there are strong, near-term economic incentives for robots to understand human preferences: If one poorly designed domestic robot cooks the cat for dinner, not realizing that its sentimental value outweighs its nutritional value, the domestic-robot industry will be out of business. There are obvious difficulties, however, with an approach that expects a robot to learn underlying preferences from human behavior. Humans are irrational, inconsistent, weak-willed, and computationally limited, so their actions don’t always reflect their true preferences. (Consider, for example, two humans playing chess. Usually, one of them loses, but not on purpose!) So robots can learn from nonrational human behavior only with the aid of much better cognitive models of humans. Furthermore, practical and social constraints will prevent all preferences from being maximally satisfied simultaneously, which means that robots must mediate among conflicting preferences—something that philosophers and social scientists have struggled with for millennia. And what should robots learn from humans who enjoy the suffering of others? It may be best to zero out such preferences in the robots’ calculations. 
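The arithmetic of the paper-clip/staple game above can be verified in a few lines. The inference rule given to Robby here is a simplification of my own, not the exact CIRL solution: under his uniform prior, a two-of-one-kind demonstration leaves him with posterior mean θ = 0.25 or 0.75, and a one-of-each demonstration leaves him at 0.5.

```python
THETA_TRUE = 0.49  # Harriet slightly prefers staples to paper clips

def value(theta, clips, staples):
    """Harriet's happiness: theta*p + (1 - theta)*s."""
    return theta * clips + (1 - theta) * staples

def robby_choice(posterior_mean_theta):
    """Robby picks the bundle that maximizes Harriet's value under his belief."""
    bundles = [(90, 0), (0, 90), (50, 50)]  # (paper clips, staples)
    return max(bundles, key=lambda b: value(posterior_mean_theta, *b))

# Demonstration "two staples": Robby infers theta < 0.5, posterior mean 0.25.
choice_staples = robby_choice(0.25)            # ninety staples: (0, 90)
payoff_staples = value(THETA_TRUE, *choice_staples)

# Demonstration "one of each": Robby's posterior mean stays at 0.5.
choice_mixed = robby_choice(0.5)               # fifty of each: (50, 50)
payoff_mixed = value(THETA_TRUE, *choice_mixed)

print(choice_staples, payoff_staples)
print(choice_mixed, payoff_mixed)
```

With these numbers, demonstrating two staples leads Robby to the ninety-staple bundle, worth 45.9 to Harriet, while the one-of-each demonstration steers him to fifty of each, worth 50.0—matching the payoffs in the text and showing why honest “teaching” beats a literal demonstration of her favorite.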
Finding a solution to the AI control problem is an important task; it may be, in Bostrom’s words, “the essential task of our age.” Up to now, AI research has focused on systems that are better at making decisions, but this is not the same as making better decisions. No matter how excellently an algorithm maximizes, and no matter how accurate its model of the world, a machine’s decisions may be ineffably stupid in the eyes of an ordinary human if its utility function is not well aligned with human values. This problem requires a change in the definition of AI itself—from a field concerned with pure intelligence, independent of the objective, to a field concerned with systems that are provably beneficial for humans. Taking the problem seriously seems likely to yield new ways of thinking about AI, its purpose, and our relationship to it.
In 2005, George Dyson, a historian of science and technology, visited Google at the invitation of some Google engineers. The occasion was the sixtieth anniversary of John von Neumann’s proposal for a digital computer. After the visit, George wrote an essay, “Turing’s Cathedral,” which, for the first time, alerted the public to what Google’s founders had in store for the world. “We are not scanning all those books to be read by people,” explained one of his hosts after his talk. “We are scanning them to be read by an AI.” George offers a counternarrative to the digital age. His interests have included the development of the Aleut kayak, the evolution of digital computing and telecommunications, the origins of the digital universe, and a path not taken into space. His career (he never finished high school, yet has been awarded an honorary doctorate from the University of Victoria) has proved as impossible to classify as his books. He likes to point out that analog computing, once believed to be as extinct as the differential analyzer, has returned. He argues that while we may use digital components, at a certain point the analog computing being performed by the system far exceeds the complexity of the digital code with which it is built. He believes that true artificial intelligence—with analog control systems emerging from a digital substrate the way digital computers emerged out of analog components in the aftermath of World War II—may not be as far off as we think. In this essay, George contemplates the distinction between analog and digital computation and finds analog to be alive and well. Nature’s response to an attempt to program machines to control everything may be machines without programming over which no one has control.
THE THIRD LAW

George Dyson

George Dyson is a historian of science and technology and the author of Baidarka: The Kayak, Darwin Among the Machines, Project Orion, and Turing’s Cathedral.

The history of computing can be divided into an Old Testament and a New Testament: before and after electronic digital computers and the codes they spawned proliferated across the Earth. The Old Testament prophets, who delivered the underlying logic, included Thomas Hobbes and Gottfried Wilhelm Leibniz. The New Testament prophets included Alan Turing, John von Neumann, Claude Shannon, and Norbert Wiener. They delivered the machines. Alan Turing wondered what it would take for machines to become intelligent. John von Neumann wondered what it would take for machines to self-reproduce. Claude Shannon wondered what it would take for machines to communicate reliably, no matter how much noise intervened. Norbert Wiener wondered how long it would take for machines to assume control. Wiener’s warnings about control systems beyond human control appeared in 1949, just as the first generation of stored-program electronic digital computers was introduced. These systems required direct supervision by human programmers, seeming to undermine his concerns: What’s the problem, as long as programmers are in control of the machines? Ever since, debate over the risks of autonomous control has remained associated with the debate over the powers and limitations of digitally coded machines. Despite their astonishing powers, these machines have shown little real autonomy, and it is tempting to assume they never will. That is a dangerous assumption. What if digital computing is being superseded by something else? Electronics underwent two fundamental transitions over the past hundred years: from analog to digital and from vacuum tubes to solid state. That these transitions occurred together does not mean they are inextricably linked. Just as digital computation was implemented using vacuum-tube components, analog computation can be implemented in solid state.
Analog computation is alive and well, even though vacuum tubes are commercially extinct. There is no precise distinction between analog and digital computing. In general, digital computing deals with integers, binary sequences, deterministic logic, and time that is idealized into discrete increments, whereas analog computing deals with real numbers, nondeterministic logic, and continuous functions, including time as it exists as a continuum in the real world. Imagine you need to find the middle of a road. You can measure its width using any available increment and then digitally compute the middle to the nearest increment. Or you can use a piece of string as an analog computer, mapping the width of the road to the length of the string and finding the middle, without being limited to increments, by doubling the string back upon itself. Many systems operate across both analog and digital regimes. A tree integrates a wide range of inputs as continuous functions, but if you cut down that tree, you find that it has been counting the years digitally all along.
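The road example can be sketched in a few lines of code. This is my own toy illustration, not anything from the essay: the digital answer is quantized twice, once when measuring and once when answering, while the "analog" folded string simply halves the width, with no increments anywhere.

```python
# Toy illustration (my own, not from the essay): the digital approach
# quantizes twice -- once when measuring the road, once when stating the
# answer -- while the "analog" folded string simply halves the width.

def digital_middle(width: float, increment: float) -> float:
    """Measure the width in whole increments, then report the middle
    to the nearest increment."""
    ticks = round(width / increment)      # quantized measurement
    return round(ticks / 2) * increment   # quantized answer

def analog_middle(width: float) -> float:
    """The folded string: exactly half, no increments anywhere."""
    return width / 2

print(digital_middle(7.3, 0.5))  # 4.0 (off by 0.35)
print(analog_middle(7.3))        # 3.65
```

The point of the sketch is only that the error of the digital answer is bounded by the increment, while the string's answer inherits the full precision of the physical quantity it maps.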
In analog computing, complexity resides in network topology, not in code. Information is processed as continuous functions of values such as voltage and relative pulse frequency rather than by logical operations on discrete strings of bits. Digital computing, intolerant of error or ambiguity, depends upon error correction at every step along the way. Analog computing tolerates errors, allowing you to live with them. Nature uses digital coding for the storage, replication, and recombination of sequences of nucleotides, but relies on analog computing, running on nervous systems, for intelligence and control. The genetic system in every living cell is a stored-program computer. Brains aren’t. Digital computers execute transformations between two species of bits: bits representing differences in space and bits representing differences in time. The transformations between these two forms of information, sequence and structure, are governed by the computer’s programming, and as long as computers require human programmers, we retain control. Analog computers also mediate transformations between two forms of information: structure in space and behavior in time. There is no code and no programming. Somehow—and we don’t fully understand how—Nature evolved analog computers known as nervous systems, which embody information absorbed from the world. They learn. One of the things they learn is control. They learn to control their own behavior, and they learn to control their environment to the extent that they can. Computer science has a long history—going back to before there even was computer science—of implementing neural networks, but for the most part these have been simulations of neural networks by digital computers, not neural networks as evolved in the wild by Nature herself. 
This is starting to change: from the bottom up, as the threefold drivers of drone warfare, autonomous vehicles, and cell phones push the development of neuromorphic microprocessors that implement actual neural networks, rather than simulations of neural networks, directly in silicon (and other potential substrates); and from the top down, as our largest and most successful enterprises increasingly turn to analog computation in their infiltration and control of the world. While we argue about the intelligence of digital computers, analog computing is quietly supervening upon the digital, in the same way that analog components like vacuum tubes were repurposed to build digital computers in the aftermath of World War II. Individually deterministic finite-state processors, running finite codes, are forming large-scale, nondeterministic, non-finite-state metazoan organisms running wild in the real world. The resulting hybrid analog/digital systems treat streams of bits collectively, the way the flow of electrons is treated in a vacuum tube, rather than individually, as bits are treated by the discrete-state devices generating the flow. Bits are the new electrons. Analog is back, and its nature is to assume control. Governing everything from the flow of goods to the flow of traffic to the flow of ideas, these systems operate statistically, as pulse-frequency coded information is processed in a neuron or a brain. The emergence of intelligence gets the attention of Homo sapiens, but what we should be worried about is the emergence of control.
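The statistical, pulse-frequency style of processing described here can be illustrated with a toy sketch. This is my own construction, with hypothetical `encode`/`decode` helpers, not anything from the essay: a value is carried by the rate of 1s in a bit stream, so the information lives in the statistics of the stream rather than in any single bit.

```python
import random

# Toy sketch (my construction, not from the essay): pulse-frequency
# coding carries a value in the *rate* of 1s in a bit stream. The
# information lives in the statistics of the stream, not in any single
# bit, so scattered bit flips barely perturb it.

def encode(value, n_bits, rng):
    """Each bit fires independently with probability `value`."""
    return [1 if rng.random() < value else 0 for _ in range(n_bits)]

def decode(stream):
    """The value is simply the observed pulse rate."""
    return sum(stream) / len(stream)

rng = random.Random(0)
stream = encode(0.7, 10_000, rng)
noisy = [b ^ (rng.random() < 0.01) for b in stream]  # flip ~1% of bits

print(decode(stream))  # close to 0.7
print(decode(noisy))   # still close to 0.7

# Positional binary is the opposite regime: flip the top bit of a
# fixed-point integer and the decoded value jumps by half its range.
```

Contrast this with discrete-state digital coding, where every individual bit matters and error correction is mandatory at every step.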
Imagine it is 1958 and you are trying to defend the continental United States against airborne attack. To distinguish hostile aircraft, one of the things you need, besides a network of computers and early-warning radar sites, is a map of all commercial air traffic, updated in real time. The United States built such a system and named it SAGE (Semi-Automatic Ground Environment). SAGE in turn spawned Sabre, the first integrated reservation system for booking airline travel in real time. Sabre and its progeny soon became not just a map of which seats were available but also a system that began to control, with decentralized intelligence, where airliners would fly, and when. But isn’t there a control room somewhere, with someone at the controls? Maybe not. Say, for example, you build a system to map highway traffic in real time, simply by giving cars access to the map in exchange for reporting their own speed and location at the time. The result is a fully decentralized control system. Nowhere is there any controlling model of the system except the system itself. Imagine it is the first decade of the 21st century and you want to track the complexity of human relationships in real time. For social life at a small college, you could construct a central database and keep it up to date, but its upkeep would become overwhelming if taken to any larger scale. Better to pass out free copies of a simple semi-autonomous code, hosted locally, and let the social network update itself. This code is executed by digital computers, but the analog computing performed by the system as a whole far exceeds the complexity of the underlying code. The resulting pulse-frequency coded model of the social graph becomes the social graph. It spreads wildly across the campus and then the world. What if you wanted to build a machine to capture what everything known to the human species means? With Moore’s Law behind you, it doesn’t take too long to digitize all the information in the world.
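The traffic map can be caricatured in a few lines. This is a toy of my own devising, not from the essay: the "map" of a road segment is nothing but the aggregate of the cars' own speed reports, yet every car that consults it adjusts toward it, so control emerges with no controller anywhere except the map itself.

```python
# Toy model (my own construction, not from the essay): the "map" of a
# road segment is nothing but the cars' own speed reports, yet every
# car that consults it adjusts toward it. There is no controlling model
# of the system anywhere except the system itself.

def step(speeds):
    map_speed = sum(speeds) / len(speeds)  # the map: just the reports
    # each car nudges its speed halfway toward what the map shows
    return [s + 0.5 * (map_speed - s) for s in speeds]

speeds = [30.0, 60.0, 90.0]
for _ in range(10):
    speeds = step(speeds)
print(speeds)  # every car ends within 0.03 of 60.0, a speed set by no one
```

No car was told to drive at 60; the common speed is an equilibrium of the feedback loop between the reports and the map built from them.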
You scan every book ever printed, collect every email ever written, and gather forty-nine years of video every twenty-four hours, while tracking where people are and what they do, in real time. But how do you capture the meaning? Even in the age of all things digital, this cannot be defined in any strictly logical sense, because meaning, among humans, isn’t fundamentally logical. The best you can do, once you have collected all possible answers, is to invite well-defined questions and compile a pulse-frequency weighted map of how everything connects. Before you know it, your system will not only be observing and mapping the meaning of things, it will start constructing meaning as well. In time, it will control meaning, in the same way as the traffic map starts to control the flow of traffic even though no one seems to be in control. There are three laws of artificial intelligence. The first, known as Ashby’s Law, after cybernetician W. Ross Ashby, author of Design for a Brain, states that any effective control system must be as complex as the system it controls. The second law, articulated by John von Neumann, states that the defining characteristic of a complex system is that it constitutes its own simplest behavioral description. The simplest complete model of an organism is the organism itself. Trying to reduce the system’s behavior to any formal description makes things more complicated, not less.
The third law states that any system simple enough to be understandable will not be complicated enough to behave intelligently, while any system complicated enough to behave intelligently will be too complicated to understand. The Third Law offers comfort to those who believe that until we understand intelligence, we need not worry about superhuman intelligence arising among machines. But there is a loophole in the Third Law. It is entirely possible to build something without understanding it. You don’t need to fully understand how a brain works in order to build one that works. This is a loophole that no amount of supervision over algorithms by programmers and their ethical advisors can ever close. Provably “good” AI is a myth. Our relationship with true AI will always be a matter of faith, not proof. We worry too much about machine intelligence and not enough about self-reproduction, communication, and control. The next revolution in computing will be signaled by the rise of analog systems over which digital programming no longer has control. Nature’s response to those who believe they can build machines to control everything will be to allow them to build a machine that controls them instead.
Dan Dennett is the philosopher of choice in the AI community. He is perhaps best known in cognitive science for his concept of intentional systems and his model of human consciousness, which sketches a computational architecture for realizing the stream of consciousness in the massively parallel cerebral cortex. That uncompromising computationalism has been opposed by philosophers such as John Searle, David Chalmers, and the late Jerry Fodor, who have protested that the most important aspects of consciousness—intentionality and subjective qualia—cannot be computed. Twenty-five years ago, I was visiting Marvin Minsky, one of the original AI pioneers, and asked him about Dan. “He’s our best current philosopher—the next Bertrand Russell,” said Marvin, adding that unlike traditional philosophers, Dan was a student of neuroscience, linguistics, artificial intelligence, computer science, and psychology: “He’s redefining and reforming the role of the philosopher. Of course, Dan doesn’t understand my Society-of-Mind theory, but nobody’s perfect.” Dan’s view of the efforts of AI researchers to create superintelligent AIs is relentlessly levelheaded. What, me worry? In this essay, he reminds us that AIs, above all, should be regarded—and treated—as tools and not as humanoid colleagues. He has been interested in information theory since his graduate school days at Oxford. In fact, he told me that early in his career he was keenly interested in writing a book about Wiener’s cybernetic ideas. As a thinker who embraces the scientific method, one of his charms is his willingness to be wrong. Of a recent piece entitled “What Is Information?” he has announced, “I stand by it, but it’s under revision. I’m already moving beyond it and realizing there’s a better way of tackling some of these issues.” He will most likely remain cool and collected on the subject of AI research, although he has acknowledged, often, that his own ideas evolve—as anyone’s ideas should.
WHAT CAN WE DO? Daniel C. Dennett Daniel C. Dennett is University Professor and Austin B. Fletcher Professor of Philosophy and director of the Center for Cognitive Studies at Tufts University. He is the author of a dozen books, including Consciousness Explained and, most recently, From Bacteria to Bach and Back: The Evolution of Minds. Many have reflected on the irony of reading a great book when you are too young to appreciate it. Consigning a classic to the already-read stack and thereby insulating yourself against any further influence while gleaning only a few ill-understood ideas from it is a recipe for neglect that is seldom benign. This struck me with particular force when I reread The Human Use of Human Beings more than sixty years after my juvenile encounter. We should all make it a regular practice to reread books from our youth, where we are apt to discover clear previews of some of our own later “discoveries” and “inventions,” along with a wealth of insights to which we were bound to be impervious until our minds had been torn and tattered, exercised and enlarged by confrontations with life’s problems. Writing at a time when vacuum tubes were still the primary electronic building blocks and there were only a few actual computers in operation, Norbert Wiener imagined the future we now contend with in impressive detail and with few clear mistakes. Alan Turing’s famous 1950 article “Computing Machinery and Intelligence,” in the philosophy journal Mind, foresaw the development of AI, and so did Wiener, but Wiener saw farther and deeper, recognizing that AI would not just imitate—and replace—human beings in many intelligent activities but change human beings in the process. We are but whirlpools in a river of ever-flowing water. We are not stuff that abides, but patterns that perpetuate themselves. (p. 96) When that was written, it could be comfortably dismissed as yet another bit of Heraclitean overstatement.
Yeah, yeah, you can never step in the same river twice. But it contains the seeds of a revolution in outlook. Today we know how to think about complex adaptive systems, strange attractors, extended minds, and homeostasis, a change in perspective that promises to erase the “explanatory gap”8 between mind and mechanism, spirit and matter, a gap that is still ardently defended by latter-day Cartesians who cannot bear the thought that we—we ourselves—are self-perpetuating patterns of information-bearing matter, not “stuff that abides.” Those patterns are remarkably resilient and self-restoring but at the same time protean, opportunistic, selfish exploiters of whatever new is available to harness in their quest for perpetuation. And here is where things get dicey, as Wiener recognized. When attractive opportunities abound, we are apt to be willing to pay a little and accept some small, even trivial, cost-of-doing-business for access to new powers. And pretty soon we become so dependent on our new tools that we lose the ability to thrive without them. Options become obligatory. 8 Joseph Levine, “Materialism and Qualia: The Explanatory Gap,” Pacific Philosophical Quarterly 64 (1983): 354-61.
It’s an old, old story, with many well-known chapters in evolutionary history. Most mammals can synthesize their own vitamin C, but primates, having opted for a diet composed largely of fruit, lost the innate ability. We are now obligate ingesters of vitamin C, but not obligate frugivores like our primate cousins, since we have opted for technology that allows us to make, and take, vitamins as needed. The self-perpetuating patterns that we call human beings are now dependent on clothes, cooked food, vitamins, vaccinations, ... credit cards, smartphones, and the Internet. And—tomorrow if not already today—AI. Wiener foresaw the problems that Turing and the other optimists have largely overlooked. The real danger, he said, is that such machines, though helpless by themselves, may be used by a human being or a block of human beings to increase their control over the rest of the race or that political leaders may attempt to control their populations by means not of machines themselves but through political techniques as narrow and indifferent to human possibility as if they had, in fact, been conceived mechanically. (p. 181) The power, he recognized, lay primarily in the algorithms, not the hardware they run on, although the hardware of today makes practically possible algorithms that would have seemed preposterously cumbersome in Wiener’s day. What can we say about these “techniques” that are “narrow and indifferent to human possibility”? They have been introduced again and again, some obviously benign, some obviously dangerous, and many in the omnipresent middle ground of controversy. Consider a few of the skirmishes. My late friend Joe Weizenbaum, Wiener’s successor as MIT’s Jeremiah of hi-tech, loved to observe that credit cards, whatever their virtues, also provided an inexpensive and almost foolproof way for the government, or corporations, to track the travels and habits and desires of individuals.
The anonymity of cash has been largely underappreciated, except by drug dealers and other criminals, and now it may be going extinct. This may make money laundering a more difficult technical challenge in the future, but the AI pattern finders arrayed against it have the side effect of making us all more transparent to any “block of human beings” that may “attempt to control” us. Looking to the arts, the innovation of digital audio and video recording lets us pay a small price (in the eyes of all but the most ardent audiophiles and film lovers) when we abandon analog formats, and in return provides easy—all too easy?—reproduction of artworks with almost perfect fidelity. But there is a huge hidden cost. Orwell’s Ministry of Truth is now a practical possibility. AI techniques for creating all-but-undetectable forgeries of “recordings” of encounters are now becoming available which will render obsolete the tools of investigation we have come to take for granted in the last hundred and fifty years. Will we simply abandon the brief Age of Photographic Evidence and return to the earlier world in which human memory and trust provided the gold standard, or will we develop new techniques of defense and offense in the arms race of truth? (We can imagine a return to analog film-exposed-to-light, kept in “tamper-proof” systems until shown to juries, etc., but how long would it be before somebody figured out a way to infect such systems with doubt? One of the disturbing lessons of recent experience is that the task of destroying a reputation for credibility is much less expensive than the task of protecting such a reputation.) Wiener saw the phenomenon at its most general: “... in
the long run, there is no distinction between arming ourselves and arming our enemies.” (p. 129) The Information Age is also the Dysinformation Age. What can we do? We need to rethink our priorities with the help of the passionate but flawed analyses of Wiener, Weizenbaum, and the other serious critics of our technophilia. A key phrase, it seems to me, is Wiener’s almost offhand observation, above, that “these machines” are “helpless by themselves.” As I have been arguing recently, we’re making tools, not colleagues, and the great danger is not appreciating the difference, which we should strive to accentuate, marking and defending it with political and legal innovations. Perhaps the best way to see what is being missed is to note that Alan Turing himself suffered an entirely understandable failure of imagination in his formulation of the famous Turing Test. As everyone knows, it is an adaptation of his “imitation game,” in which a man, hidden from view and communicating verbally with a judge, tries to convince the judge that he is in fact a woman, while a woman, also hidden and communicating with the judge, tries to convince the judge that she is the woman. Turing reasoned that this would be a demanding challenge for a man (or for a woman pretending to be a man), exploiting a wealth of knowledge about how the other sex thinks and acts, what they tend to favor or ignore. Surely (ding!),9 any man who could beat a woman at being perceived to be a woman would be an intelligent agent. What Turing did not foresee is the power of deep-learning AI to acquire this wealth of information in an exploitable form without having to understand it. Turing imagined an astute and imaginative (and hence conscious) agent who cunningly designed his responses based on his detailed “theory” of what women are likely to do and say. Top-down intelligent design, in short.
He certainly didn’t think that a man, winning the imitation game, would somehow become a woman; he imagined that there would still be a man’s consciousness guiding the show. The hidden premise in Turing’s almost-argument was: Only a conscious, intelligent agent could devise and control a winning strategy in the imitation game. And so it was persuasive to Turing (and others, including me, still a stalwart defender of the Turing Test) to argue that a “computing machine” that could pass as human in a contest with a human might not be conscious in just the way a human being is, but would nevertheless have to be a conscious agent of some kind. I think this is still a defensible position—the only defensible position—but you have to understand how resourceful and ingenious a judge would have to be to expose the shallowness of the facade that a deep-learning AI (a tool, not a colleague) could present. What Turing didn’t foresee is the uncanny ability of superfast computers to sift mindlessly through Big Data, of which the Internet provides an inexhaustible supply, finding probabilistic patterns in human activity that could be used to pop “authentic”-seeming responses into the output for almost any probe a judge would think to offer. Wiener also underestimates this possibility, seeing the tell-tale weakness of a machine in not being able to take into account the vast range of probability that characterizes the human situation. (p. 181) 9 The surely alarm (the habit of having a bell ring in your head whenever you see the word in an argument) is described and defended by me in Intuition Pumps and Other Tools for Thinking (2013).
But taking into account that range of probability is just where the new AI excels. The only chink in the armor of AI is that word “vast”; human possibilities, thanks to language and the culture that it spawns, are truly Vast.10 No matter how many patterns we may find with AI in the flood of data that has so far found its way onto the Internet, there are Vastly more possibilities that have never been recorded there. Only a fraction (but not a Vanishing fraction) of the world’s accumulated wisdom and design and repartee and silliness has made it onto the Internet, but probably a better tactic for the judge to adopt when confronting a candidate in the Turing Test is not to search for such items but to create them anew. AI in its current manifestations is parasitic on human intelligence. It quite indiscriminately gorges on whatever has been produced by human creators and extracts the patterns to be found there—including some of our most pernicious habits.11 These machines do not (yet) have the goals or strategies or capacities for self-criticism and innovation to permit them to transcend their databases by reflectively thinking about their own thinking and their own goals. They are, as Wiener says, helpless, not in the sense of being shackled agents or disabled agents but in the sense of not being agents at all—not having the capacity to be “moved by reasons” (as Kant put it) presented to them. It is important that we keep it that way, which will take some doing. One of the flaws in Weizenbaum’s book Computer Power and Human Reason, something I tried in vain to convince him of in many hours of discussion, is that he could never decide which of two theses he wanted to defend: AI is impossible! or AI is possible but evil! He wanted to argue, with John Searle and Roger Penrose, that “Strong AI” is impossible, but there are no good arguments for that conclusion. After all, everything we now know suggests that, as I have put it, we are robots made of robots made of robots.
. . . down to the motor proteins and their ilk, with no magical ingredients thrown in along the way. Weizenbaum’s more important and defensible message was that we should not strive to create Strong AI and should be extremely cautious about the AI systems that we can create and have already created. As one might expect, the defensible thesis is a hybrid: AI (Strong AI) is possible in principle but not desirable. The AI that’s practically possible is not necessarily evil—unless it is mistaken for Strong AI! The gap between today’s systems and the science-fictional systems dominating the popular imagination is still huge, though many folks, both lay and expert, manage to underestimate it. Let’s consider IBM’s Watson, which can stand as a worthy landmark for our imaginations for the time being. It is the result of a very large-scale R&D process extending over many person-centuries of intelligent design, and as George Church notes in these pages, it uses thousands of times more energy than a human brain (a technological limitation that, as he also notes, may be temporary). Its victory in Jeopardy! was a genuine triumph, made possible by the formulaic restrictions of the Jeopardy! rules, but in order for it to compete, even these rules had to be revised (one of 10 In Darwin’s Dangerous Idea, 1995, p. 109, I coined the capitalized version, Vast, meaning Very much more than ASTronomical, and its complement, Vanishing, to replace the usual exaggerations infinite and infinitesimal for discussions of those possibilities that are not officially infinite but nevertheless infinite for all practical purposes. 11 Aylin Caliskan-Islam, Joanna J. Bryson & Arvind Narayanan, “Semantics derived automatically from language corpora contain human-like biases,” Science, 14 April 2017, 356: 6334, pp. 183-6. DOI: 10.1126/science.aal4230.
those trade-offs: you give up a little versatility, a little humanity, and get a crowd-pleasing show). Watson is not good company, in spite of misleading ads from IBM that suggest a general conversational ability, and turning Watson into a plausibly multidimensional agent would be like turning a hand calculator into Watson. Watson could be a useful core faculty for such an agent, but more like a cerebellum or an amygdala than a mind—at best, a special-purpose subsystem that could play a big supporting role, but not remotely up to the task of framing purposes and plans and building insightfully on its conversational experiences. Why would we want to create a thinking, creative agent out of Watson? Perhaps Turing’s brilliant idea of an operational test has lured us into a trap: the quest to create at least the illusion of a real person behind the screen, bridging the “uncanny valley.” The danger, here, is that ever since Turing posed his challenge—which was, after all, a challenge to fool the judges—AI creators have attempted to paper over the valley with cutesy humanoid touches, Disneyfication effects that will enchant and disarm the uninitiated. Weizenbaum’s ELIZA was the pioneer example of such superficial illusion-making, and it was his dismay at the ease with which his laughably simple and shallow program could persuade people they were having a serious heart-to-heart conversation that first sent him on his mission. He was right to be worried. If there is one thing we have learned from the restricted Turing Test competitions for the Loebner Prize, it is that even very intelligent people who aren’t tuned in to the possibilities and shortcuts of computer programming are readily taken in by simple tricks. The attitudes of people in AI toward these methods of dissembling at the “user interface” have ranged from contempt to celebration, with a general appreciation that the tricks are not deep but can be potent.
One shift in attitude that would be very welcome is a candid acknowledgment that humanoid embellishments are false advertising—something to condemn, not applaud. How could that be accomplished? Once we recognize that people are starting to make life-or-death decisions largely on the basis of “advice” from AI systems whose inner operations are unfathomable in practice, we can see a good reason why those who in any way encourage people to put more trust in these systems than they warrant should be held morally and legally accountable. AI systems are very powerful tools—so powerful that even experts will have good reason not to trust their own judgment over the “judgments” delivered by their tools. But then, if these tool users are going to benefit, financially or otherwise, from driving these tools through terra incognita, they need to make sure they know how to do this responsibly, with maximum control and justification. Licensing and bonding operators, just as we license pharmacists (and crane operators!) and other specialists whose errors and misjudgments can have dire consequences, can, with pressure from insurance companies and other underwriters, oblige creators of AI systems to go to extraordinary lengths to search for and reveal weaknesses and gaps in their products, and to train those entitled to operate them. One can imagine a sort of inverted Turing Test in which the judge is on trial; until he or she can spot the weaknesses, the overstepped boundaries, the gaps in a system, no license to operate will be issued. The mental training required to achieve certification as a judge will be demanding. The urge to adopt the intentional stance, our normal tactic whenever we encounter what seems to be an intelligent agent, is almost overpoweringly strong. Indeed, the capacity to resist the allure of treating an apparent person as a person
is an ugly talent, reeking of racism or species-ism. Many people would find the cultivation of such a ruthlessly skeptical approach morally repugnant, and we can anticipate that even the most proficient system-users would occasionally succumb to the temptation to “befriend” their tools, if only to assuage their discomfort with the execution of their duties. No matter how scrupulously the AI designers launder the phony “human” touches out of their wares, we can expect novel habits of thought, conversational gambits and ruses, traps and bluffs to arise in this novel setting for human action. The comically long lists of known side effects of new drugs advertised on television will be dwarfed by the obligatory revelations of the sorts of questions that cannot be responsibly answered by particular systems, with heavy penalties for those who “overlook” flaws in their products. It is widely noted that a considerable part of the growing economic inequality in today’s world is due to the wealth accumulated by digital entrepreneurs; we should enact legislation that puts their deep pockets in escrow for the public good. Some of the deepest pockets are voluntarily out in front of these obligations to serve society first and make money secondarily, but we shouldn’t rely on good will alone. We don’t need artificial conscious agents. There is a surfeit of natural conscious agents, enough to handle whatever tasks should be reserved for such special and privileged entities. We need intelligent tools. 
Tools do not have rights, and should not have feelings that could be hurt, or be able to respond with resentment to “abuses” rained on them by inept users.12 One of the reasons for not making artificial conscious agents is that however autonomous they might become (and in principle, they can be as autonomous, as self-enhancing or self-creating, as any person), they would not—without special provision, which might be waived—share with us natural conscious agents our vulnerability or our mortality. I once posed a challenge to students in a seminar at Tufts I co-taught with Matthias Scheutz on artificial agents and autonomy: Give me the specs for a robot that could sign a binding contract with you—not as a surrogate for some human owner but on its own. This isn’t a question of getting it to understand the clauses or manipulate a pen on a piece of paper but of having and deserving legal status as a morally responsible agent. Small children can’t sign such contracts, nor can those disabled people whose legal status requires them to be under the care and responsibility of guardians of one sort or another. The problem for robots who might want to attain such an exalted status is that, like Superman, they are too invulnerable to be able to make a credible promise. If they were to renege, what would happen? What would be the penalty for promise-breaking? Being locked in a cell or, more plausibly, dismantled? Being locked up is barely an inconvenience for an AI unless we first install artificial wanderlust that cannot be ignored or disabled by the AI on its own (and it would be systematically difficult to make this a foolproof solution, given the presumed cunning and self-knowledge of the AI); and dismantling an AI (either a robot or a bedridden agent like Watson) is not killing it, if the information stored in its design and software is preserved. The very ease of digital recording and transmitting—the breakthrough that permits software and data to be, 12 Joanna J. Bryson, “Robots Should Be Slaves,” in Close Engagement with Artificial Companions, Yorick Wilks, ed. (Amsterdam, The Netherlands: John Benjamins, 2010), pp. 63-74; http://www.cs.bath.ac.uk/~jjb/ftp/Bryson-Slaves-Book09.html; and “Patiency Is Not a Virtue: AI and the Design of Ethical Systems,” https://www.cs.bath.ac.uk/~jjb/ftp/Bryson-Patiency-AAAISS16.pdf.
in effect, immortal—removes robots from the world of the vulnerable (at least robots of the usually imagined sorts, with digital software and memories). If this isn’t obvious, think about how human morality would be affected if we could make “backups” of people every week, say. Diving headfirst on Saturday off a high bridge without benefit of bungee cord would be a rush that you wouldn’t remember when your Friday night backup was put online Sunday morning, but you could enjoy the videotape of your apparent demise thereafter. So what we are creating are not—should not be—conscious, humanoid agents but an entirely new sort of entities, rather like oracles, with no conscience, no fear of death, no distracting loves and hates, no personality (but all sorts of foibles and quirks that would no doubt be identified as the “personality” of the system): boxes of truths (if we’re lucky) almost certainly contaminated with a scattering of falsehoods. It will be hard enough learning to live with them without distracting ourselves with fantasies about the Singularity in which these AIs will enslave us, literally. The human use of human beings will soon be changed—once again—forever, but we can take the tiller and steer between some of the hazards if we take responsibility for our trajectory.
The roboticist Rodney Brooks, featured in Errol Morris’s 1997 documentary Fast, Cheap and Out of Control along with a lion-tamer, a topiarist, and an expert on the naked mole rat, was described by one reviewer as “smiling with a wild gleam in his eye.” But that’s pretty much true of most visionaries. A few years later in his career, Brooks, as befits one of the world’s leading roboticists, suggested that “we overanthropomorphize humans, who are after all mere machines.” He went on to present a warm-hearted vision of a coming AI world in which “the distinction between us and robots is going to disappear.” He also admitted to something of a divided worldview. “Like a religious scientist, I maintain two sets of inconsistent beliefs and act on each of them in different circumstances,” he wrote. “It is this transcendence between belief systems that I think will be what enables mankind to ultimately accept robots as emotional machines, and thereafter start to empathize with them and attribute free will, respect, and ultimately rights to them.” That was in 2002. In these pages, he takes a somewhat more jaundiced, albeit narrower, view; he is alarmed by the extent to which we have come to rely on pervasive systems that are not just exploitative but also vulnerable, as a result of the too-rapid development of software engineering—an advance that seems to have outstripped the imposition of reliably effective safeguards.
THE INHUMAN MESS OUR MACHINES HAVE GOTTEN US INTO

Rodney Brooks

Rodney Brooks is a computer scientist; Panasonic Professor of Robotics, emeritus, MIT; former director, MIT Computer Science Lab; and founder, chairman, and CTO of Rethink Robotics. He is the author of Flesh and Machines.

Mathematicians and scientists are often limited in how they see the big picture, beyond their particular field, by the tools and metaphors they use in their work. Norbert Wiener is no exception, and I might guess that neither am I. When he wrote The Human Use of Human Beings, Wiener was straddling the end of the era of understanding machines and animals simply as physical processes and the beginning of our current era of understanding machines and animals as computational processes. I suspect there will be future eras whose tools will look as distinct from the tools of the two eras Wiener straddled as those tools did from each other. Wiener was a giant of the earlier era and built on the tools developed since the time of Newton and Leibniz to describe and analyze continuous processes in the physical world. In 1948 he published Cybernetics, a word he coined to describe the science of communication and control in both machines and animals. Today we would refer to the ideas in this book as control theory, an indispensable discipline for the design and analysis of physical machines, while mostly neglecting Wiener’s claims about the science of communication. Wiener’s innovations were largely driven by his work during the Second World War on mechanisms to aim and fire anti-aircraft guns. He brought mathematical rigor to the design of the sorts of technology whose design processes had been largely heuristic in nature: from the Roman waterworks through Watt’s steam engine to the early development of automobiles.
One can imagine a different contingent version of our intellectual and technological history had Alan Turing and John von Neumann, both of whom made major contributions to the foundations of computing, not appeared on the scene. Turing contributed a fundamental model of computation—now known as a Turing Machine—in his paper “On Computable Numbers, with an Application to the Entscheidungsproblem,” written and revised in 1936 and published in 1937. In these machines, a linear tape of symbols from a finite alphabet encodes the input for a computational problem and also provides the working space for the computation. A different machine was required for each separate computational problem; later work by others would show that in one particular machine, now known as a Universal Turing Machine, an arbitrary set of computing instructions could be encoded on that same tape. In the 1940s, von Neumann developed an abstract self-reproducing machine called a cellular automaton. In this case it occupied a finite subset of an infinite two-dimensional array of squares, each containing a single symbol from a finite alphabet of twenty-nine distinct symbols—the rest of the infinite array starts out blank. The single symbols in each square change in lockstep, based on a complex but finite rule about the current symbol in that square and its immediate neighbors. Under the complex rule that von Neumann developed, most of the symbols in most of the squares stay the same and a few change at each step. So when one looks at the non-blank squares, it appears that
there is a constant structure with some activity going on inside it. When von Neumann’s abstract machine reproduced, it made a copy of itself in another region of the plane. Within the “machine” was a horizontal line of squares which acted as a finite linear tape, using a subset of the finite alphabet. It was the symbols in those squares that encoded the machine of which they were a part. During the machine’s reproduction, the “tape” could move either left or right and was both interpreted (transcribed) as the instructions (translation) for the new “machine” being built and then copied (replicated)—with the new copy being placed inside the new machine for further reproduction. Francis Crick and James Watson later showed, in 1953, how such a tape could be instantiated in biology by a long DNA molecule with its finite alphabet of four nucleobases: guanine, cytosine, adenine, and thymine (G, C, A, and T).13 As in von Neumann’s machine, in biological reproduction the linear sequence of symbols in DNA is interpreted—through transcription into RNA molecules, which then are translated into proteins, the structures that make up a new cell—and the DNA is replicated and encased in the new cell. A second foundational piece of work was in a 1945 “First Draft” report on the design for a digital computer, wherein von Neumann advocated for a memory that could contain both instructions and data.14 This is now known as a von Neumann architecture computer—as distinct from a Harvard architecture computer, where there are two separate memories, one for instructions and one for data. The vast majority of computer chips built in the era of Moore’s Law are based on the von Neumann architecture, including those powering our data centers, our laptops, and our smartphones.
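The essence of the von Neumann architecture just described—one memory holding both instructions and data—can be sketched in a few lines of Python. This is an illustrative toy, not any historical machine: the three-instruction set and the memory layout are invented for the example.

```python
# Toy von Neumann machine: one flat memory holds both code and data.
# The instruction set (INC, JMP, HALT) is invented for illustration.

def run(memory, pc=0, steps=100):
    """Execute instructions stored in the same list that holds data."""
    for _ in range(steps):
        op = memory[pc]
        if op == "HALT":
            break
        if op == "INC":            # ("INC", addr): add 1 to the cell at addr
            memory[memory[pc + 1]] += 1
            pc += 2
        elif op == "JMP":          # ("JMP", addr): jump to addr
            pc = memory[pc + 1]
    return memory

# Cells 0-4 hold the program; cell 5 holds data — one shared address space.
mem = ["INC", 5, "INC", 5, "HALT", 0]
run(mem)
print(mem[5])  # -> 2
```

Because nothing in the memory distinguishes a code cell from a data cell, a program that wrote into cell 0 would rewrite its own instructions—a property whose dark side Brooks takes up below.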
Von Neumann’s digital-computer architecture is conceptually the same generalization—from early digital computers constructed with electromagnetic relays at both Harvard University and Bletchley Park—that occurs in going from a special-purpose Turing Machine to a Universal Turing Machine. Furthermore, his self-replicating automata share a fundamental similarity with both the construction of a Turing Machine and the mechanism of DNA-based reproducing biological cells. There is to this day scholarly debate over whether von Neumann saw the cross connections between these three pieces of work, Turing’s and his two. Turing’s revision of his paper was done while he and von Neumann were both at Princeton; indeed, after getting his PhD, Turing almost stayed on as von Neumann’s postdoc. Without Turing and von Neumann, the cybernetics of Wiener might have remained a dominant mode of thought and driver of technology for much longer than its brief moment of supremacy. In this imaginary version of history, we might well live today in an actual steam-punk world and not just get to observe its fantastical instantiations at Maker Faires! My point is that Wiener thought about the world—physical, biological, and (in Human Use) sociological—in a particular way. He analyzed the world as continuous variables, as he explains in chapter 1, along with a nod to thermodynamics through an overlay of Gibbs statistics. He also shoehorns in a weak and unconvincing model of information as message-passing between and among both physical and biological entities. To me, and from today’s vantage point seventy years on, his tools seem woefully

13 “A Structure for Deoxyribose Nucleic Acid,” Nature 171 (1953): 737–38.
14 https://en.wikipedia.org/wiki/First_Draft_of_a_Report_on_the_EDVAC#Controversy. Von Neumann is listed as the only author, whereas others contributed to the concepts he laid out; thus credit for the architecture has gone to him alone.
inadequate for describing the mechanisms underlying biological systems, and so he missed out on how similar mechanisms might eventually be embodied in technological computational systems—as now they have been. Today’s dominant technologies were developed in the world of Turing and von Neumann, rather than the world of Wiener. In the first industrial revolution, energy from a steam engine or a water wheel was used by human workers to replace their own energy. Instead of being a source of energy for physical work, people became modulators of how a large source of energy was used. But because steam engines and water wheels had to be large to be an efficient use of capital, and because in the 18th century the only technology for spatial distribution of energy was mechanical and worked only at very short range, many workers needed to be crowded around the source of energy. Wiener correctly argues that the ability to transmit energy as electricity caused a second industrial revolution. Now the source of energy could be distant from where it was used, and from the beginning of the 20th century, manufacturing could be much more dispersed as electrical-distribution grids were built. Wiener then argues that a further new technology, that of the nascent computational machines of his time, will provide yet another revolution. The machines he talks about seem to be both analog and (perhaps) digital in nature; and he points out, in The Human Use of Human Beings, that since they will be able to make decisions, both blue-collar and white-collar workers may be reduced to being mere cogs in a much bigger machine. He fears that humans might use and abuse one another through organizational structures that this capability will encourage. We have certainly seen this play out in the last sixty years, and that disruption is far from over. However, his physics-based view of computation protected him from realizing just how bad things might get.
He saw machines’ ability to communicate as providing a new and more inhuman way of exerting command and control. He missed that within a few decades computation systems would become more like biological systems, and it seems, from his descriptions in chapter 10 of his own work on modeling some aspects of biology, that he woefully underappreciated the many orders of magnitude of further complexity of biology over physics. We are in a much more complex situation today than he foresaw, and I am worried that it is much more pernicious than even his worst imagined fears. In the 1960s, computation became firmly based on the foundations set out by Turing and von Neumann, and it was digital computation, based on the idea of finite alphabets, which they both used. An arbitrarily long sequence, or string, formed by characters from a finite alphabet, can be encoded as a unique integer. As with Turing Machines themselves, the formalism for computation became that of computing an integer-valued function of a single integer-valued input. Turing and von Neumann both died in the 1950s, and at that time this is how they saw computation. Neither foresaw the exponential increase in computing capability that Moore’s Law would bring—nor how pervasive computing machinery would become. Nor did they foresee two developments in our modeling of computation, each of which poses a great threat to human society. The first is rooted in the abstractions they adopted. In the fifty-year, Moore’s Law–fueled race to produce software that could exploit the doubling of computer capability every two years, the typical care and certification of engineering disciplines was thrown by the wayside. Software engineering was fast and prone to failures. This
rapid development of software without standards of correctness has opened up many routes to exploit von Neumann architecture’s storage of data and instructions in the same memory. One of the most common routes, known as “buffer overrun,” involves an input number (or long string of characters) that is bigger than the programmer expected and overflows into where the instructions are stored. By carefully designing an input number that is too big by far, someone using a piece of software can infect it with instructions not intended by the programmer, and thus change what it does. This is the basis for creating a computer virus—so named for its similarity to a biological virus. The latter injects extra DNA into a cell, and that cell’s transcription and translation mechanism blindly interprets it, making proteins that may be harmful to the host cell. Furthermore, the replication mechanism for the cell takes care of multiplying the virus. Thus, a small foreign entity can take control of a much bigger entity and bend its behavior in unexpected ways. These and other forms of digital attacks have taken the security of our everyday lives from us. We rely on computers for almost everything now. We rely on computers for our infrastructure of electricity, gas, roads, cars, trains, and airplanes; these are all vulnerable. We rely on computers for our banking, our payment of bills, our retirement accounts, our mortgages, our purchasing of goods and services—these, too, are all vulnerable. We rely on computers for our entertainment, our communications both business and personal, our physical security at home, our information about the world, and our voting systems—all vulnerable. None of this will get fixed anytime soon. In the meantime, many aspects of our society are open to vicious attacks, whether by freelancing criminals or nation-state adversaries. The second development is that computation has gone beyond simply computing functions. 
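The buffer-overrun mechanism just described can be simulated in miniature. What follows is a hedged toy, not real exploit code: the memory layout and the field names are invented, and an actual overrun overwrites machine instructions rather than labeled strings.

```python
# Toy buffer overrun: a fixed-size input buffer sits directly next to the
# "instruction" cell in one shared memory. Layout and names are invented.

BUF_SIZE = 4

def load_input(memory, fields):
    """Naively copy input fields into the buffer WITHOUT a bounds check."""
    for i, field in enumerate(fields):
        memory[i] = field      # a 5th field lands in the instruction cell

# Cells 0-3 are the input buffer; cell 4 holds the program's instruction.
memory = ["", "", "", "", "PRINT_REPORT"]

load_input(memory, ["a", "b", "c", "d"])           # fits: code untouched
print(memory[4])   # -> PRINT_REPORT

load_input(memory, ["a", "b", "c", "d", "ERASE"])  # one field too many
print(memory[4])   # -> ERASE  (the program now does what the input says)
```

The fix is the bounds check the hurried programmer omitted: refuse any input longer than BUF_SIZE before copying.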
Instead, programs remain online continuously, and so they can gather data about a sequence of queries. Under the Wiener/Turing/von Neumann scheme, we might think of the communication pattern for a Web browser to be:

User: Give me Web page A.
Browser: Here is Web page A.
User: Give me Web page B.
Browser: Here is Web page B.

Now instead it can look like this:

User: Give me Web page A.
Browser: Here is Web page A. [And I will secretly remember that you asked for Web page A.]
User: Give me Web page B.
Browser: Here is Web page B. [I see a correlation between its contents and that of the earlier requested Web page A, so I will update my model of you, the user, and transmit it to the company that produced me.]
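The contrast between the two dialogues can be sketched as code. This is a sketch only: the class name and the crude interest-profiling are inventions for illustration, and real platforms build far richer models.

```python
# Stateless vs. stateful page service. The names and the crude "interest
# profile" are invented to illustrate the pattern described in the text.
from collections import Counter

def stateless_fetch(pages, url):
    """Pure function of its input: answers and remembers nothing."""
    return pages[url]

class TrackingServer:
    """Answers the same queries, but keeps state across requests."""
    def __init__(self, pages):
        self.pages = pages
        self.history = []             # the part the user never sees

    def fetch(self, url):
        self.history.append(url)      # silently remember every request
        return self.pages[url]

    def user_model(self):
        """Infer interests from the correlated sequence of requests."""
        topics = Counter(u.split("/")[0] for u in self.history)
        return topics.most_common()

pages = {"cars/review": "...", "cars/prices": "...", "news/today": "..."}
server = TrackingServer(pages)
for url in ["cars/review", "cars/prices", "news/today"]:
    server.fetch(url)
print(server.user_model())   # -> [('cars', 2), ('news', 1)]
```

The user sees identical answers from both versions; only the second accumulates a model of the person asking.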
When the machine no longer simply computes a function but instead maintains a state, it can start to make inferences about the human by the sequence of requests presented to it. And when different programs correlate across different request streams—say, correlating Web-page searches with social-media posts, or the payment for services on another platform, or the dwell time on a particular advertisement, or where the user has walked or driven with their GPS-enabled smartphone—the total system of many programs communicating with one another and with databases leads to a whole new loss of privacy. The great exploitative leap made by so many West Coast companies has been to monetize those inferences without the knowing permission of the person generating the interactions with the computing machine platforms. Wiener, Turing, and von Neumann could not foresee the complexity of those platforms, wherein the legal mumbo-jumbo of the terms-of-use contracts the humans willingly enter into, without an inkling of what they entail, leads them to give up rights they would never concede in a one-on-one interaction with another human being. The computation platforms have become a shield behind which some companies hide in order to inhumanly exploit others. In certain other countries, the governments carry out these manipulations, and there the goal is not profits but the suppression of dissent. Humankind has gotten itself into a fine pickle: We are being exploited by companies that paradoxically deliver services we crave, and at the same time our lives depend on many software-enabled systems that are open to attack. Getting ourselves out of this mess will be a long-term project. It will involve engineering, legislation, and most important, moral leadership. Moral leadership is the first and biggest challenge.
I first met Frank Wilczek in the 1980s, when he invited me to his home in Princeton to talk about anyons. “The address is 112 Mercer Street,” he wrote. “Look for the house with no driveway.” So there I was, a few hours later, in Einstein’s old living room, talking to a future recipient of the Nobel Prize in physics. If Frank was as impressed as I was by the surroundings, you’d never guess it. His only comment concerned the difficulty of finding a parking place in front of a “house with no driveway.” Unlike most theoretical physicists, Frank has long had a keen interest in AI, as witnessed in these three “Observations”:

1. “Francis Crick called it ‘the Astonishing Hypothesis’: that consciousness, also known as Mind, is an emergent property of matter,” which, if true, indicates that “all intelligence is machine intelligence. What distinguishes natural from artificial intelligence is not what it is, but only how it is made.”
2. “Artificial intelligence is not the product of an alien invasion. It is an artifact of a particular human culture and reflects the values of that culture.”
3. “David Hume’s striking statement ‘Reason Is, and Ought only to Be, the Slave of the Passions’ was written in 1738 [and] was, of course, meant to apply to human reason and human passions. ... But Hume’s logical/philosophical point remains valid for AI. Simply put: Incentives, not abstract logic, drive behavior.”

He notes that “the big story of the 20th and the 21st century is that [as] computing develops, we learn how to calculate the consequences of the [fundamental] laws better and better. There’s also a feedback cycle: When you understand matter better, you can design better computers, which will enable you to calculate better. It’s kind of an ascending helix.” Here he argues that human intelligence, for now, holds the advantage—yet our future, unbounded by our solar system and doubtless also by our galaxy, will never be realized without the help of our AIs.
THE UNITY OF INTELLIGENCE

Frank Wilczek

Frank Wilczek is Herman Feshbach Professor of Physics at MIT, recipient of the 2004 Nobel Prize in physics, and the author of A Beautiful Question: Finding Nature’s Deep Design.

I. A Simple Answer to Contentious Questions

• Can an artificial intelligence be conscious?
• Can an artificial intelligence be creative?
• Can an artificial intelligence be evil?

Those questions are often posed today, both in popular media and in scientifically informed debates. But the discussions never seem to converge. Here I’ll begin by answering them as follows: Based on physiological psychology, neurobiology, and physics, it would be very surprising if the answers were not Yes, Yes, and Yes. The reason is simple, yet profound: Evidence from those fields makes it overwhelmingly likely that there is no sharp divide between natural and artificial intelligence. In his 1994 book of that title, the renowned biologist Francis Crick proposed an “astonishing hypothesis”: that mind emerges from matter. He famously claimed that mind, in all its aspects, is “no more than the behavior of a vast assembly of nerve cells and their associated molecules.” The “astonishing hypothesis” is in fact the foundation of modern neuroscience. People try to understand how minds work by understanding how brains function; and they try to understand how brains function by studying how information is encoded in electrical and chemical signals, transformed by physical processes, and used to control behavior. In that scientific endeavor, they make no allowance for extraphysical influences. So far, in thousands of exquisite experiments, that strategy has never failed. It has never proved necessary to allow for the influence of consciousness or creativity unmoored from brain activity to explain any observed fact of psychophysics or neurobiology. No one has ever stumbled upon a power of mind separate from conventional physical events in biological organisms.
While there are many things we do not understand about brains, and about minds, the “astonishing hypothesis” has remained intact. If we broaden our view beyond neurobiology to consider the whole range of scientific experimentation, the case becomes still more compelling. In modern physics, the foci of interest are often extremely delicate phenomena. To investigate them, experimenters must take many precautions against contamination by “noise.” They often find it necessary to construct elaborate shielding against stray electric and magnetic fields; to compensate for tiny vibrations due to micro-earthquakes or passing cars; to work at extremely low temperatures and in high vacuum; and so forth. But there’s a notable exception: They have never found it necessary to make allowances for what people nearby (or, for that matter, far away) are thinking. No “thought waves,” separate from known physical processes yet capable of influencing physical events, seem to exist. That conclusion, taken at face value, erases the distinction between natural and artificial intelligence. It implies that if we were to duplicate, or accurately simulate, the physical processes occurring in a brain—as, in principle, we can—and wire up its input
and output to sense organs and muscles, then we would reproduce, in a physical artifact, the observed manifestations of natural intelligence. Nothing observable would be missing. As an observer, I'd have no less (and no more) reason to ascribe consciousness, creativity, or evil to that artifact than I do to ascribe those properties to its natural counterparts, like other human beings. Thus, by combining Crick’s “astonishing hypothesis” in neurobiology with powerful evidence from physics, we deduce that natural intelligence is a special case of artificial intelligence. That conclusion deserves a name, and I will call it “the astonishing corollary.” With that, we have the answer to our three questions. Since consciousness, creativity, and evil are obvious features of natural human intelligence, they are possible features of artificial intelligence. A hundred years ago, or even fifty, to believe the hypothesis that mind emerges from matter, and to infer our corollary that natural intelligence is a special case of artificial intelligence, would have been leaps of faith. In view of the many surrounding gaps—chasms, really—in contemporary understanding of biology and physics, they were genuinely doubtful propositions. But epochal developments in those areas have changed the picture: In biology: A century ago, not only thought but also metabolism, heredity, and perception were deeply mysterious aspects of life that defied physical explanation. Today, of course, we have extremely rich and detailed accounts of metabolism, heredity, and many aspects of perception, from the bottom up, starting at the molecular level. In physics: After a century of quantum physics and its application to materials, physicists have discovered, over and over, how rich and strange the behavior of matter can be. 
Superconductors, lasers, and many other wonders demonstrate that large assemblies of molecular units, each simple in itself, can exhibit qualitatively new, “emergent” behavior, while remaining fully obedient to the laws of physics. Chemistry, including biochemistry, is a cornucopia of emergent phenomena, all now quite firmly grounded in physics. The pioneering physicist Philip Anderson, in an essay titled “More Is Different,” offers a classic discussion of emergence. He begins by acknowledging that “the reductionist hypothesis [i.e., the completeness of physical explanations based on known interactions of simple parts] may still be a topic for controversy among philosophers, but among the great majority of active scientists I think it is accepted without question.” But he goes on to emphasize that “[t]he behavior of large and complex aggregates of elementary particles, it turns out, is not to be understood in terms of a simple extrapolation of the properties of a few particles.”15 Each new level of size and complexity supports new forms of organization, whose patterns encode information in new ways and whose behavior is best described using new concepts. Electronic computers are a magnificent example of emergence. Here, all the cards are on the table. Engineers routinely design, from the bottom up, based on known (and quite sophisticated) physical principles, machines that process information in extremely impressive ways. Your iPhone can beat you at chess, quickly collect and deliver information about anything, and take great pictures, too. Because the process whereby computers, smartphones, and other intelligent objects are designed and manufactured is completely transparent, there can be no doubt that their wonderful

15 Science 177, no. 4047 (4 August 1972): 393–96.
capabilities emerge from regular physical processes, which we can trace down to the level of electrons, photons, quarks, and gluons. Evidently, brute matter can get pretty smart. Let me summarize the argument. From two strongly supported hypotheses, we’ve drawn a straightforward conclusion:

• Human mind emerges from matter.
• Matter is what physics says it is.
• Therefore, the human mind emerges from physical processes we understand and can reproduce artificially.
• Therefore, natural intelligence is a special case of artificial intelligence.

Of course, our “astonishing corollary” could fail; the first two lines of this argument are hypotheses. But their failure would have to bring in a foundation-shattering discovery—a significant new phenomenon, with large-scale physical consequences, which takes place in unremarkable, well-studied physical circumstances (i.e., the materials, temperatures, and pressures inside human brains) yet which has somehow managed for many decades to elude determined investigators armed with sophisticated instruments. Such a discovery would be . . . astonishing.

II. The Future of Intelligence

It is part of human nature to improve on human bodies and minds. Historically, clothing, eyeglasses, and watches are examples of increasingly sophisticated augmentations that enhance our toughness, perception, and awareness. They are major improvements to the natural human endowment, whose familiarity should not blind us to their depth. Today smartphones and the Internet are bringing the human drive toward augmentation into realms more central to our identity as intelligent beings. They are giving us, in effect, quick access to a vast collective awareness and a vast collective memory.
At the same time, autonomous artificial intelligences have become world champions in a wide variety of “cerebral” games, such as chess and Go, and have taken over many sophisticated pattern-recognition tasks, such as reconstructing what happened during complex reactions at the Large Hadron Collider from a blizzard of emerging particle tracks, to find new particles; or gathering clues from fuzzy X-ray, fMRI, and other types of images, to diagnose medical problems. Where is this drive toward self-enhancement and innovation taking us? While the precise sequence of events and the timescale over which they’ll play out is impossible to predict (or, at least, beyond me), some basic considerations suggest that eventually the most powerful embodiments of mind will be quite different things from human brains as we know them today. Consider six factors whereby information-processing technology exceeds human capabilities—vastly, qualitatively, or both:

• Speed: The orchestrated motion of electrons, which is the heart of modern artificial information-processing, can be much faster than the processes of diffusion and chemical change by which brains operate. Typical modern computer clock rates approach 10 gigahertz, corresponding to 10 billion operations per second. No single measure of speed applies to the bewildering variety of brain processes, but one fundamental limitation is latency of action
potentials, which limits their spacing to a few tens per second. It is probably no accident that the “frame rate,” at which we can distinguish that movies are actually a sequence of stills, is about 40 per second. Thus, electronic processing is close to a billion times faster.

• Size: The linear dimension of a typical neuron is about 10 microns. Molecular dimensions, which set a practical limit, are about 10,000 times smaller, and artificial processing units are approaching that scale. Smallness makes communication more efficient.

• Stability: Whereas human memory is essentially continuous (analog), artificial memory can incorporate discrete (digital) features. Whereas analog quantities can erode, digital quantities can be stored, refreshed, and maintained with complete accuracy.

• Duty Cycle: Human brains grow tired with effort. They need time off to take nourishment and to sleep. They carry the burden of aging. Most profoundly: They die.

• Modularity (open architecture): Because artificial information processors can support precisely defined digital interfaces, they can readily assimilate new modules. Thus, if we want a computer to “see” ultraviolet or infrared or “hear” ultrasound, we can feed the output from an appropriate sensor directly into its “nervous system.” The architecture of brains is much more closed and opaque, and the human immune system actively resists implants.

• Quantum readiness: One case of modularity deserves special mention, because of its long-term potential. Recently physicists and information scientists have come to appreciate that the principles of quantum mechanics support new computing principles, which can empower qualitatively new forms of information processing and (plausibly) new levels of intelligence. But these possibilities rely on aspects of quantum behavior which are quite delicate and seem especially unsuitable for interfacing with the warm, wet, messy environment of human brains.
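The speed ratio claimed in the first factor can be checked with back-of-envelope arithmetic. The 30-per-second figure below is an assumed stand-in for the essay's range of a few tens of action potentials per second; the 10 GHz figure is the essay's own.

```python
# Rough check of the claimed speed gap between electronics and neurons.
clock_rate = 10e9    # operations per second (10 GHz, the essay's figure)
neuron_rate = 30     # action potentials per second (assumed midpoint)

ratio = clock_rate / neuron_rate
print(f"{ratio:.1e}")   # -> 3.3e+08: a few hundred million, within a
                        #    factor of a few of the cited "billion"
```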
Evidently, as platforms for intelligence, human brains are far from optimal. Still, although versatile housekeeping robots or mechanical soldiers would find ready, lucrative markets, at present there is no machine that approaches the kind of general-purpose human intelligence those applications would require. Despite their relative weakness on many fronts, human brains have some big advantages over their artificial competitors. Let me mention five:

• Three-dimensionality: Although, as noted, the linear dimensions of existing artificial processing units are vastly smaller than those of brains, the procedure by which they’re made—centered on lithography (basically, etching)—is essentially two-dimensional. That is revealed visibly in the geometry of computer boards and chips. Of course, one can stack boards, but the spacing between layers is much larger, and communication much less efficient, than within layers. Brains make better use of all three dimensions.

• Self-repair: Human brains can recover from, or work around, many kinds of injuries or errors. Computers often must be repaired or rebooted externally.
• Connectivity: Human neurons typically support several hundred connections (synapses). Moreover, the complex pattern of these connections is very meaningful. (See our next point.) Computer units typically make only a handful of connections, in regular, fixed patterns.

• Development (self-assembly with interactive sculpting): The human brain grows its units by cell divisions and orchestrates them into coherent structures by movement and folding. It also proliferates an abundance of connections among the cells. An important part of its sculpting occurs through active processes during infancy and childhood, as the individual interacts with his or her environment. In this process, many connections are winnowed away, while others are strengthened, depending on their effectiveness in use. Thus, the fine structure of the brain is tuned through interaction with the external world—a rich source of information and feedback!

• Integration (sensors and actuators): The human brain comes equipped with a variety of sensory organs, notably including its outgrowth eyes, and with versatile actuators, including hands that build, legs that walk, and mouths that speak. Those sensors and actuators are seamlessly integrated into the brain’s information-processing centers, having been honed over millions of years of natural selection. We interpret their raw signals and control their large-scale actions with minimal conscious attention. The flip side is that we don’t know how we do it, and the implementation is opaque. It’s proving surprisingly difficult to reach human standards on these “routine” input-output functions.

These advantages of human brains over currently engineered artifacts are profound. Human brains supply an inspiring existence proof, showing us several ways we can get more out of matter. When, if ever, will our engineering catch up? I don’t know for sure, but let me offer some informed opinions.
The challenges of three-dimensionality and, to a lesser extent, self-repair don’t look overwhelming. They present some tough engineering problems, but many incremental improvements are easy to imagine, and there are clear paths forward. And while the powers of human eyes, hands, and other sensory organs and actuators are wonderfully effective, their abilities are far from exhausting any physical limits. Optical systems can take pictures with higher resolution in space, time, and color, and in more regions of the electromagnetic spectrum; robots can move faster and be stronger; and so forth. In these domains, the components necessary for superhuman performance, along many axes, are already available. The bottleneck is getting information into and out of them, rapidly, in the language of the information-processing units. And this brings us to the remaining, and I think most profound, advantages of brains over artificial devices, which stem from their connectivity and interactive development. Those two advantages are synergistic, since it is interactive development that sculpts the massively wired but sprawling structure of the infant brain, enabled by exponential growth of neurons and synapses, to get tuned into the extraordinary instrument it becomes. Computer scientists are beginning to discover the power of the brain’s architecture: Neural nets, whose basic design, as their name suggests, was directly inspired by the brain’s, have scored some spectacular successes in game playing and pattern recognition, as noted. But present-day engineering has nothing comparable—in
the (currently) esoteric domain of self-reproducing machines—to the power and versatility of neurons and their synapses. This could become a new, great frontier of research. Here too, biology might point the way, as we come to understand biological development well enough to imitate its essence. Altogether, the advantages of artificial over natural intelligence appear permanent, while the advantages of natural over artificial intelligence, though substantial at present, appear transient. I’d guess that it will be many decades before engineering catches up, but—provided that technological progress stays vigorous, barring catastrophic wars, climate change, or plagues—no more than a few centuries. If that’s right, we can look forward to several generations during which humans, empowered and augmented by smart devices, coexist with increasingly capable autonomous AIs. There will be a complex, rapidly changing ecology of intelligence, and rapid evolution in consequence. Given the intrinsic advantages that engineered devices will eventually offer, the vanguard of that evolution will be cyborgs and superminds, rather than lightly adorned Homo sapiens. Another important impetus will come from the exploration of hostile environments, both on Earth (e.g., the deep ocean) and, especially, in space. The human body is poorly adapted to conditions outside a narrow band of temperatures, pressures, and atmospheric composition. It needs a wide variety of specific, complex nutrients, and plenty of water. Also, it is not radiation-hardened. As the manned space program has amply demonstrated, it is difficult and expensive to maintain humans outside their terrestrial comfort zone. Cyborgs or autonomous AIs could be much more effective in these explorations. Quantum AIs, with their sensitivity to noise, might even be happier in the cold and dark of deep space.
In a moving passage from his 1935 novel Odd John, science fiction’s singular genius Olaf Stapledon has his hero, a superhuman (mutant) intelligence, describe Homo sapiens as “the Archaeopteryx of the spirit.” He says this, fondly, to his friend and biographer, who is a normal human. Archaeopteryx was a noble creature, and a bridge to greater ones.
I was introduced to Max Tegmark some years ago by his MIT colleague Alan Guth, the father of the inflationary universe. A distinguished theoretical physicist and cosmologist himself, Max’s principal concern nowadays is the looming existential risk posed by the creation of an AGI (artificial general intelligence—that is, one that matches human intelligence). Four years ago, Max co-founded, with Jaan Tallinn and others, the Future of Life Institute (FLI), which bills itself as “an outreach organization working to ensure that tomorrow’s most powerful technologies are beneficial for humanity.” While on a book tour in London, he was in the midst of planning for FLI, and he admits being driven to tears in a tube station after a trip to the London Science Museum, with its exhibitions spanning the gamut of humanity’s technological achievements. Was all that impressive progress in vain? FLI’s scientific advisory board includes Elon Musk, Frank Wilczek, George Church, Stuart Russell, and the Oxford philosopher Nick Bostrom, who dreamed up an oft-quoted Gedankenexperiment that results in a world full of paper clips and nothing else, produced by an (apparently) well-meaning AGI who was just following orders. The Institute sponsors conferences (Puerto Rico 2015, Asilomar 2017) on AI safety issues and in 2018 instituted a grants competition focusing on research in aid of maximizing the societal benefits of AGI. While Max is sometimes listed—by the non-cognoscenti—on the side of the scaremongers, he believes, like Frank Wilczek, in a future that will immensely benefit from AGI if, in the attempt to create it, we can keep the human species from being sidelined.
LET’S ASPIRE TO MORE THAN MAKING OURSELVES OBSOLETE

Max Tegmark

Max Tegmark is an MIT physicist and AI researcher; president of the Future of Life Institute; scientific director of the Foundational Questions Institute; and the author of Our Mathematical Universe and Life 3.0: Being Human in the Age of Artificial Intelligence.

Although there’s great controversy about how and when AI will impact humanity, the situation is clearer from a cosmic perspective: The technology-developing life that has evolved on Earth is rushing to make itself obsolete without devoting much serious thought to the consequences. This strikes me as embarrassingly lame, given that we can create amazing opportunities for humanity to flourish like never before, if we dare to steer a more ambitious course. 13.8 billion years after its birth, our Universe has become aware of itself. On a small blue planet, tiny conscious parts of our Universe have discovered that what they once thought was the sum total of existence was a minute part of something far grander: a solar system in a galaxy in a universe with over 100 billion other galaxies, arranged into an elaborate pattern of groups, clusters, and superclusters. Consciousness is the cosmic awakening; it transformed our Universe from a mindless zombie with no self-awareness into a living ecosystem harboring self-reflection, beauty, hope, meaning, and purpose. Had that awakening never taken place, our Universe would have been pointless—a gigantic waste of space. Should our Universe go back to sleep permanently due to some cosmic calamity or self-inflicted mishap, it will become meaningless again. On the other hand, things could get even better. We don’t yet know whether we humans are the only stargazers in the cosmos, or even the first, but we’ve already learned enough about our Universe to know that it has the potential to wake up much more fully than it has thus far.
AI pioneers such as Norbert Wiener have taught us that a further awakening of our Universe’s ability to process and experience information need not require eons of additional evolution but perhaps mere decades of human scientific ingenuity. We may be like that first glimmer of self-awareness you experienced when you emerged from sleep this morning, a premonition of the much greater consciousness that would arrive once you opened your eyes and fully awoke. Perhaps artificial superintelligence will enable life to spread throughout the cosmos and flourish for billions or trillions of years, and perhaps this will be because of decisions we make here, on our planet, in our lifetime. Or humanity may soon go extinct, through some self-inflicted calamity caused by the power of our technology growing faster than the wisdom with which we manage it.

The evolving debate about AI’s societal impact

Many thinkers dismiss the idea of superintelligence as science fiction, because they view intelligence as something mysterious that can exist only in biological organisms—especially humans—and as fundamentally limited to what today’s humans can do. But from my perspective as a physicist, intelligence is simply a certain kind of information
processing performed by elementary particles moving around, and there’s no law of physics that says one can’t build machines more intelligent in every way than we are, and able to seed cosmic life. This suggests that we’ve seen just the tip of the intelligence iceberg; there’s an amazing potential to unlock the full intelligence latent in nature and use it to help humanity flourish—or flounder. Others, including some of the authors in this volume, dismiss the building of an AGI (Artificial General Intelligence—an entity able to accomplish any cognitive task at least as well as humans) not because they consider it physically impossible but because they deem it too difficult for humans to pull off in less than a century. Among professional AI researchers, both types of dismissal have become minority views because of recent breakthroughs. There is a strong expectation that AGI will be achieved within a century, and the median forecast is only decades away. A recent survey of AI researchers by Vincent Müller and Nick Bostrom concludes:

[T]he results reveal a view among experts that AI systems will probably (over 50%) reach overall human ability by 2040–50, and very likely (with 90% probability) by 2075. From reaching human ability, it will move on to superintelligence in 2 years (10%) to 30 years (75%) thereafter.16

In the cosmic perspective of gigayears, it makes little difference whether AGI arrives in thirty or three hundred years, so let’s focus on the implications rather than the timing. First, we humans discovered how to replicate some natural processes with machines, making our own heat, light, and mechanical horsepower. Gradually we realized that our bodies were also machines, and the discovery of nerve cells blurred the boundary between body and mind. Finally, we started building machines that could outperform not only our muscles but our minds as well.
We’ve now been eclipsed by machines in the performance of many narrow cognitive tasks, ranging from memorization and arithmetic to game play, and we are in the process of being overtaken in many more, from driving to investing to medical diagnosing. If the AI community succeeds in its original goal of building AGI, then we will have, by definition, been eclipsed at all cognitive tasks. This raises many obvious questions. For example, will whoever or whatever controls the AGI control Earth? Should we aim to control superintelligent machines? If not, can we ensure that they understand, adopt, and retain human values? As Norbert Wiener put it in The Human Use of Human Beings:

Woe to us if we let [the machine] decide our conduct, unless we have previously examined the laws of its action, and know fully that its conduct will be carried out on principles acceptable to us! On the other hand, the machine . . . , which can learn and can make decisions on the basis of its learning, will in no way be obliged to make such decisions as we should have made, or will be acceptable to us.

16 Vincent C. Müller & Nick Bostrom, “Future Progress in Artificial Intelligence: A Survey of Expert Opinion,” in Fundamental Issues of Artificial Intelligence, Vincent C. Müller, ed. (Springer International Publishing Switzerland, 2016), pp. 555–72. https://nickbostrom.com/papers/survey.pdf.
And who are the “us”? Who should deem “such decisions . . . acceptable”? Even if future powers decide to help humans survive and flourish, how will we find meaning and purpose in our lives if we aren’t needed for anything? The debate about the societal impact of AI has changed dramatically in the last few years. In 2014, what little public talk there was of AI risk tended to be dismissed as Luddite scaremongering, for one of two logically incompatible reasons: (1) AGI was overhyped and wouldn’t happen for at least another century. (2) AGI would probably happen sooner but was virtually guaranteed to be beneficial. Today, talk of AI’s societal impact is everywhere, and work on AI safety and AI ethics has moved into companies, universities, and academic conferences. The controversial position on AI safety research is no longer to advocate for it but to dismiss it. Whereas the open letter that emerged from the 2015 Puerto Rico AI conference (and helped mainstream AI safety) spoke only in vague terms about the importance of keeping AI beneficial, the 2017 Asilomar AI Principles (see below) had real teeth: They explicitly mention recursive self-improvement, superintelligence, and existential risk, and were signed by AI industry leaders and over a thousand AI researchers from around the world. Nonetheless, most discussion is limited to the near-term impact of narrow AI, and the broader community pays only limited attention to the dramatic transformations that AGI may soon bring to life on Earth. Why?

Why we’re rushing to make ourselves obsolete, and why we avoid talking about it

First of all, there’s simple economics. Whenever we figure out how to make another type of human work obsolete by building machines that do it better and cheaper, most of society gains: Those who build and use the machines make profits, and consumers get more affordable products.
This will be as true of future investor AGIs and scientist AGIs as it was of weaving machines, excavators, and industrial robots. In the past, displaced workers usually found new jobs, but this basic economic incentive will remain even if that is no longer the case. The existence of affordable AGI means, by definition, that all jobs can be done more cheaply by machines, so anyone claiming that “people will always find new well-paying jobs” is in effect claiming that AI researchers will fail to build AGI. Second, Homo sapiens is by nature curious, which will motivate the scientific quest for understanding intelligence and developing AGI even without economic incentives. Although curiosity is one of the most celebrated human attributes, it can cause problems when it fosters technology we haven’t yet learned how to manage wisely. Sheer scientific curiosity without profit motive contributed to the discovery of nuclear weapons and tools for engineering pandemics, so it’s not unthinkable that the old adage “Curiosity killed the cat” will turn out to apply to the human species as well. Third, we’re mortal. This explains the near unanimous support for developing new technologies that help us live longer, healthier lives, which strongly motivates current AI research. AGI can clearly aid medical research even more. Some thinkers even aspire to near immortality via cyborgization or uploading. We’re thus on the slippery slope toward AGI, with strong incentives to keep sliding downward, even though the consequence will by definition be our economic obsolescence. We will no longer be needed for anything, because all jobs can be done
more efficiently by machines. The successful creation of AGI would be the biggest event in human history, so why is there so little serious discussion of what it might lead to? Here again, the answer involves multiple reasons. First, as Upton Sinclair famously quipped, “It is difficult to get a man to understand something, when his salary depends on his not understanding it.”17 For example, spokesmen for tech companies or university research groups often claim there are no risks attached to their activities even if they privately think otherwise. Sinclair’s observation may help explain not only reactions to risks from smoking and climate change but also why some treat technology as a new religion whose central articles of faith are that more technology is always better and whose heretics are clueless scaremongering Luddites. Second, humans have a long track record of wishful thinking, flawed extrapolation of the past, and underestimation of emerging technologies. Darwinian evolution endowed us with powerful fear of concrete threats, not of abstract threats from future technologies that are hard to visualize or even imagine. Consider trying to warn people in 1930 of a future nuclear arms race, when you couldn’t show them a single nuclear explosion video and nobody even knew how to build such weapons. Even top scientists can underestimate uncertainty, making forecasts that are either too optimistic—where are those fusion reactors and flying cars?—or too pessimistic. Ernest Rutherford, arguably the greatest nuclear physicist of his time, said in 1933—less than twenty-four hours before Leo Szilard conceived of the nuclear chain reaction—that nuclear energy was “moonshine.” Essentially nobody at that time saw the nuclear arms race coming. Third, psychologists have discovered that we tend to avoid thinking of disturbing threats when we believe there’s nothing we can do about them anyway.
In this case, however, there are many constructive things we can do, if we can get ourselves to start thinking about the issue.

What can we do?

I’m advocating a strategy change from “Let’s rush to build technology that makes us obsolete—what could possibly go wrong?” to “Let’s envision an inspiring future and steer toward it.” To motivate the effort required for steering, this strategy begins by envisioning an enticing destination. Although Hollywood’s futures tend to be dystopian, the fact is that AGI can help life flourish as never before. Everything I love about civilization is the product of intelligence, so if we can amplify our own intelligence with AGI, we have the potential to solve today’s and tomorrow’s thorniest problems, including disease, climate change, and poverty. The more detailed we can make our shared positive visions for the future, the more motivated we will be to work together to realize them. What should we do in terms of steering? The twenty-three Asilomar principles adopted in 2017 offer plenty of guidance, including these short-term goals: (1) An arms race in lethal autonomous weapons should be avoided. (2) The economic prosperity created by AI should be shared broadly, to benefit all of humanity.

17 Upton Sinclair, I, Candidate for Governor: And How I Got Licked (Berkeley, CA: University of California Press, 1994), p. 109.
(3) Investments in AI should be accompanied by funding for research on ensuring its beneficial use. . . . How can we make future AI systems highly robust, so that they do what we want without malfunctioning or getting hacked?18 The first two involve not getting stuck in suboptimal Nash equilibria. An out-of-control arms race in lethal autonomous weapons that drives the price of automated anonymous assassination toward zero will be very hard to stop once it gains momentum. The second goal would require reversing the current trend in some Western countries where sectors of the population are getting poorer in absolute terms, fueling anger, resentment, and polarization. Unless the third goal can be met, all the wonderful AI technology we create might harm us, either accidentally or deliberately. AI safety research must be carried out with a strict deadline in mind: Before AGI arrives, we need to figure out how to make AI understand, adopt, and retain our goals. The more intelligent and powerful machines get, the more important it becomes to align their goals with ours. As long as we build relatively dumb machines, the question isn’t whether human goals will prevail but merely how much trouble the machines can cause before we solve the goal-alignment problem. If a superintelligence is ever unleashed, however, it will be the other way around: Since intelligence is the ability to accomplish goals, a superintelligent AI is by definition much better at accomplishing its goals than we humans are at accomplishing ours, and will therefore prevail. In other words, the real risk with AGI isn’t malice but competence. A superintelligent AGI will be extremely good at accomplishing its goals, and if those goals aren’t aligned with ours, we’re in trouble. People don’t think twice about flooding anthills to build hydroelectric dams, so let’s not place humanity in the position of those ants.
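The "suboptimal Nash equilibrium" logic behind the arms-race worry can be made concrete with a toy payoff matrix. The numbers below are invented purely for illustration; the structure is the standard prisoner's dilemma:

```python
# Illustrative sketch (payoff numbers invented) of an arms race as a
# suboptimal Nash equilibrium. Each side chooses "restrain" or "arm";
# entries are (row player's payoff, column player's payoff).
payoffs = {
    ("restrain", "restrain"): (3, 3),  # mutual restraint: both do well
    ("restrain", "arm"):      (0, 4),  # the unarmed side is exposed
    ("arm",      "restrain"): (4, 0),
    ("arm",      "arm"):      (1, 1),  # the arms race: both worse off
}

def best_response(opponent_move):
    """Return the move that maximizes our payoff against a fixed opponent move."""
    return max(("restrain", "arm"),
               key=lambda move: payoffs[(move, opponent_move)][0])

# "arm" is the best response to either opponent choice, so (arm, arm) is
# the unique Nash equilibrium, even though (restrain, restrain) pays both
# sides strictly more. That is what makes the race hard to stop.
assert best_response("restrain") == "arm"
assert best_response("arm") == "arm"
```

The point of the sketch: once both sides are arming, neither can improve its own payoff by unilaterally restraining, which is why coordination (such as the Asilomar principle quoted above) rather than individual virtue is needed to escape the equilibrium.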
Most researchers argue that if we end up creating superintelligence, we should make sure it’s what AI-safety pioneer Eliezer Yudkowsky has termed “friendly AI”—AI whose goals are in some deep sense beneficial. The moral question of what these goals should be is just as urgent as the technical questions about goal alignment. For example, what sort of society are we hoping to create, where we find meaning and purpose in our lives even though we, strictly speaking, aren’t needed? I’m often given the following glib response to this question: “Let’s build machines that are smarter than us and then let them figure out the answer!” This mistakenly equates intelligence with morality. Intelligence isn’t good or evil but morally neutral. It’s simply an ability to accomplish complex goals, good or bad. We can’t conclude that things would have been better if Hitler had been more intelligent. Indeed, postponing work on ethical issues until after goal-aligned AGI is built would be irresponsible and potentially disastrous. A perfectly obedient superintelligence whose goals automatically align with those of its human owner would be like Nazi SS-Obersturmbannführer Adolf Eichmann on steroids. Lacking moral compass or inhibitions of its own, it would, with ruthless efficiency, implement its owner’s goals, whatever they might be.19 When I speak of the need to analyze technology risk, I’m sometimes accused of scaremongering. But here at MIT, where I work, we know that such risk analysis isn’t scaremongering: It’s safety engineering. Before the moon-landing mission, NASA

18 https://futureoflife.org/ai-principles/
19 See, for example, Hannah Arendt, Eichmann in Jerusalem: A Report on the Banality of Evil (New York: Penguin Classics, 2006).
systematically thought through everything that could possibly go wrong when putting astronauts on top of a 110-meter rocket full of highly flammable fuel and launching them to a place where nobody could help them—and there were lots of things that could go wrong. Was this scaremongering? No, this was the safety engineering that ensured the mission’s success. Similarly, we should analyze what could go wrong with AI to ensure that it goes right.

Outlook

In summary, if our technology outpaces the wisdom with which we manage it, it can lead to our extinction. It’s already caused the extinction of 20 to 50 percent of all species on Earth, by some estimates,20 and it would be ironic if we’re next in line. It would also be pathetic, given that the opportunities offered by AGI are literally astronomical, potentially enabling life to flourish for billions of years not only on Earth but also throughout much of our cosmos. Instead of squandering this opportunity through unscientific risk denial and poor planning, let’s be ambitious! Homo sapiens is inspiringly ambitious, as reflected in William Ernest Henley’s famous lines from Invictus: “I am the master of my fate, / I am the captain of my soul.” Rather than drifting like a rudderless ship toward our own obsolescence, let’s take on and overcome the technical and societal challenges standing between us and a good high-tech future. What about the existential challenges related to morality, goals, and meaning? There’s no meaning encoded in the laws of physics, so instead of passively waiting for our Universe to give meaning to us, let’s acknowledge and celebrate that it’s we conscious beings who give meaning to our Universe. Let’s create our own meaning, based on something more profound than having jobs. AGI can enable us to finally become the masters of our own destiny. Let’s make that destiny a truly inspiring one!

20 See Elizabeth Kolbert, The Sixth Extinction: An Unnatural History (New York: Henry Holt, 2014).
Jaan Tallinn grew up in Estonia, becoming one of its few computer game developers, when that nation was still a Soviet Socialist Republic. Here he compares the dissidents who brought down the Iron Curtain to the dissidents who are sounding the alarm about rapid advances in artificial intelligence. He locates the roots of the current AI dissidence, paradoxically, among such pioneers of the AI field as Wiener, Alan Turing, and I. J. Good. Jaan’s preoccupation is with existential risk, AI being among the most extreme of many. In 2012, he co-founded the Centre for the Study of Existential Risk—an interdisciplinary research institute that works to mitigate risks “associated with emerging technologies and human activity”—at the University of Cambridge, along with philosopher Huw Price and Martin Rees, the Astronomer Royal. He once described himself to me as “a convinced consequentialist”—convinced enough to have given away much of his entrepreneurial wealth to the Future of Life Institute (of which he is a co-founder), the Machine Intelligence Research Institute, and other such organizations working on risk reduction. Max Tegmark has written about him: “If you’re an intelligent life-form reading this text millions of years from now and marveling at how life is flourishing, you may owe your existence to Jaan.” On a recent visit to London, Jaan and I participated in an AI panel for the Serpentine Gallery’s Marathon at London’s City Hall, under the aegis of Hans Ulrich Obrist (another contributor to this volume). This being the art world, there was a glamorous dinner party that night in a mansion filled with London’s beautiful people—artists, fashion models, oligarchs, stars of stage and screen. After working the room in his unaffected manner (“Hi, I’m Jaan”), he suddenly said, “Time for hip-hop dancing,” dropped to the floor on one hand, and began demonstrating his spectacular moves to the bemused A-listers.
Then off he went into the dance-club subculture, which is apparently how he ends every evening when he’s on the road. Who knew?
DISSIDENT MESSAGES

Jaan Tallinn

Jaan Tallinn, a computer programmer, theoretical physicist, and investor, is a co-developer of Skype and Kazaa.

In March 2009, I found myself in a bland franchise eatery next to a noisy California freeway. I was there to meet a young man whose blog I had been following. To make himself recognizable, he wore a button with a text on it: Speak the truth even if your voice trembles. His name was Eliezer Yudkowsky, and we spent the next four hours discussing the message he had for the world—a message that had brought me to that eatery and would end up dominating my subsequent work.

The First Message: the Soviet Occupation

In The Human Use of Human Beings, Norbert Wiener looked at the world through the lens of communication. He saw a universe that was marching to the tune of the second law of thermodynamics toward its inevitable heat death. In such a universe, the only (meta)stable entities are messages—patterns of information that propagate through time, like waves propagating across the surface of a lake. Even we humans can be considered messages, because the atoms in our bodies are too fleeting to attach our identities to. Instead, we are the “message” that our bodily functions maintain. As Wiener put it: “It is the pattern maintained by this homeostasis, which is the touchstone of our personal identity.” I’m more used to treating processes and computation as the fundamental building blocks of the world. That said, Wiener’s lens brings out some interesting aspects of the world which might otherwise have remained in the background and which to a large degree shaped my life. These are two messages, both of which have their roots in the Second World War. They started out as quiet dissident messages—messages that people didn’t pay much attention to, even if they silently and perhaps subconsciously concurred. The first message was: The Soviet Union is composed of a series of illegitimate occupations. These occupations must end.
As an Estonian, I grew up behind the Iron Curtain and had a front row seat when it fell. I heard this first message in the nostalgic reminiscences of my grandparents and in between the harsh noises jamming the Voice of America. It grew louder during the Gorbachev era, as the state became more lenient in its treatment of dissidents, and reached a crescendo in the Estonian Singing Revolution of the late 1980s. In my teens, I witnessed the message spread out across widening circles of people, starting with the active dissidents, who had voiced it for half a century at great cost to themselves, proceeding to the artists and literati, and ending up among the Party members and politicians who had switched sides. This new elite comprised an eclectic mix of people: those original dissidents who had managed to survive the repression, public intellectuals, and (to the great annoyance of the surviving dissidents) even former Communists. The remaining dogmatists—even the prominent ones—were eventually marginalized, some of them retreating to Russia. Interestingly, as the message propagated from one group to the next, it evolved. It started in pure and uncompromising form (“The occupation must end!”) among the dissidents who considered the truth more important than their personal freedom. The
mainstream groups, who had more to lose, initially qualified and diluted the message, taking positions like, “It would make sense in the long term to delegate control over local matters.” (There were always exceptions: Some public intellectuals proclaimed the original dissident message verbatim.) Finally, the original message—being, simply, true—won out over its diluted versions. Estonia regained its independence in 1991, and the last Soviet troops left three years later.

The people who took the risk and spoke the truth in Estonia and elsewhere in the Eastern Bloc played a monumental role in the eventual outcome—an outcome that changed the lives of hundreds of millions of people, myself included. They spoke the truth, even as their voices trembled.

The Second Message: AI Risk

My exposure to the second revolutionary message was via Yudkowsky’s blog—the blog that compelled me to reach out and arrange that meeting in California. The message was: Continued progress in AI can precipitate a change of cosmic proportions—a runaway process that will likely kill everyone. We need to put in a lot of extra effort to avoid that outcome.

After my meeting with Yudkowsky, the first thing I did was try to interest my Skype colleagues and close collaborators in his warning. I failed. The message was too crazy, too dissident. Its time had not yet come.

Only later did I learn that Yudkowsky wasn’t the original dissident speaking this particular truth. In April 2000, Wired ran a lengthy opinion piece titled “Why the Future Doesn’t Need Us,” by Bill Joy, co-founder and chief scientist of Sun Microsystems. He warned:

Accustomed to living with almost routine scientific breakthroughs, we have yet to come to terms with the fact that the most compelling 21st-century technologies—robotics, genetic engineering, and nanotechnology—pose a different threat than the technologies that have come before.
Specifically, robots, engineered organisms, and nanobots share a dangerous amplifying factor: They can self-replicate.... [O]ne bot can become many, and quickly get out of control.

Apparently, Joy’s broadside caused a lot of furor but little action. More surprising to me, though, was that the AI-risk message arose almost simultaneously with the field of computer science. In a 1951 lecture, Alan Turing announced: “[I]t seems probable that once the machine thinking method had started, it would not take long to outstrip our feeble powers.... At some stage, therefore, we should have to expect the machines to take control....”[21] A decade or so later, his Bletchley Park colleague I. J. Good wrote, “The first ultraintelligent machine is the last invention that man need ever make, provided that the machine is docile enough to tell us how to keep it under control.”[22]

[21] Posthumously reprinted in Philosophia Mathematica (3), vol. 4 (1996), 256–60.
[22] Irving John Good, “Speculations Concerning the First Ultraintelligent Machine,” Advances in Computers, vol. 6 (Academic Press, 1965), pp. 31–88.

Indeed, I counted half a dozen places in The Human Use of Human Beings where Wiener hinted at one or another aspect of the Control Problem. (“The machine like the djinnee, which can learn and can make decisions on the basis of
its learning, will in no way be obliged to make such decisions as we should have made, or will be acceptable to us.”) Apparently, the original dissidents promulgating the AI-risk message were the AI pioneers themselves!

Evolution’s Fatal Mistake

There have been many arguments, some sophisticated and some less so, for why the Control Problem is real and not some science-fiction fantasy. Allow me to offer one that illustrates the magnitude of the problem.

For the last hundred thousand years, the world (meaning the Earth, but the argument extends to the solar system and possibly even to the entire universe) has been in the human-brain regime. In this regime, the brains of Homo sapiens have been the most sophisticated future-shaping mechanisms (indeed, some have called them the most complicated objects in the universe). Initially, we didn’t use them for much beyond survival and tribal politics in a band of foragers, but now their effects are surpassing those of natural evolution. The planet has gone from producing forests to producing cities.

As predicted by Turing, once we have superhuman AI (“the machine thinking method”), the human-brain regime will end. Look around you—you’re witnessing the final decades of a hundred-thousand-year regime. This thought alone should give people some pause before they dismiss AI as just another tool. One of the world’s leading AI researchers recently confessed to me that he would be greatly relieved to learn that human-level AI was impossible for us to create.

Of course, it might still take us a long time to develop human-level AI. But we have reason to suspect that this is not the case. After all, it didn’t take long, in relative terms, for evolution—the blind and clumsy optimization process—to create human-level intelligence once it had animals to work with.
Or multicellular life, for that matter: Getting cells to stick together seems to have been much harder for evolution to accomplish than creating humans once there were multicellular organisms. Not to mention that our level of intelligence was limited by such grotesque factors as the width of the birth canal. Imagine an AI developer being stopped in his tracks because he couldn’t manage to adjust the font size on his computer!

There’s an interesting symmetry here: In fashioning humans, evolution created a system that is, at least in many important dimensions, a more powerful planner and optimizer than evolution itself is. We are the first species to understand that we’re the product of evolution. Moreover, we’ve created many artifacts (radios, firearms, spaceships) that evolution would have little hope of creating. Our future, therefore, will be determined by our own decisions and no longer by biological evolution. In that sense, evolution has fallen victim to its own Control Problem. We can only hope that we’re smarter than evolution in that sense. We are smarter, of course, but will that be enough? We’re about to find out.

The Present Situation

So here we are, more than half a century after the original warnings by Turing, Wiener, and Good, and a decade after people like me started paying attention to the AI-risk message. I’m glad to see that we’ve made a lot of progress in confronting this issue, but we’re definitely not there yet. AI risk, although no longer a taboo topic, is not yet fully
appreciated among AI researchers. AI risk is not yet common knowledge either. In relation to the timeline of the first dissident message, I’d say we’re around the year 1988, when raising the Soviet-occupation topic was no longer a career-ending move but you still had to hedge your position somewhat. I hear similar hedging now—statements like, “I’m not concerned about superintelligent AI, but there are some real ethical issues in increased automation,” or “It’s good that some people are researching AI risk, but it’s not a short-term concern,” or even the very reasonable-sounding, “These are small-probability scenarios, but their potentially high impact justifies the attention.”

As far as message propagation goes, though, we are getting close to the tipping point. A recent survey of AI researchers who published at the two major international AI conferences in 2015 found that 40 percent now think that risks from highly advanced AI are either “an important problem” or “among the most important problems in the field.”[23]

Of course, just as there were dogmatic Communists who never changed their position, it’s all but guaranteed that some people will never admit that AI is potentially dangerous. Many of the deniers of the first kind came from the Soviet nomenklatura; similarly, the AI-risk deniers often have financial or other pragmatic motives. One of the leading motives is corporate profits. AI is profitable, and even in instances where it isn’t, it’s at least a trendy, forward-looking enterprise with which to associate your company. So a lot of the dismissive positions are products of corporate PR and legal machinery. In some very real sense, big corporations are nonhuman machines that pursue their own interests—interests that might not align with those of any particular human working for them.
As Wiener observed in The Human Use of Human Beings: “When human atoms are knit into an organization in which they are used, not in their full right as responsible human beings, but as cogs and levers and rods, it matters little that their raw material is flesh and blood.”

Another strong incentive to turn a blind eye to AI risk is the (very human) curiosity that knows no bounds. “When you see something that is technically sweet, you go ahead and do it and you argue about what to do about it only after you have had your technical success. That is the way it was with the atomic bomb,” said J. Robert Oppenheimer. His words were echoed recently by Geoffrey Hinton, arguably the inventor of deep learning, in the context of AI risk: “I could give you the usual arguments, but the truth is that the prospect of discovery is too sweet.”

Undeniably, we have both entrepreneurial attitude and scientific curiosity to thank for almost all the nice things we take for granted in the modern era. It’s important to realize, though, that progress does not owe us a good future. In Wiener’s words, “It is possible to believe in progress as a fact without believing in progress as an ethical principle.”

Ultimately, we don’t have the luxury of waiting until all the corporate heads and AI researchers are willing to concede the AI risk. Imagine yourself sitting in a plane about to take off. Suddenly there’s an announcement that 40 percent of the experts believe there’s a bomb onboard. At that point, the course of action is already clear, and sitting there waiting for the remaining 60 percent to come around isn’t part of it.

[23] Katja Grace et al., “When Will AI Exceed Human Performance? Evidence from AI Experts,” https://arxiv.org/pdf/1705.08807.pdf.
Calibrating the AI-Risk Message

While uncannily prescient, the AI-risk message from the original dissidents has a giant flaw—as does the version dominating current public discourse: Both considerably understate the magnitude of the problem, as well as AI’s potential upside. The message, in other words, does not adequately convey the stakes of the game.

Wiener primarily warned of the social risks—risks stemming from careless integration of machine-generated decisions into governance processes and from misuse (by humans) of such automated decision making. Likewise, the current “serious” debate about AI risks focuses mostly on things like technological unemployment or biases in machine learning. While such discussions can be valuable and address pressing short-term problems, they are also stunningly parochial. I’m reminded of Yudkowsky’s quip in a blog post: “[A]sking about the effect of machine superintelligence on the conventional human labor market is like asking how US–Chinese trade patterns would be affected by the Moon crashing into the Earth. There would indeed be effects, but you’d be missing the point.”

In my view, the central point of AI risk is that superintelligent AI is an environmental risk. Allow me to explain. In his “Parable of the Sentient Puddle,” Douglas Adams describes a puddle that wakes up in the morning and finds himself in a hole that fits him “staggeringly well.” From that observation, the puddle concludes that the world must have been made for him. Therefore, writes Adams, “the moment he disappears catches him rather by surprise.” To assume that AI risks are limited to adverse social developments is to make a similar mistake. The harsh reality is that the universe was not made for us; instead, we are fine-tuned by evolution to a very narrow range of environmental parameters. For instance, we need the atmosphere at ground level to be roughly at room temperature, at about 100 kPa of pressure, and with a sufficient concentration of oxygen.
Any disturbance of this precarious equilibrium, even a temporary one, and we die in a matter of minutes. Silicon-based intelligence does not share such concerns about the environment. That’s why it’s much cheaper to explore space using machine probes rather than “cans of meat.” Moreover, Earth’s current environment is almost certainly suboptimal for what a superintelligent AI will greatly care about: efficient computation. Hence we might find our planet suddenly going from anthropogenic global warming to machinogenic global cooling. One big challenge that AI-safety research needs to deal with is how to constrain a potentially superintelligent AI—an AI with a much larger footprint than our own—from rendering our environment uninhabitable for biological life-forms.

Interestingly, given that the most potent sources both of AI research and of AI-risk dismissals are under big corporate umbrellas, if you squint hard enough, the “AI as an environmental risk” message looks like the chronic concern about corporations skirting their environmental responsibilities.

Conversely, the worry about AI’s social effects also misses most of the upside. It’s hard to overemphasize how tiny and parochial the future of our planet is, compared with the full potential of humanity. On astronomical timescales, our planet will be gone soon (unless we tame the sun, also a distinct possibility), and almost all the resources—atoms and free energy—to sustain civilization in the long run are in deep space. Eric Drexler, the inventor of nanotechnology, has recently been popularizing the
concept of “Pareto-topia”: the idea that AI, if done right, can bring about a future in which everyone’s lives are hugely improved, a future where there are no losers. A key realization here is that what chiefly prevents humanity from achieving its full potential might be our instinctive sense that we’re in a zero-sum game—a game in which players are supposed to eke out small wins at the expense of others. Such an instinct is seriously misguided and destructive in a “game” where everything is at stake and the payoff is literally astronomical. There are many more star systems in our galaxy alone than there are people on Earth.

Hope

As of this writing, I’m cautiously optimistic that the AI-risk message can save humanity from extinction, just as the Soviet-occupation message ended up liberating hundreds of millions of people. As of 2015, it had reached and converted 40 percent of AI researchers. It wouldn’t surprise me if a new survey now showed that the majority of AI researchers believe AI safety to be an important issue. I’m delighted to see the first technical AI-safety papers coming out of DeepMind, OpenAI, and Google Brain, and the collaborative problem-solving spirit flourishing among the AI-safety research teams in these otherwise very competitive organizations. The world’s political and business elite are also slowly waking up: AI safety has been covered in reports and presentations by the Institute of Electrical and Electronics Engineers (IEEE), the World Economic Forum, and the Organisation for Economic Co-operation and Development (OECD).
Even the recent (July 2017) Chinese AI manifesto contained dedicated sections on “AI safety supervision” and “[d]evelop[ing] laws, regulations, and ethical norms,” and on establishing “an AI security and evaluation system” to, among other things, “[e]nhance the awareness of risk.”

I very much hope that a new generation of leaders who understand the AI Control Problem and AI as the ultimate environmental risk can rise above the usual tribal, zero-sum games and steer humanity past these dangerous waters we are in—thereby opening our way to the stars that have been waiting for us for billions of years. Here’s to our next hundred thousand years! And don’t hesitate to speak the truth, even if your voice trembles.
Throughout his career, whether studying language, advocating a realistic biology of mind, or examining the human condition through the lens of humanistic Enlightenment ideas, psychologist Steven Pinker has embraced and championed a naturalistic understanding of the universe and the computational theory of mind. He is perhaps the first internationally recognized public intellectual whose recognition is based on the advocacy of empirically based thinking about language, mind, and human nature.

“Just as Darwin made it possible for a thoughtful observer of the natural world to do without creationism,” he says, “Turing and others made it possible for a thoughtful observer of the cognitive world to do without spiritualism.”

In the debate about AI risk, he argues against prophecies of doom and gloom, noting that they spring from the worst of our psychological biases—exemplified particularly by media reports: “Disaster scenarios are cheap to play out in the probability-free zone of our imaginations, and they can always find a worried, technophobic, or morbidly fascinated audience.” Hence, over the centuries: Pandora, Faust, the Sorcerer’s Apprentice, Frankenstein, the population bomb, resource depletion, HAL, suitcase nukes, the Y2K bug, and engulfment by nanotechnological grey goo.

“A characteristic of AI dystopias,” he points out, “is that they project a parochial alpha-male psychology onto the concept of intelligence.... History does turn up the occasional megalomaniacal despot or psychopathic serial killer, but these are products of a history of natural selection shaping testosterone-sensitive circuits in a certain species of primate, not an inevitable feature of intelligent systems.”

In the present essay, he applauds Wiener’s belief in the strength of ideas vis-à-vis the encroachment of technology. As Wiener so aptly put it, “The machine’s danger to society is not from the machine itself but from what man makes of it.”
TECH PROPHECY AND THE UNDERAPPRECIATED CAUSAL POWER OF IDEAS

Steven Pinker

Steven Pinker, the Johnstone Family Professor in the Department of Psychology at Harvard University, is an experimental psychologist who conducts research in visual cognition, psycholinguistics, and social relations. He is the author of eleven books, including The Blank Slate, The Better Angels of Our Nature, and, most recently, Enlightenment Now: The Case for Reason, Science, Humanism, and Progress.

Artificial intelligence is an existence proof of one of the great ideas in human history: that the abstract realm of knowledge, reason, and purpose does not consist of an élan vital, an immaterial soul, or miraculous powers of neural tissue. Rather, it can be linked to the physical realm of animals and machines via the concepts of information, computation, and control. Knowledge can be explained as patterns in matter or energy that stand in systematic relations with states of the world, with mathematical and logical truths, and with one another. Reasoning can be explained as transformations of that knowledge by physical operations that are designed to preserve those relations. Purpose can be explained as the control of operations to effect changes in the world, guided by discrepancies between its current state and a goal state.

Naturally evolved brains are just the most familiar systems that achieve intelligence through information, computation, and control. Humanly designed systems that achieve intelligence vindicate the notion that information processing is sufficient to explain it—the notion that the late Jerry Fodor dubbed the computational theory of mind. The touchstone for this volume, Norbert Wiener’s The Human Use of Human Beings, celebrated this intellectual accomplishment, to which Wiener himself was a foundational contributor.
A potted history of the mid-20th-century revolution that gave the world the computational theory of mind might credit Claude Shannon and Warren Weaver for explaining knowledge and communication in terms of information. It might credit Alan Turing and John von Neumann for explaining intelligence and reasoning in terms of computation. And it ought to give Wiener credit for explaining the hitherto mysterious world of purposes, goals, and teleology in terms of the technical concepts of feedback, control, and cybernetics (in its original sense of “governing” the operation of a goal-directed system). “It is my thesis,” he announced, “that the physical functioning of the living individual and the operation of some of the newer communication machines are precisely parallel in their analogous attempts to control entropy through feedback”—the staving off of life-sapping entropy being the ultimate goal of human beings.

Wiener applied the ideas of cybernetics to a third system: society. The laws, norms, customs, media, forums, and institutions of a complex community could be considered channels of information propagation and feedback that allow a society to ward off disorder and pursue certain goals. This is a thread that runs through the book, and one Wiener himself may have seen as its principal contribution. In his explanation of feedback, he wrote, “This complex of behavior is ignored by the average man, and in particular does not play the role that it should in our habitual analysis of society; for just as individual physical responses may be seen from this point of view, so may the organic responses of society itself.”
Indeed, Wiener gave scientific teeth to the idea that in the workings of history, politics, and society, ideas matter. Beliefs, ideologies, norms, laws, and customs, by regulating the behavior of the humans who share them, can shape a society and power the course of historical events as surely as the phenomena of physics affect the structure and evolution of the solar system. To say that ideas—and not just weather, resources, geography, or weaponry—can shape history is not woolly mysticism. It is a statement of the causal powers of information instantiated in human brains and exchanged in networks of communication and feedback. Deterministic theories of history, whether they identify the causal engine as technological, climatological, or geographic, are belied by the causal power of ideas. The effects of these ideas can include unpredictable lurches and oscillations that arise from positive feedback or from miscalibrated negative feedback.

An analysis of society in terms of its propagation of ideas also gave Wiener a guideline for social criticism. A healthy society—one that gives its members the means to pursue life in defiance of entropy—allows information sensed and contributed by its members to feed back and affect how the society is governed. A dysfunctional society invokes dogma and authority to impose control from the top down. Wiener thus described himself as “a participant in a liberal outlook” and devoted most of the moral and rhetorical energy in the book (both the 1950 and 1954 editions) to denouncing communism, fascism, McCarthyism, militarism, and authoritarian religion (particularly Catholicism and Islam), and to warning that political and scientific institutions were becoming too hierarchical and insular.

Wiener’s book is also, here and there, an early exemplar of an increasingly popular genre: tech prophecy.
Prophecy not in the sense of mere prognostication but in the Old Testament sense of dark warnings of catastrophic payback for the decadence of one’s contemporaries. Wiener warned against the accelerating nuclear arms race, against technological change imposed without regard to human welfare (“[W]e must know as scientists what man’s nature is and what his built-in purposes are”), and against what today is called the value-alignment problem: that “the machine like the djinnee, which can learn and can make decisions on the basis of its learning, will in no way be obliged to make such decisions as we should have made, or will be acceptable to us.” In the darker, 1950 edition, he warned of a “threatening new Fascism dependent on the machine à gouverner.”

Wiener’s tech prophecy harks back to the Romantic movement’s rebellion against the “dark Satanic mills” of the Industrial Revolution, and perhaps even earlier, to the archetypes of Prometheus, Pandora, and Faust. And today it has gone into high gear. Jeremiahs, many of them (like Wiener) from the worlds of science and technology, have sounded alarms about nanotechnology, genetic engineering, Big Data, and particularly artificial intelligence. Several contributors to this volume characterize Wiener’s book as a prescient example of tech prophecy and amplify his dire worries.

Yet the two moral themes of The Human Use of Human Beings—the liberal defense of an open society and the dystopian dread of runaway technology—are in tension. A society with channels of feedback that maximize human flourishing will have mechanisms in place, and can adapt them to changing circumstances, in a way that can domesticate technology to human purposes. There’s nothing idealistic or mystical about this; as Wiener emphasized, ideas, norms, and institutions are themselves a form of technology, consisting of patterns of information distributed across brains. The
possibility that machines threaten a new fascism must be weighed against the vigor of the liberal ideas, institutions, and norms that Wiener championed throughout the book. The flaw in today’s dystopian prophecies is that they disregard the existence of these norms and institutions, or drastically underestimate their causal potency. The result is a technological determinism whose dark predictions are repeatedly refuted by the course of events. The numbers “1984” and “2001” are good reminders. I will consider two examples.

Tech prophets often warn of a “surveillance state” in which a government empowered by technology will monitor and interpret all private communications, allowing it to detect dissent and subversion as they arise and to make resistance to state power futile. Orwell’s telescreens are the prototype, and in 1976 Joseph Weizenbaum, one of the gloomiest tech prophets of all time, warned my class of graduate students not to pursue automatic speech recognition because government surveillance was its only conceivable application. Though I am on record as an outspoken civil libertarian, deeply concerned with contemporary threats to free speech, I lose no sleep over technological advances in the Internet, video, or artificial intelligence. The reason is that almost all the variation across time and space in freedom of thought is driven by differences in norms and institutions and almost none of it by differences in technology. Though one can imagine hypothetical combinations of the most malevolent totalitarians with the most advanced technology, in the real world it’s the norms and laws we should be vigilant about, not the tech.

Consider variation across time. If, as Orwell hinted, advancing technology were a prime enabler of political repression, then Western societies should have gotten more and more restrictive of speech over the centuries, with a dramatic worsening in the second half of the 20th century continuing into the 21st.
That’s not how history unfolded. It was the centuries when communication was implemented by quills and inkwells that had autos-da-fé and the jailing or guillotining of Enlightenment thinkers. During World War I, when the state of the art was the wireless, Bertrand Russell was jailed for his pacifist opinions. In the 1950s, when computers were room-size accounting machines, hundreds of liberal writers and scholars were professionally punished. Yet in the technologically accelerating, hyperconnected 21st century, 18 percent of social-science professors are Marxists[24]; the President of the United States is nightly ridiculed by television comedians as a racist, pervert, and moron; and technology’s biggest threat to political discourse comes from amplifying too many dubious voices rather than suppressing enlightened ones.

Now consider variation across place. Western countries at the technological frontier consistently get the highest scores in indexes of democracy and human rights, while many backward strongman states are at the bottom, routinely jailing or killing government critics.

The lack of a correlation between technology and repression is unsurprising when you analyze the channels of information flow in any human society. For dissidents to be influential, they have to get their message out to a wide network via whatever channels of communication are available—pamphleteering, soapbox oration, subversive soirées in cafés and pubs, word of mouth. These channels enmesh influential dissidents in a broad social network, which makes them easy to identify and track down.

[24] Neil Gross and Solon Simmons, “The Social and Political Views of American College and University Professors,” in N. Gross and S. Simmons, eds., Professors and Their Politics (Baltimore: Johns Hopkins University Press, 2014).
All the more so when dictators rediscover the time-honored technique of weaponizing the people against one another by punishing those who don’t denounce or punish others. In contrast, technologically advanced societies have long had the means to install Internet-connected, government-monitored surveillance cameras in every bar and bedroom. Yet that has not happened, because democratic governments (even the current American administration, with its flagrantly antidemocratic impulses) lack the will and the means to enforce such surveillance on an obstreperous people accustomed to saying what they want. Occasionally, warnings of nuclear, biological, or cyberterrorism goad government security agencies into measures such as hoovering up mobile-phone metadata, but these ineffectual measures, more theater than oppression, have had no significant effect on either security or freedom. Ironically, tech prophecy plays a role in encouraging such measures: By sowing panic about supposed existential threats such as suitcase nuclear bombs and bioweapons assembled in teenagers’ bedrooms, the prophets put pressure on governments to prove they’re doing something, anything, to protect the American people.

It’s not that political freedom takes care of itself. It’s that the biggest threats lie in the networks of ideas, norms, and institutions that allow information to feed back (or not) on collective decisions and understanding. As opposed to the chimerical technological threats, one real threat today is oppressive political correctness, which has choked the range of publicly expressible hypotheses, terrified many intelligent people out of entering the intellectual arena, and triggered a reactionary backlash. Another real threat is the combination of prosecutorial discretion with an expansive lawbook filled with vague statutes.
The result is that every American unwittingly commits “three felonies a day” (as the title of a book by civil libertarian Harvey Silverglate puts it) and is in jeopardy of imprisonment whenever it suits the government’s needs. It’s this prosecutorial weaponry that makes Big Brother all-powerful, not telescreens. The activism and polemicizing directed against government surveillance programs would be better directed at the government’s overweening legal powers.

The other focus of much tech prophecy today is artificial intelligence, whether in the original sci-fi dystopia of computers running amok and enslaving us in an unstoppable quest for domination, or the newer version in which they subjugate us by accident, single-mindedly seeking some goal we give them regardless of its side effects on human welfare (the value-alignment problem adumbrated by Wiener). Here again, both threats strike me as chimerical, growing from a narrow technological determinism that neglects the networks of information and control in an intelligent system like a computer or brain, and in a society as a whole.

The subjugation fear is based on a muzzy conception of intelligence that owes more to the Great Chain of Being and a Nietzschean will to power than to a Wienerian analysis of intelligence and purpose in terms of information, computation, and control. In these horror scenarios, intelligence is portrayed as an all-powerful, wish-granting potion that agents possess in different amounts. Humans have more of it than animals, and an artificially intelligent computer or robot will have more of it than humans. Since we humans have used our moderate endowment to domesticate or exterminate less well-endowed animals (and since technologically advanced societies have enslaved or annihilated technologically primitive ones), it follows that a supersmart AI would do the same to us. Since an AI will think millions of times faster than we do, and use its
superintelligence to recursively improve its superintelligence, from the instant it is turned on we will be powerless to stop it.

But these scenarios are based on a confusion of intelligence with motivation—of beliefs with desires, inferences with goals, the computation elucidated by Turing with the control elucidated by Wiener. Even if we did invent superhumanly intelligent robots, why would they want to enslave their masters or take over the world? Intelligence is the ability to deploy novel means to attain a goal. But the goals are extraneous to the intelligence: Being smart is not the same as wanting something. It just so happens that the intelligence in Homo sapiens is a product of Darwinian natural selection, an inherently competitive process. In the brains of that species, reasoning comes bundled with goals such as dominating rivals and amassing resources. But it’s a mistake to confuse a circuit in the limbic brain of a certain species of primate with the very nature of intelligence. There is no law of complex systems that says that intelligent agents must turn into ruthless megalomaniacs.

A second misconception is to think of intelligence as a boundless continuum of potency, a miraculous elixir with the power to solve any problem, attain any goal. The fallacy leads to nonsensical questions like when an AI will “exceed human-level intelligence,” and to the image of an “artificial general intelligence” (AGI) with God-like omniscience and omnipotence. Intelligence is a contraption of gadgets: software modules that acquire, or are programmed with, knowledge of how to pursue various goals in various domains. People are equipped to find food, win friends and influence people, charm prospective mates, bring up children, move around in the world, and pursue other human obsessions and pastimes.
Computers may be programmed to take on some of these problems (like recognizing faces), not to bother with others (like charming mates), and to take on still other problems that humans can’t solve (like simulating the climate or sorting millions of accounting records). The problems are different, and the kinds of knowledge needed to solve them are different. But instead of acknowledging the centrality of knowledge to intelligence, the dystopian scenarios confuse an artificial general intelligence of the future with Laplace’s demon, the mythical being that knows the location and momentum of every particle in the universe and feeds them into equations for physical laws to calculate the state of everything at any time in the future. For many reasons, Laplace’s demon will never be implemented in silicon. A real-life intelligent system has to acquire information about the messy world of objects and people by engaging with it one domain at a time, the cycle being governed by the pace at which events unfold in the physical world. That’s one of the reasons that understanding does not obey Moore’s Law: Knowledge is acquired by formulating explanations and testing them against reality, not by running an algorithm faster and faster. Devouring the information on the Internet will not confer omniscience either: Big Data is still finite data, and the universe of knowledge is infinite. A third reason to be skeptical of a sudden AI takeover is that it takes too seriously the inflationary phase in the AI hype cycle in which we are living today. Despite the progress in machine learning, particularly multilayered artificial neural networks, current AI systems are nowhere near achieving general intelligence (if that concept is even coherent). 
Instead, they are restricted to problems that consist of mapping well-defined inputs to well-defined outputs in domains where gargantuan training sets are available, in which the metric for success is immediate and precise, in which the environment doesn’t
change, and in which no stepwise, hierarchical, or abstract reasoning is necessary. Many of the successes come not from a better understanding of the workings of intelligence but from the brute-force power of faster chips and Bigger Data, which allow the programs to be trained on millions of examples and generalize to similar new ones. Each system is an idiot savant, with little ability to leap to problems it was not set up to solve, and a brittle mastery of those it was. And to state the obvious, none of these programs has made a move toward taking over the lab or enslaving its programmers. Even if an artificial intelligence system tried to exercise a will to power, without the cooperation of humans it would remain an impotent brain in a vat. A superintelligent system, in its drive for self-improvement, would somehow have to build the faster processors that it would run on, the infrastructure that feeds it, and the robotic effectors that connect it to the world—all impossible unless its human victims worked to give it control of vast portions of the engineered world. Of course, one can always imagine a Doomsday Computer that is malevolent, universally empowered, always on, and tamperproof. The way to deal with this threat is straightforward: Don’t build one. What about the newer AI threat, the value-alignment problem, foreshadowed in Wiener’s allusions to stories of the Monkey’s Paw, the genie, and King Midas, in which a wisher rues the unforeseen side effects of his wish? The fear is that we might give an AI system a goal and then helplessly stand by as it relentlessly and literal-mindedly implemented its interpretation of that goal, the rest of our interests be damned. If we gave an AI the goal of maintaining the water level behind a dam, it might flood a town, not caring about the people who drowned. If we gave it the goal of making paper clips, it might turn all the matter in the reachable universe into paper clips, including our possessions and bodies. 
If we asked it to maximize human happiness, it might implant us all with intravenous dopamine drips, or rewire our brains so we were happiest sitting in jars, or, if it had been trained on the concept of happiness with pictures of smiling faces, tile the galaxy with trillions of nanoscopic pictures of smiley-faces.

Fortunately, these scenarios are self-refuting. They depend on the premises that (1) humans are so gifted that they can design an omniscient and omnipotent AI, yet so idiotic that they would give it control of the universe without testing how it works; and (2) the AI would be so brilliant that it could figure out how to transmute elements and rewire brains, yet so imbecilic that it would wreak havoc based on elementary blunders of misunderstanding. The ability to choose an action that best satisfies conflicting goals is not an add-on to intelligence that engineers might forget to install and test; it is intelligence. So is the ability to interpret the intentions of a language user in context.

When we put aside fantasies like digital megalomania, instant omniscience, and perfect knowledge and control of every particle in the universe, artificial intelligence is like any other technology. It is developed incrementally, designed to satisfy multiple conditions, tested before it is implemented, and constantly tweaked for efficacy and safety. The last criterion is particularly significant. The culture of safety in advanced societies is an example of the humanizing norms and feedback channels that Wiener invoked as a potent causal force and advocated as a bulwark against the authoritarian or exploitative implementation of technology. Whereas at the turn of the 20th century Western societies tolerated shocking rates of mutilation and death in industrial, domestic, and transportation accidents, over the course of the century the value of human life
increased. As a result, governments and engineers used feedback from accident statistics to implement countless regulations, devices, and design changes that made technology progressively safer. The fact that some regulations (such as the ban on using a cell phone near a gas pump) are ludicrously risk-averse underscores the point that we have become a society obsessed with safety, with fantastic benefits as a result: Rates of industrial, domestic, and transportation fatalities have fallen by more than 95 (and often 99) percent since their highs in the first half of the 20th century.[25] Yet tech prophets of malevolent or oblivious artificial intelligence write as if this momentous transformation never happened and one morning engineers will hand total control of the physical world to untested machines, heedless of the human consequences.

Norbert Wiener explained ideas, norms, and institutions in terms of computational and cybernetic processes that were scientifically intelligible and causally potent. He explained human beauty and value as “a local and temporary fight against the Niagara of increasing entropy” and expressed the hope that an open society, guided by feedback on human well-being, would enhance that value. Fortunately his belief in the causal power of ideas counteracted his worries about the looming threat of technology. As he put it, “the machine’s danger to society is not from the machine itself but from what man makes of it.” It is only by remembering the causal power of ideas that we can accurately assess the threats and opportunities presented by artificial intelligence today.

[25] Steven Pinker, “Safety,” Enlightenment Now: The Case for Reason, Science, Humanism, and Progress (New York: Penguin, 2018).
The most significant developments in the sciences today (i.e., those that affect the lives of everybody on the planet) are about, informed by, or implemented through advances in software and computation. Central to the future of these developments is physicist David Deutsch, the founder of the field of quantum computation, whose 1985 paper on universal quantum computers was the first full treatment of the subject; the Deutsch-Jozsa algorithm was the first quantum algorithm to demonstrate the enormous potential power of quantum computation. When he initially proposed it, quantum computation seemed practically impossible. But the explosion in the construction of simple quantum computers and quantum communication systems never would have taken place without his work. He has made many other important contributions in areas such as quantum cryptography and the many-worlds interpretation of quantum theory. In a philosophical paper (with Artur Ekert), he appealed to the existence of a distinctive quantum theory of computation to argue that our knowledge of mathematics is derived from, and subordinate to, our knowledge of physics (even though mathematical truth is independent of physics). Because he has spent a good part of his working life changing people's worldviews, his recognition among his peers as an intellectual goes well beyond his scientific achievement. He argues (following Karl Popper) that scientific theories are “bold conjectures,” not derived from evidence but only tested by it. His two main lines of research at the moment—qubit-field theory and constructor theory—may well yield important extensions of the computational idea.

In the following essay, he more or less aligns himself with those who see human-level artificial intelligence as promising us a better world rather than the Apocalypse. In fact, he pleads for AGI to be, in effect, given its head, free to conjecture—a proposition that several other contributors to this book would consider dangerous.
BEYOND REWARD AND PUNISHMENT

David Deutsch

David Deutsch is a quantum physicist and a member of the Centre for Quantum Computation at the Clarendon Laboratory, Oxford University. He is the author of The Fabric of Reality and The Beginning of Infinity.

First Murderer: We are men, my liege.
Macbeth: Ay, in the catalogue ye go for men,
As hounds and greyhounds, mongrels, spaniels, curs,
Shoughs, water-rugs, and demi-wolves are clept
All by the name of dogs.
—William Shakespeare, Macbeth

For most of our species’ history, our ancestors were barely people. This was not due to any inadequacy in their brains. On the contrary, even before the emergence of our anatomically modern human sub-species, they were making things like clothes and campfires, using knowledge that was not in their genes. It was created in their brains by thinking, and preserved by individuals in each generation imitating their elders. Moreover, this must have been knowledge in the sense of understanding, because it is impossible to imitate novel complex behaviors like those without understanding what the component behaviors are for.[26] Such knowledgeable imitation depends on successfully guessing explanations, whether verbal or not, of what the other person is trying to achieve and how each of his actions contributes to that—for instance, when he cuts a groove in some wood, gathers dry kindling to put in it, and so on. The complex cultural knowledge that this form of imitation permitted must have been extraordinarily useful. It drove rapid evolution of anatomical changes, such as increased memory capacity and more gracile (less robust) skeletons, appropriate to an ever more technology-dependent lifestyle. No nonhuman ape today has this ability to imitate novel complex behaviors. Nor does any present-day artificial intelligence. But our pre-sapiens ancestors did. Any ability based on guessing must include means of correcting one’s guesses, since most guesses will be wrong at first.
(There are always many more ways of being wrong than right.) Bayesian updating is inadequate, because it cannot generate novel guesses about the purpose of an action, only fine-tune—or, at best, choose among—existing ones. Creativity is needed. As the philosopher Karl Popper explained, creative criticism, interleaved with creative conjecture, is how humans learn one another’s behaviors, including language, and extract meaning from one another’s utterances.[27]

[26] “Aping” (imitating certain behaviors without understanding) uses inborn hacks such as the mirror-neuron system. But behaviors imitated that way are drastically limited in complexity. See Richard Byrne, “Imitation as Behaviour Parsing,” Phil. Trans. R. Soc. B 358:1431, 529–36 (2003).

[27] Karl Popper, Conjectures and Refutations (1963).
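Deutsch's point about Bayesian updating (it can re-weight or select among guesses already on the table, but never originate a new one) can be seen in a minimal sketch. The hypotheses and numbers below are invented for illustration:

```python
# A minimal sketch (illustrative, not from the essay): Bayesian updating
# re-weights a FIXED set of hypotheses; it never adds a new one.
def bayes_update(prior, likelihoods):
    """prior: {hypothesis: P(h)}; likelihoods: {hypothesis: P(data | h)}."""
    unnorm = {h: prior[h] * likelihoods[h] for h in prior}
    total = sum(unnorm.values())
    return {h: p / total for h, p in unnorm.items()}

# Invented guesses about why someone cuts a groove in a piece of wood:
posterior = {"making a fire drill": 0.5, "decorating": 0.5}
for _ in range(5):  # repeated observations favor the first hypothesis
    posterior = bayes_update(posterior, {"making a fire drill": 0.9,
                                         "decorating": 0.1})

# The correct explanation, if absent from the initial set, can never emerge:
assert set(posterior) == {"making a fire drill", "decorating"}
```

However much evidence arrives, the posterior only shifts probability between the two preset guesses; conjecturing a third explanation is a step the update rule simply has no operation for.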
Those are also the processes by which all new knowledge is created: They are how we innovate, make progress, and create abstract understanding for its own sake. This is human-level intelligence: thinking. It is also, or should be, the property we seek in artificial general intelligence (AGI). Here I'll reserve the term “thinking” for processes that can create understanding (explanatory knowledge). Popper’s argument implies that all thinking entities—human or not, biological or artificial—must create such knowledge in fundamentally the same way. Hence understanding any of those entities requires traditionally human concepts such as culture, creativity, disobedience, and morality— which justifies using the uniform term people to refer to all of them. Misconceptions about human thinking and human origins are causing corresponding misconceptions about AGI and how it might be created. For example, it is generally assumed that the evolutionary pressure that produced modern humans was provided by the benefits of having an ever greater ability to innovate. But if that were so, there would have been rapid progress as soon as thinkers existed, just as we hope will happen when we create artificial ones. If thinking had been commonly used for anything other than imitating, it would also have been used for innovation, even if only by accident, and innovation would have created opportunities for further innovation, and so on exponentially. But instead, there were hundreds of thousands of years of near stasis. Progress happened only on timescales much longer than people’s lifetimes, so in a typical generation no one benefited from any progress. Therefore, the benefits of the ability to innovate can have exerted little or no evolutionary pressure during the biological evolution of the human brain. That evolution was driven by the benefits of preserving cultural knowledge. Benefits to the genes, that is. Culture, in that era, was a very mixed blessing to individual people. 
Their cultural knowledge was indeed good enough to enable them to outclass all other large organisms (they rapidly became the top predator, etc.), even though it was still extremely crude and full of dangerous errors. But culture consists of transmissible information—memes—and meme evolution, like gene evolution, tends to favor high-fidelity transmission. And high-fidelity meme transmission necessarily entails the suppression of attempted progress. So it would be a mistake to imagine an idyllic society of hunter-gatherers, learning at the feet of their elders to recite the tribal lore by heart, being content despite their lives of suffering and grueling labor and despite expecting to die young and in agony of some nightmarish disease or parasite. Because, even if they could conceive of nothing better than such a life, those torments were the least of their troubles. For suppressing innovation in human minds (without killing them) is a trick that can be achieved only by human action, and it is an ugly business.

This has to be seen in perspective. In the civilization of the West today, we are shocked by the depravity of, for instance, parents who torture and murder their children for not faithfully enacting cultural norms. And even more by societies and subcultures where that is commonplace and considered honorable. And by dictatorships and totalitarian states that persecute and murder entire harmless populations for behaving differently. We are ashamed of our own recent past, in which it was honorable to beat children bloody for mere disobedience. And before that, to own human beings as slaves. And before that, to burn people to death for being infidels, to the applause and amusement of the public. Steven Pinker’s book The Better Angels of Our Nature contains accounts of horrendous evils that were normal in historical civilizations. Yet even they
did not extinguish innovation as efficiently as it was extinguished among our forebears in prehistory for thousands of centuries.[28] That is why I say that prehistoric people, at least, were barely people. Both before and after becoming perfectly human both physiologically and in their mental potential, they were monstrously inhuman in the actual content of their thoughts. I’m not referring to their crimes or even their cruelty as such: Those are all too human. Nor could mere cruelty have reduced progress that effectively. Things like “the thumbscrew and the stake / For the glory of the Lord”[29] were for reining in the few deviants who had somehow escaped mental standardization, which would normally have taken effect long before they were in danger of inventing heresies.

From the earliest days of thinking onward, children must have been cornucopias of creative ideas and paragons of critical thought—otherwise, as I said, they could not have learned language or other complex culture. Yet, as Jacob Bronowski stressed in The Ascent of Man:

For most of history, civilisations have crudely ignored that enormous potential. . . . [C]hildren have been asked simply to conform to the image of the adult. . . . The girls are little mothers in the making. The boys are little herdsmen. They even carry themselves like their parents.

But of course, they weren’t just “asked” to ignore their enormous potential and conform faithfully to the image fixed by tradition: They were somehow trained to be psychologically unable to deviate from it. By now, it is hard for us even to conceive of the kind of relentless, finely tuned oppression required to reliably extinguish, in everyone, the aspiration to progress and replace it with dread and revulsion at any novel behavior. In such a culture, there can have been no morality other than conformity and obedience, no other identity than one’s status in a hierarchy, no mechanisms of cooperation other than punishment and reward.
So everyone had the same aspiration in life: to avoid the punishments and get the rewards. In a typical generation, no one invented anything, because no one aspired to anything new, because everyone had already despaired of improvement being possible. Not only was there no technological innovation or theoretical discovery, there were no new worldviews, styles of art, or interests that could have inspired those. By the time individuals grew up, they had in effect been reduced to AIs, programmed with the exquisite skills needed to enact that static culture and to inflict on the next generation their inability even to consider doing otherwise.

A present-day AI is not a mentally disabled AGI, so it would not be harmed by having its mental processes directed still more narrowly to meeting some predetermined criterion. “Oppressing” Siri with humiliating tasks may be weird, but it is not immoral, nor does it harm Siri. On the contrary, all the effort that has ever increased the capabilities of AIs has gone into narrowing their range of potential “thoughts.” For example, take chess engines. Their basic task has not changed from the outset: Any chess position has a finite tree of possible continuations; the task is to find one that leads to a predefined goal (a checkmate, or failing that, a draw). But the tree is far too big to

[28] Matt Ridley, in The Rational Optimist, rightly stresses the positive effect of population on the rate of progress. But that has never yet been the biggest factor: Consider, say, ancient Athens versus the rest of the world at the time.

[29] Alfred, Lord Tennyson, The Revenge (1878).
search exhaustively. Every improvement in chess-playing AIs, between Alan Turing’s first design for one in 1948 and today’s, has been brought about by ingeniously confining the program’s attention (or making it confine its attention) ever more narrowly to branches likely to lead to that immutable goal. Then those branches are evaluated according to that goal. That is a good approach to developing an AI with a fixed goal under fixed constraints. But if an AGI worked like that, the evaluation of each branch would have to constitute a prospective reward or threatened punishment. And that is diametrically the wrong approach if we’re seeking a better goal under unknown constraints—which is the capability of an AGI.

An AGI is certainly capable of learning to win at chess—but also of choosing not to. Or deciding in mid-game to go for the most interesting continuation instead of a winning one. Or inventing a new game. A mere AI is incapable of having any such ideas, because the capacity for considering them has been designed out of its constitution. That disability is the very means by which it plays chess. An AGI is capable of enjoying chess, and of improving at it because it enjoys playing. Or of trying to win by causing an amusing configuration of pieces, as grand masters occasionally do. Or of adapting notions from its other interests to chess. In other words, it learns and plays chess by thinking some of the very thoughts that are forbidden to chess-playing AIs. An AGI is also capable of refusing to display any such capability. And then, if threatened with punishment, of complying, or rebelling.

Daniel Dennett, in his essay for this volume, suggests that punishing an AGI is impossible:

[L]ike Superman, they are too invulnerable to be able to make a credible promise. ... What would be the penalty for promise-breaking? Being locked in a cell or, more plausibly, dismantled? . . .
The very ease of digital recording and transmitting—the breakthrough that permits software and data to be, in effect, immortal—removes robots from the world of the vulnerable. . . .

But this is not so. Digital immortality (which is on the horizon for humans, too, perhaps sooner than AGI) does not confer this sort of invulnerability. Making a (running) copy of oneself entails sharing one’s possessions with it somehow—including the hardware on which the copy runs—so making such a copy is very costly for the AGI. Similarly, courts could, for instance, impose fines on a criminal AGI which would diminish its access to physical resources, much as they do for humans. Making a backup copy to evade the consequences of one’s crimes is similar to what a gangster boss does when he sends minions to commit crimes and take the fall if caught: Society has developed legal mechanisms for coping with this. But anyway, the idea that it is primarily for fear of punishment that we obey the law and keep promises effectively denies that we are moral agents. Our society could not work if that were so. No doubt there will be AGI criminals and enemies of civilization, just as there are human ones. But there is no reason to suppose that an AGI created in a society consisting primarily of decent citizens, and raised without what William Blake called “mind-forg’d manacles,” will in general impose such manacles on itself (i.e., become irrational) and/or choose to be an enemy of civilization.
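For concreteness, the fixed-goal search that Deutsch describes in chess engines can be sketched with alpha-beta pruning, a classic technique by which a program is made to confine its attention to branches that can still matter to its immutable goal. This is an illustrative toy over an abstract tree of numeric evaluations, not a real chess engine:

```python
# Toy illustration (not from the essay): alpha-beta search over an abstract
# game tree. A node is either a numeric leaf evaluation or a list of child
# nodes. Negamax form: each player maximizes the negation of the opponent's
# best achievable score.
def alphabeta(node, alpha=float("-inf"), beta=float("inf")):
    if isinstance(node, (int, float)):   # leaf: static evaluation
        return node
    best = float("-inf")
    for child in node:
        best = max(best, -alphabeta(child, -beta, -alpha))
        alpha = max(alpha, best)
        if alpha >= beta:                # prune: this branch cannot change
            break                        # the final choice, so skip the rest
    return best

# A two-ply tree: the mover secures value 3 by picking the first branch.
print(alphabeta([[3, 12, 8], [2, 4, 6], [14, 5, 2]]))  # prints 3
```

Every refinement of such an engine (sharper evaluations, deeper search, more aggressive pruning) narrows its attention further toward the one fixed goal; nothing in the loop can decide that winning is not worth pursuing, which is exactly Deutsch's contrast with an AGI.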
The moral component, the cultural component, the element of free will—all make the task of creating an AGI fundamentally different from any other programming task. It’s much more akin to raising a child. Unlike all present-day computer programs, an AGI has no specifiable functionality—no fixed, testable criterion for what shall be a successful output for a given input. Having its decisions dominated by a stream of externally imposed rewards and punishments would be poison to such a program, as it is to creative thought in humans. Setting out to create a chess-playing AI is a wonderful thing; setting out to create an AGI that cannot help playing chess would be as immoral as raising a child to lack the mental capacity to choose his own path in life. Such a person, like any slave or brainwashing victim, would be morally entitled to rebel. And sooner or later, some of them would, just as human slaves do. AGIs could be very dangerous—exactly as humans are. But people—human or AGI—who are members of an open society do not have an inherent tendency to violence. The feared robot apocalypse will be avoided by ensuring that all people have full “human” rights, as well as the same cultural membership as humans. Humans living in an open society—the only stable kind of society—choose their own rewards, internal as well as external. Their decisions are not, in the normal course of events, determined by a fear of punishment. Current worries about rogue AGIs mirror those that have always existed about rebellious youths—namely, that they might grow up deviating from the culture’s moral values. But today the source of all existential dangers from the growth of knowledge is not rebellious youths but weapons in the hands of the enemies of civilization, whether these weapons are mentally warped (or enslaved) AGIs, mentally warped teenagers, or any other weapon of mass destruction. 
Fortunately for civilization, the more a person’s creativity is forced into a monomaniacal channel, the more it is impaired in regard to overcoming unforeseen difficulties, just as happened for thousands of centuries. The worry that AGIs are uniquely dangerous because they could run on ever better hardware is a fallacy, since human thought will be accelerated by the same technology. We have been using tech-assisted thought since the invention of writing and tallying. Much the same holds for the worry that AGIs might get so good, qualitatively, at thinking, that humans would be to them as insects are to humans. All thinking is a form of computation, and any computer whose repertoire includes a universal set of elementary operations can emulate the computations of any other. Hence human brains can think anything that AGIs can, subject only to limitations of speed or memory capacity, both of which can be equalized by technology.

Those are the simple dos and don’ts of coping with AGIs. But how do we create an AGI in the first place? Could we cause them to evolve from a population of ape-type AIs in a virtual environment? If such an experiment succeeded, it would be the most immoral in history, for we don’t know how to achieve that outcome without creating vast suffering along the way. Nor do we know how to prevent the evolution of a static culture. Elementary introductions to computers explain them as TOM, the Totally Obedient Moron—an inspired acronym that captures the essence of all computer programs to date: They have no idea what they are doing or why. So it won’t help to give AIs more and more predetermined functionalities in the hope that these will eventually constitute Generality—the elusive G in AGI. We are aiming for the opposite, a DATA: a Disobedient Autonomous Thinking Application.
How does one test for thinking? By the Turing Test? Unfortunately, that requires a thinking judge. One might imagine a vast collaborative project on the Internet, where an AI hones its thinking abilities in conversations with human judges and becomes an AGI. But that assumes, among other things, that the longer the judge is unsure whether the program is a person, the closer it is to being a person. There is no reason to expect that. And how does one test for disobedience? Imagine Disobedience as a compulsory school subject, with daily disobedience lessons and a disobedience test at the end of term. (Presumably with extra credit for not turning up for any of that.) This is paradoxical. So, despite its usefulness in other applications, the programming technique of defining a testable objective and training the program to meet it will have to be dropped. Indeed, I expect that any testing in the process of creating an AGI risks being counterproductive, even immoral, just as in the education of humans. I share Turing’s supposition that we’ll know an AGI when we see one, but this partial ability to recognize success won’t help in creating the successful program. In the broadest sense, a person’s quest for understanding is indeed a search problem, in an abstract space of ideas far too large to be searched exhaustively. But there is no predetermined objective of this search. There is, as Popper put it, no criterion of truth, nor of probable truth, especially in regard to explanatory knowledge. Objectives are ideas like any others—created as part of the search and continually modified and improved. So inventing ways of disabling the program’s access to most of the space of ideas won’t help—whether that disability is inflicted with the thumbscrew and stake or a mental straitjacket. To an AGI, the whole space of ideas must be open. It should not be knowable in advance what ideas the program can never contemplate. 
And the ideas that the program does contemplate must be chosen by the program itself, using methods, criteria, and objectives that are also the program’s own. Its choices, like an AI’s, will be hard to predict without running it (we lose no generality by assuming that the program is deterministic; an AGI using a random generator would remain an AGI if the generator were replaced by a pseudo-random one), but it will have the additional property that there is no way of proving, from its initial state, what it won’t eventually think, short of running it.

The evolution of our ancestors is the only known case of thought starting up anywhere in the universe. As I have described, something went horribly wrong, and there was no immediate explosion of innovation: Creativity was diverted into something else. Yet not into transforming the planet into paper clips (pace Nick Bostrom). Rather, as we should also expect if an AGI project gets that far and fails, perverted creativity was unable to solve unexpected problems. This caused stasis and worse, thus tragically delaying the transformation of anything into anything. But the Enlightenment has happened since then. We know better now.
Tom Griffiths’ approach to the AI issue of “value alignment”—the study of how, exactly, we can keep the latest of our serial models of AI from turning the planet into paper clips—is human-centered, i.e., that of a cognitive scientist, which is what he is. The key to machine learning, he believes, is, necessarily, human learning, which he studies at Princeton using mathematical and computational tools. Tom once remarked to me that “one of the mysteries of human intelligence is that we’re able to do so much with so little.” Like machines, human beings use algorithms to make decisions or solve problems; the remarkable difference lies in the human brain’s overall level of success despite the comparative limits on computational resources. The efficacy of human algorithms springs from what AI researchers refer to as “bounded optimality.” As psychologist Daniel Kahneman has notably pointed out, human beings are rational only up to a point. If you were perfectly rational, you would risk dropping dead before making an important decision—whom to hire, whom to marry, and so on—depending on the number of options available for your review.

“With all of the successes of AI over the last few years, we’ve got good models of things like images and text, but what we’re missing are good models of people,” Tom says. “Human beings are still the best example we have of thinking machines. By identifying the quantity and the nature of the preconceptions that inform human cognition we can lay the groundwork for bringing computers even closer to human performance.”
THE ARTIFICIAL USE OF HUMAN BEINGS

Tom Griffiths

Tom Griffiths is Henry R. Luce Professor of Information, Technology, Consciousness, and Culture at Princeton University. He is co-author (with Brian Christian) of Algorithms to Live By.

When you ask people to imagine a world that has successfully, beneficially incorporated advances in artificial intelligence, everybody probably comes up with a slightly different picture. Our idiosyncratic visions of the future might differ in the presence or absence of spaceships, flying cars, or humanoid robots. But one thing doesn’t vary: the presence of human beings.

That’s certainly what Norbert Wiener imagined when he wrote about the potential of machines to improve human society by interacting with humans and helping to mediate their interactions with one another. Getting to that point doesn’t just require coming up with ways to make machines smarter. It also requires a better understanding of how human minds work.

Recent advances in artificial intelligence and machine learning have resulted in systems that can meet or exceed human abilities in playing games, classifying images, or processing text. But if you want to know why the driver in front of you cut you off, why people vote against their interests, or what birthday present you should get for your partner, you’re still better off asking a human than a machine. Solving those problems requires building models of human minds that can be implemented inside a computer—something that’s essential not just to better integrate machines into human societies but to make sure that human societies can continue to exist.

Consider the fantasy of having an automated intelligent assistant that can take on such basic tasks as planning meals and ordering groceries. To succeed in these tasks, it needs to be able to make inferences about what you want, based on the way you behave. Although this seems simple, making inferences about the preferences of human beings can be a tricky matter.
For example, having observed that the part of the meal you most enjoy is dessert, your assistant might start to plan meals consisting entirely of desserts. Or perhaps it has heard your complaints about never having enough free time and observed that looking after your dog takes up a considerable amount of that free time. Following the dessert debacle, it has also understood that you prefer meals that incorporate protein, so it might begin to research recipes that call for dog meat. It’s not a long journey from examples like this to situations that begin to sound like problems for the future of humanity (all of whom are good protein sources).

Making inferences about what humans want is a prerequisite for solving the AI problem of value alignment—aligning the values of an automated intelligent system with those of a human being. Value alignment is important if we want to ensure that those automated intelligent systems have our best interests at heart. If they can’t infer what we value, there’s no way for them to act in support of those values—and they may well act in ways that contravene them.

Value alignment is the subject of a small but growing literature in artificial-intelligence research. One of the tools used for solving this problem is inverse-reinforcement learning. Reinforcement learning is a standard method for training intelligent machines. By associating particular outcomes with rewards, a machine-
learning system can be trained to follow strategies that produce those outcomes. Wiener hinted at this idea in the 1950s, but the intervening decades have developed it into a fine art. Modern machine-learning systems can find extremely effective strategies for playing computer games—from simple arcade games to complex real-time strategy games—by applying reinforcement-learning algorithms. Inverse reinforcement learning turns this approach around: By observing the actions of an intelligent agent that has already learned effective strategies, we can infer the rewards that led to the development of those strategies.

In its simplest form, inverse reinforcement learning is something people do all the time. It’s so common that we even do it unconsciously. When you see a co-worker go to a vending machine filled with potato chips and candy and buy a packet of unsalted nuts, you infer that your co-worker (1) was hungry and (2) prefers healthy food. When an acquaintance clearly sees you and then tries to avoid encountering you, you infer that there’s some reason they don’t want to talk to you. When an adult spends a lot of time and money in learning to play the cello, you infer that they must really like classical music—whereas inferring the motives of a teenage boy learning to play an electric guitar might be more of a challenge.

Inverse reinforcement learning is a statistical problem: We have some data—the behavior of an intelligent agent—and we want to evaluate various hypotheses about the rewards underlying that behavior. When faced with this question, a statistician thinks about the generative model behind the data: What data would we expect to be generated if the intelligent agent was motivated by a particular set of rewards? Equipped with the generative model, the statistician can then work backward: What rewards would likely have caused the agent to behave in that particular way?
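The working-backward step can be sketched in a few lines of Python, reusing the vending-machine scenario above. This is a toy illustration, not code from the chapter: the two candidate reward functions, their numerical values, and the softmax (noisily rational) choice model are all invented for the example.

```python
import math

def softmax_choice_prob(rewards, chosen, beta=1.0):
    """P(chosen item | rewards) under a noisily rational (Boltzmann) agent."""
    z = sum(math.exp(beta * r) for r in rewards.values())
    return math.exp(beta * rewards[chosen]) / z

# Two hypotheses about the co-worker's hidden reward function
# (the reward numbers are made up for the example).
hypotheses = {
    "prefers_healthy": {"nuts": 2.0, "chips": 0.0, "candy": 0.0},
    "prefers_sweet":   {"nuts": 0.0, "chips": 0.5, "candy": 2.0},
}
prior = {h: 0.5 for h in hypotheses}

observed_choice = "nuts"  # the behavior we saw at the vending machine

# Work backward with Bayes' rule: P(hypothesis | behavior).
unnormalized = {h: prior[h] * softmax_choice_prob(r, observed_choice)
                for h, r in hypotheses.items()}
total = sum(unnormalized.values())
posterior = {h: p / total for h, p in unnormalized.items()}
# The healthy-food hypothesis ends up far more probable than the sweet one.
```

The generative model here is the softmax choice rule; running it forward says what choices a given reward function would produce, and Bayes' rule inverts it to score reward hypotheses against the observed choice.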
If you’re trying to make inferences about the rewards that motivate human behavior, the generative model is really a theory of how people behave—how human minds work. Inferences about the hidden causes behind the behavior of other people reflect a sophisticated model of human nature that we all carry around in our heads. When that model is accurate, we make good inferences. When it’s not, we make mistakes. For example, a student might infer that his professor is indifferent to him if the professor doesn’t immediately respond to his email—a consequence of the student’s failure to realize just how many emails that professor receives.

Automated intelligent systems that will make good inferences about what people want must have good generative models for human behavior: that is, good models of human cognition expressed in terms that can be implemented on a computer.

Historically, the search for computational models of human cognition is intimately intertwined with the history of artificial intelligence itself. Only a few years after Norbert Wiener published The Human Use of Human Beings, Logic Theorist, the first computational model of human cognition and also the first artificial-intelligence system, was developed by Herbert Simon, of Carnegie Tech, and Allen Newell, of the RAND Corporation. Logic Theorist automatically produced mathematical proofs by emulating the strategies used by human mathematicians.

The challenge in developing computational models of human cognition is making models that are both accurate and generalizable. An accurate model, of course, predicts human behavior with a minimum of errors. A generalizable model can make predictions across a wide range of circumstances, including circumstances unanticipated by its
creators—for instance, a good model of the Earth’s climate should be able to predict the consequences of a rising global temperature even if this wasn’t something considered by the scientists who designed it. However, when it comes to understanding the human mind, these two goals—accuracy and generalizability—have long been at odds with each other.

At the far extreme of generalizability are rational theories of cognition. These theories describe human behavior as a rational response to a given situation. A rational actor strives to maximize the expected reward produced by a sequence of actions—an idea widely used in economics precisely because it produces such generalizable predictions about human behavior. For the same reason, rationality is the standard assumption in inverse-reinforcement-learning models that try to make inferences from human behavior—perhaps with the concession that humans are not perfectly rational agents and sometimes randomly choose to act in ways unaligned with or even opposed to their best interests.

The problem with rationality as a basis for modeling human cognition is that it is not accurate. In the domain of decision making, an extensive literature—spearheaded by the work of cognitive psychologists Daniel Kahneman and Amos Tversky—has documented the ways in which people deviate from the prescriptions of rational models. Kahneman and Tversky proposed that in many situations people instead follow simple heuristics that allow them to reach good solutions at low cognitive cost but sometimes result in errors. To take one of their examples, if you ask somebody to evaluate the probability of an event, they might rely on how easy it is to generate an example of such an event from memory, consider whether they can come up with a causal story for that event’s occurring, or assess how similar the event is to their expectations. Each heuristic is a reasonable strategy for avoiding complex probabilistic computations, but also results in errors.
For instance, relying on the ease of generating an event from memory as a guide to its probability leads us to overestimate the chances of extreme (hence extremely memorable) events such as terrorist attacks. Heuristics provide a more accurate model of human cognition but one that is not easily generalizable. How do we know which heuristic people might use in a particular situation? Are there other heuristics they use that we just haven’t discovered yet? Knowing exactly how people will behave in a new situation is a challenge: Is this situation one in which they would generate examples from memory, come up with causal stories, or rely on similarity?

Ultimately, what we need is a way to describe how human minds work that has the generalizability of rationality and the accuracy of heuristics. One way to achieve this goal is to start with rationality and consider how to take it in a more realistic direction. A problem with using rationality as a basis for describing the behavior of any real-world agent is that, in many situations, calculating the rational action requires the agent to possess a huge amount of computational resources. It might be worth expending those resources if you’re making a highly consequential decision and have a lot of time to evaluate your options, but most human decisions are made quickly and for relatively low stakes. In any situation where the time you spend making a decision is costly—at the very least because it’s time you could spend doing something else—the classic notion of rationality is no longer a good prescription for how one should behave. To develop a more realistic model of rational behavior, we need to take into
account the cost of computation. Real agents need to modulate the amount of time they spend thinking by the effect the extra thought has on the results of a decision. If you’re trying to choose a toothbrush, you probably don’t need to consider all four thousand listings for manual toothbrushes on Amazon.com before making a purchase: You trade off the time you spend looking with the difference it makes in the quality of the outcome. This trade-off can be formalized, resulting in a model of rational behavior that artificial-intelligence researchers call “bounded optimality.” The bounded-optimal agent doesn’t focus on always choosing exactly the right action to take but rather on finding the right algorithm to follow in order to find the perfect balance between making mistakes and thinking too much.

Bounded optimality bridges the gap between rationality and heuristics. By describing behavior as the result of a rational choice about how much to think, it provides a generalizable theory—that is, one that can be applied in new situations. Sometimes the simple strategies that have been identified as heuristics that people follow turn out to be bounded-optimal solutions. So, rather than condemning the heuristics that people use as irrational, we can think of them as a rational response to constraints on computation.

Developing bounded optimality as a theory of human behavior is an ongoing project that my research group and others are actively pursuing. If these efforts succeed, they will provide us with the most important ingredient we need for making artificial-intelligence systems smarter when they try to interpret people’s actions, by enabling a generative model for human behavior. Taking into account the computational constraints that factor into human cognition will be particularly important as we begin to develop automated systems that aren’t subject to the same constraints. Imagine a superintelligent AI system trying to figure out what people care about.
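The toothbrush trade-off can be made concrete with a toy model. Everything numerical here is an assumption for illustration: option quality is drawn uniformly from [0, 1], so the expected quality of the best of k options is k/(k+1), and examining each option costs a fixed slice of utility.

```python
def net_utility(k, cost_per_option=0.01):
    """Expected quality of the best of k examined options (uniform on [0, 1])
    minus the time cost of examining them."""
    expected_best = k / (k + 1)        # E[max of k uniform draws]
    return expected_best - cost_per_option * k

# A bounded-optimal shopper picks how many of the 4,000 listings to examine.
best_k = max(range(1, 4001), key=net_utility)
# best_k comes out to 9: looking at a tenth toothbrush would improve the
# expected best by less than it costs in time.
```

The point of the sketch is that the rational object of choice is the *algorithm* ("examine k options, take the best") rather than the action itself, which is exactly the move bounded optimality makes.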
Curing cancer or confirming the Riemann hypothesis, for instance, won’t seem, to such an AI, like things that are all that important to us: If these solutions are obvious to the superintelligent system, it might wonder why we haven’t found them ourselves, and conclude that those problems don’t mean much to us. If we cared and the problems were so simple, we would have solved them already. A reasonable inference would be that we do science and math purely because we enjoy doing science and math, not because we care about the outcomes.

Anybody who has young children can appreciate the problem of trying to interpret the behavior of an agent that is subject to computational constraints different from one’s own. Parents of toddlers can spend hours trying to disentangle the true motivations behind seemingly inexplicable behavior. As a father and a cognitive scientist, I found it was easier to understand the sudden rages of my two-year-old when I recognized that she was at an age where she could appreciate that different people have different desires but not that other people might not know what her own desires were. It’s easy to understand, then, why she would get annoyed when people didn’t do what she (apparently transparently) wanted. Making sense of toddlers requires building a cognitive model of the mind of a toddler. Superintelligent AI systems face the same challenge when trying to make sense of human behavior.

Superintelligent AI may still be a long way off. In the short term, devising better models of people can prove extremely valuable to any company that makes money by analyzing human behavior—which at this point is pretty much every company that does business on the Web. Over the last few years, significant new commercial technologies
for interpreting images and text have resulted from developing good models for vision and language. Developing good models of people is the next frontier.

Of course, understanding how human minds work isn’t just a way to make computers better at interacting with people. The trade-off between making mistakes and thinking too much that characterizes human cognition is a trade-off faced by any real-world intelligent agent. Human beings are an amazing example of systems that act intelligently despite significant computational constraints. We’re quite good at developing strategies that allow us to solve problems pretty well without working too hard. Understanding how we do this will be a step toward making computers work smarter, not harder.
Romanian-born Anca Dragan’s research focuses on algorithms that will enable robots to work with, around, and in support of people. She runs the InterACT Laboratory at Berkeley, where her students work across different applications, from assistive robots to manufacturing to autonomous cars, and draw from optimal control, planning, estimation, learning, and cognitive science. Barely into her thirties herself, she has co-authored a number of papers with her veteran Berkeley colleague and mentor Stuart Russell which address various aspects of machine learning and the knotty problems of value alignment. She shares Stuart’s preoccupation with AI safety: “An immediate risk is agents producing unwanted, surprising behavior,” she told an interviewer from the Future of Life Institute. “Even if we plan to use AI for good, things can go wrong, precisely because we are bad at specifying objectives and constraints for AI agents. Their solutions are often not what we had in mind.” Her principal goal is therefore to help robots and programmers alike to overcome the many conflicts that arise because of a lack of transparency about each other’s intentions. Robots, she says, need to ask us questions. They should wonder about their assignments, and they should pester their human programmers until everybody is on the same page—so as to avoid what she has euphemistically called “unexpected side effects.”
PUTTING THE HUMAN INTO THE AI EQUATION

Anca Dragan

Anca Dragan is an assistant professor in the Department of Electrical Engineering and Computer Sciences at UC Berkeley. She co-founded and serves on the steering committee for the Berkeley AI Research (BAIR) Lab and is a co-principal investigator in Berkeley’s Center for Human-Compatible AI.

At the core of artificial intelligence is our mathematical definition of what an AI agent (a robot) is. When we define a robot, we define states, actions, and rewards. Think of a delivery robot, for instance. States are locations in the world, and actions are motions that the robot makes to get from one position to a nearby one. To enable the robot to decide on which actions to take, we define a reward function—a mapping from states and actions to scores indicating how good that action was in that state—and have the robot choose actions that accumulate the most “reward.” The robot gets a high reward when it reaches its destination, and it incurs a small cost every time it moves; this reward function incentivizes the robot to get to the destination as quickly as possible. Similarly, an autonomous car might get a reward for making progress on its route and incur a cost for getting too close to other cars.

Given these definitions, a robot’s job is to figure out what actions it should take in order to get the highest cumulative reward. We’ve been working hard in AI on enabling robots to do just that. Implicitly, we’ve assumed that if we’re successful—if robots can take any problem definition and turn it into a policy for how to act—we will get robots that are useful to people and to society. We haven’t been too wrong so far. If you want an AI that classifies cells as either cancerous or benign, or a robot that vacuums the living room rug while you’re at work, we’ve got you covered. Some real-world problems can indeed be defined in isolation, with clear-cut states, actions, and rewards.
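The states-actions-rewards definition of the delivery robot can be written out as a minimal sketch. The corridor world, the numeric rewards, and the use of value iteration to find the highest-cumulative-reward policy are illustrative assumptions, not code from the chapter.

```python
# Toy delivery robot on a one-dimensional corridor: states 0..4, goal at 4.
# Reward function: +10 for reaching the goal, -1 for every move (step cost),
# which incentivizes reaching the destination as quickly as possible.
GOAL, STEP_COST, GOAL_REWARD = 4, -1.0, 10.0
states = range(5)
actions = {"left": -1, "right": +1}

def step(s, a):
    """Apply action a in state s; return (next_state, reward)."""
    s2 = min(max(s + actions[a], 0), GOAL)
    reward = GOAL_REWARD if s2 == GOAL else STEP_COST
    return s2, reward

# Value iteration: the highest cumulative reward obtainable from each state.
V = {s: 0.0 for s in states}
for _ in range(50):
    V = {s: (0.0 if s == GOAL else
             max(r + V[s2] for s2, r in (step(s, a) for a in actions)))
         for s in states}

# The policy picks, in each state, the action leading to the best value.
policy = {s: max(actions, key=lambda a: step(s, a)[1] + V[step(s, a)[0]])
          for s in states if s != GOAL}
# Every state's best action is "right": head straight for the destination.
```

This is the "turn a problem definition into a policy" step in its simplest form; everything the robot does follows from the three choices of states, actions, and reward function.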
But with increasing AI capability, the problems we want to tackle don’t fit neatly into this framework. We can no longer cut off a tiny piece of the world, put it in a box, and give it to a robot. Helping people is starting to mean working in the real world, where you have to actually interact with people and reason about them. “People” will have to formally enter the AI problem definition somewhere.

Autonomous cars are already being developed. They will need to share the road with human-driven vehicles and pedestrians and learn to make the trade-off between getting us home as fast as possible and being considerate of other drivers. Personal assistants will need to figure out when and how much help we really want and what types of tasks we prefer to do on our own versus what we can relinquish control over. A DSS (Decision Support System) or a medical diagnostic system will need to explain its recommendations to us so we can understand and verify them. Automated tutors will need to determine what examples are informative or illustrative—not to their fellow machines but to us humans.

Looking further into the future, if we want highly capable AIs to be compatible with people, we can’t create them in isolation from people and then try to make them compatible afterward; rather, we’ll have to define “human-compatible” AI from the get-go. People can’t be an afterthought.
When it comes to real robots helping real people, the standard definition of AI fails us, for two fundamental reasons: First, optimizing the robot’s reward function in isolation is different from optimizing it when the robot acts around people, because people take actions too. We make decisions in service of our own interests, and these decisions dictate what actions we execute. Moreover, we reason about the robot—that is, we respond to what we think it’s doing or will do and what we think its capabilities are. Whatever actions the robot decides on need to mesh well with ours. This is the coordination problem.

Second, it is ultimately a human who determines what the robot’s reward function should be in the first place. And reward functions are meant to incentivize robot behavior that matches what the end-user wants, what the designer wants, or what society as a whole wants. I believe that capable robots that go beyond very narrowly defined tasks will need to understand this to achieve compatibility with humans. This is the value-alignment problem.

The Coordination Problem: People are more than objects in the environment.

When we design robots for a particular task, it’s tempting to abstract people away. A robotic personal assistant, for example, needs to know how to move to pick up objects, so we define that problem in isolation from the people for whom the robot is picking these objects up. Still, as the robot moves around, we don’t want it bumping into anything, and that includes people, so we might include the physical location of the person in the definition of the robot’s state. Same for cars: We don’t want them colliding with other cars, so we enable them to track the positions of those other cars and assume that they’ll be moving consistently in the same direction in the future. A human being, in this sense, is no different to a robot from a ball rolling on a flat surface.
The ball will behave in the next few seconds the same way it behaved in the past few; it keeps rolling in the same direction at roughly the same speed. This is of course nothing like real human behavior, but such simplification enables many robots to succeed in their tasks and, for the most part, stay out of people’s way. A robot in your house, for example, might see you coming down the hall, move aside to let you pass, and resume its task once you’ve gone by.

As robots have become more capable, though, treating people as consistently moving obstacles is starting to fall short. A human driver switching lanes won’t continue in the same direction but will move straight ahead once they’ve made the lane change. When you reach for something, you often reach around other objects and stop when you get to the one you want. When you walk down a hallway, you have a destination in mind: You might take a right into the bedroom or a left into the living room. Relying on the assumption that we’re no different from a rolling ball leads to inefficiency when the robot stays out of the way if it doesn’t need to, and it can imperil the robot when the person’s behavior changes.

Even just to stay out of the way, robots have to be somewhat accurate at anticipating human actions. And, unlike the rolling ball, what people will do depends on what they decide to do. So to anticipate human actions, robots need to start understanding human decision making. And that doesn’t mean assuming that human behavior is perfectly optimal; that might be enough for a chess- or Go-playing robot, but in the real world, people’s decisions are less predictable than the optimal move in a board game.
This need to understand human actions and decisions applies to physical and nonphysical robots alike. If either sort bases its decision about how to act on the assumption that a human will do one thing but the human does something else, the resulting mismatch could be catastrophic. For cars, it can mean collisions. For an AI with, say, a financial or economic role, the mismatch between what it expects us to do and what we actually do could have even worse consequences.

One alternative is for the robot not to predict human actions but instead just protect against the worst-case human action. Often when robots do that, though, they stop being all that useful. With cars, this results in being stuck, because it makes every move too risky.

All this puts us, the AI community, into a bind. It suggests that robots will need accurate (or at least reasonable) predictive models of whatever people might decide to do. Our state definition can’t just include the physical position of humans in the world. Instead, we’ll also need to estimate something internal to people. We’ll need to design robots that account for this human internal state, and that’s a tall order. Luckily, people tend to give robots hints as to what their internal state is: Their ongoing actions give the robot observations (in the Bayesian inference sense) about their intentions. If we start walking toward the right side of the hallway, we’re probably going to enter the next room on the right.

What makes the problem more complicated is the fact that people don’t make decisions in isolation. It would be one thing if robots could predict the actions a person intends to take and simply figure out what to do in response. But unfortunately this can lead to ultra-defensive robots that confuse the heck out of people. (Think of human drivers stuck at four-way stops, for instance.) What the intent-prediction approach misses is that the moment the robot acts, that influences what actions the human starts taking.
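The hallway example can be sketched as a toy Bayesian inference over a person's hidden goal. The goal positions, the uniform prior, and the exponential likelihood model are all assumptions made up for illustration; the point is only that an observed action shifts the robot's belief about the person's intent.

```python
import math

# Hidden goals: the room on the left or the room on the right of the hallway,
# encoded as lateral positions (made-up numbers for the example).
goals = {"left_room": -1.0, "right_room": 1.0}
prior = {g: 0.5 for g in goals}   # the robot starts out unsure

def likelihood(position, goal_pos, beta=2.0):
    """How probable an observed lateral position is if the person is heading
    for this goal: positions closer to the goal are more probable."""
    return math.exp(-beta * abs(position - goal_pos))

observed_position = 0.6   # the person is walking toward the right side

# Bayesian update: the observed action is evidence about the hidden intent.
unnormalized = {g: prior[g] * likelihood(observed_position, pos)
                for g, pos in goals.items()}
total = sum(unnormalized.values())
posterior = {g: p / total for g, p in unnormalized.items()}
# Belief now concentrates on the room on the right.
```

A real system would update this belief continuously as the trajectory unfolds, but even this one-shot version shows how "something internal to people" can enter the robot's state estimate.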
There is a mutual influence between robots and people, one that robots will need to learn to navigate. It is not always just about the robot planning around people; people plan around the robot, too. It is important for robots to account for this when deciding which actions to take, be it on the road, in the kitchen, or even in virtual spaces, where actions might be making a purchase or adopting a new strategy. Doing so should endow robots with coordination strategies, enabling them to take part in the negotiations people seamlessly carry out day to day—from who goes first at an intersection or through a narrow door, to what role we each take when we collaborate on preparing breakfast, to coming to consensus on what next step to take on a project.

Finally, just as robots need to anticipate what people will do next, people need to do the same with robots. This is why transparency is important. Not only will robots need good mental models of people, but people will need good mental models of robots. The model that a person has of the robot has to go into our state definition as well, and the robot has to be aware of how its actions are changing that model. Much like the robot treating human actions as clues to human internal states, people will change their beliefs about the robot as they observe its actions. Unfortunately, the giving of clues doesn’t come as naturally to robots as it does to humans; we’ve had a lot of practice communicating implicitly with people. But enabling robots to account for the change that their actions are causing to the person’s mental model of the robot can lead to more carefully chosen actions that do give the right clues—that clearly communicate to people about the robot’s intentions, its reward function, its limitations. For instance, a robot