might alter its motion when carrying something heavy, to emphasize the difficulty it has in maneuvering heavy objects. The more that people know about the robot, the easier it is to coordinate with it. Achieving action compatibility will require robots to anticipate human actions, account for how those actions will influence their own, and enable people to anticipate robot actions. Research has made a degree of progress in meeting these challenges, but we still have a long way to go.

The Value Alignment Problem: People hold the key to the robot’s reward function. Progress on enabling robots to optimize reward puts more burden on us, the designers, to give them the right reward to optimize in the first place. The original thought was that for any task we wanted the robot to do, we could write down a reward function that incentivizes the right behavior. Unfortunately, what often happens is that we specify some reward function and the behavior that emerges out of optimizing it isn’t what we want. Intuitive reward functions, when combined with unusual instances of a task, can lead to unintuitive behavior. You reward an agent in a racing game with a score in the game, and in some cases it finds a loophole that it exploits to gain infinitely many points without actually winning the race. Stuart Russell and Peter Norvig give a beautiful example in their book Artificial Intelligence: A Modern Approach: rewarding a vacuuming robot for how much dust it sucks in results in the robot deciding to dump out dust so that it can suck it in again and get more reward. In general, humans have had a notoriously difficult time specifying exactly what they want, as exemplified by all those genie legends. An AI paradigm in which robots get some externally specified reward fails when that reward is not perfectly well thought out. It may incentivize the robot to behave in the wrong way and even resist our attempts to correct its behavior, as that would lead to a lower specified reward.

A seemingly better paradigm might be for robots to optimize for what we internally want, even if we have trouble explicating it. They would use what we say and do as evidence about what we want, rather than interpreting it literally and taking it as a given. When we write down a reward function, the robot should understand that we might be wrong: that we might not have considered all facets of the task; that there’s no guarantee that said reward function will always lead to the behavior we want. The robot should integrate what we wrote down into its understanding of what we want, but it should also have a back-and-forth with us to elicit clarifying information. It should seek our guidance, because that’s the only way to optimize the true desired reward function.

Even if we give robots the ability to learn what we want, an important question remains that AI alone won’t be able to answer. We can make robots try to align with a person’s internal values, but there’s more than one person involved here. The robot has an end-user (or perhaps a few, like a personal robot caring for a family, a car driving a few passengers to different destinations, or an office assistant for an entire team); it has a designer (or perhaps a few); and it interacts with society—the autonomous car shares the road with pedestrians, human-driven vehicles, and other autonomous cars. How to combine these people’s values when they might be in conflict is an important problem we need to solve.
AI research can give us the tools to combine values in any way we decide but can’t make the necessary decision for us.
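The idea of treating what we say as evidence about what we want, described above, can be made concrete in a toy form. The sketch below is illustrative only, not any particular published algorithm: the robot keeps a posterior over two invented candidate reward functions and updates it from one observed human choice, using a standard Boltzmann-rational choice model. All names and numbers are made up.

```python
import math

# Toy sketch: treat human behavior as evidence about the true reward.
# Both candidate reward functions and all numbers are invented for
# illustration; this is not any specific published algorithm.
candidate_rewards = {
    "values_speed":  {"speedy": 1.0, "careful": 0.2},
    "values_safety": {"speedy": 0.1, "careful": 1.0},
}
posterior = {name: 0.5 for name in candidate_rewards}  # uniform prior

def observe_choice(choice: str, beta: float = 2.0) -> None:
    """Bayesian update under a Boltzmann-rational choice model:
    P(choice | reward) is proportional to exp(beta * reward[choice])."""
    for name, reward in candidate_rewards.items():
        z = sum(math.exp(beta * v) for v in reward.values())
        posterior[name] *= math.exp(beta * reward[choice]) / z
    total = sum(posterior.values())
    for name in posterior:
        posterior[name] /= total

observe_choice("careful")  # the human picked the careful option
print(posterior)           # belief shifts toward the safety-valuing hypothesis
```

The point of the sketch is only the shape of the reasoning: an observed action (or a specified reward) shifts the robot’s beliefs about what we want instead of being taken as ground truth.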
In short, we need to enable robots to reason about us—to see us as something more than obstacles or perfect game players. We need them to take our human nature into account, so that they are well coordinated and well aligned with us. If we succeed, we will indeed have tools that substantially increase our quality of life.
Chris Anderson’s company, 3DR, helped start the modern drone industry and now focuses on drone data software. He got his start building an open-source aerial robotics community called DIY Drones, and undertook some ill-advised early experiments, such as buzzing Lawrence Berkeley Laboratory with one of his self-flying spies. It may well have been a case of antic gene-expression, since he’s descended from a founder of the American Anarchist movement. Chris ran Wired magazine, a go-to publication for techno-utopians and -dystopians alike, from 2001 to 2012; during his tenure it won five National Magazine Awards. Chris dislikes the term “roboticist” (“like any properly humbled roboticist, I don’t call myself one”). He began as a physicist. “I turned out to be a bad physicist,” he told me recently. “I struggled on, went to Los Alamos, and thought, ‘Well maybe I’m not going to be a Nobel Prize winner, but I can still be a scientist.’ All of us who were in Physics and had these romantic heroes—the Feynmans, the Manhattan Project—realized that our career trajectory would at best be working on one project at CERN for fifteen years. That project would either be a failure, in which case there would be no paper, or it would be a success, in which case you’d be author #300 on the paper and become an assistant professor at Iowa State.

“Most of my classmates went to Wall Street to become quants, and to them we owe the subprime mortgage. Others went on to start the Internet. First, we built the Internet by connecting physics labs; second, we built the Web; third, we were the first to do Big Data. We had supercomputers—Crays—which were half the power of your phone now, but they were the supercomputers of the time. Meanwhile, we were reading this magazine called Wired, which came out in 1993, and we realized that this tool we scientists use could have applications for everybody. The Internet wasn’t just about scientific data, it was a mind-blowing cultural revolution. So when Condé Nast asked me to take over the magazine, I was like, ‘Absolutely!’ This magazine changed my life.”

He had five children by that time—video-game players—who got him into the “flying robots.” He quit his day job at Wired. The rest is Silicon Valley history.
GRADIENT DESCENT
Chris Anderson

Chris Anderson is an entrepreneur; former editor-in-chief of Wired; co-founder and CEO of 3DR; and author of The Long Tail, Free, and Makers.

Life

The mosquito first detects my scent from thirty feet away. It triggers its pursuit function, which consists of the simplest possible rules. First, move in a random direction. If the scent increases, continue moving in that direction. If the scent decreases, move in the opposite direction. If the scent is lost, move sideways until a scent is picked up again. Repeat until contact with the target is achieved.

The plume of my scent is densest next to me and disperses as it spreads, an invisible fog of particles exuded from my skin that moves like smoke with the wind. The closer to my skin, the higher the particle density; the farther away, the lower. This decrease is called a gradient, which describes any gradual transition from one level to another—as opposed to a “step function,” which describes a discrete change. Once the mosquito follows this gradient to its source using its simple algorithm, it lands on my skin, which it senses with the heat detectors in its feet, which are attuned to another gradient—temperature. It then pushes its needle-shaped proboscis through the surface, where a third set of sensors in the tip detects yet another gradient, that of blood density. This flexible needle wriggles around under my skin until the scent of blood steers it to a capillary, which it punctures. Then my blood begins to flow into the mosquito. Mission accomplished. Ouch.

What seems like the powerful radar of insects in the dark, with blood-seeking intelligence inexplicable for such tiny brains, is actually just a sensitive nose with almost no intelligence at all. Mosquitoes are closer to plants that follow the sun than to guided missiles. Yet by applying this simple “follow your nose” rule quite literally, they can travel through a house to find you, slip through cracks in a screen door, even zero in on the tiny strip of skin you left exposed between hat and shirt collar. It’s just a random walk, combined with flexible wings and legs that let the insect bounce off obstacles, and an instinct to descend a chemical gradient. But “gradient descent” is much more than bug navigation. Look around you and you’ll find it everywhere, from the most basic physical rules of the universe to the most advanced artificial intelligence.

The Universe

We live in a world of countless gradients, from light and heat to gravity and chemical trails (chemtrails!). Water flows along a gravity gradient downhill, and your body lives on chemical solutions flowing across cell membranes from high concentration to low. Every action in the universe is driven by some gradient, from the movement of the planets around gravity gradients to the joining of atoms along electric-charge gradients to form molecules. Our own urges, such as hunger and sleepiness, are driven by electro-chemical gradients in our bodies. And our brain’s functions, the electrical signals moving along ion channels in the synapses between our neurons, are simply atoms and electrons flowing “downhill” along yet more electrical and chemical gradients.
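The mosquito’s pursuit rules from the opening of this essay are simple enough to fit in a few lines of code. Here is a minimal sketch: a one-dimensional toy world with an invented scent profile, not a model of real mosquito biology.

```python
import random

def next_heading(scent_now: float, scent_before: float, heading: int) -> int:
    """One application of the pursuit rules: keep going while the scent
    rises, reverse when it falls, wander randomly when the trail is lost."""
    if scent_now == 0.0:
        return random.choice([-1, 1])   # scent lost: random walk
    if scent_now >= scent_before:
        return heading                  # gradient rising: keep going
    return -heading                     # gradient falling: turn around

def scent(x: float) -> float:
    """Toy plume: scent falls off linearly with distance from the target at x = 0."""
    return max(0.0, 1.0 - abs(x) / 30.0)

x, heading, previous = 25.0, random.choice([-1, 1]), 0.0
for step in range(500):
    current = scent(x)
    heading = next_heading(current, previous, heading)
    previous = current
    x += heading                        # move one unit in the chosen direction
    if abs(x) < 1.0:                    # contact with the target
        print(f"landed after {step + 1} steps")
        break
else:
    print(f"still searching at x = {x:.1f}")
```

Descending the gradient here means nothing more than comparing two successive sniffs.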
Forget clockwork analogies; our brains are closer to a system of canals and locks, with signals traveling like water from one state to another. As I sit here typing, I’m actually seeking equilibrium states in an n-dimensional topology of gradients. Take just one: heat. My body temperature is higher than the air temperature, so I radiate heat, which must be replenished in my core. Even the bacteria in my digestive tract use sensors to measure sugar concentrations in the liquid around them and whip their tail-like flagella to swim “upstream” where the sugar supply is richest. The natural state of all systems is to flow to lower energy states, a process that is broadly described by entropy (the tendency of things to go from ordered to disordered states; all things will fall apart eventually, including the universe itself). But how do you explain more complex behavior, such as our ability to make decisions? The answer is just more gradient descent.

Our Brains

As miraculous and inscrutable as our human intelligence is, science is coming around to the view that our brains operate the same way as any other complex system with layers and feedback loops, all pursuing what we mathematically call “optimization functions” but you could just as well call “flowing downhill” in some sense. The essence of intelligence is learning, and we do that by correlating inputs with positive or negative scores (rewards or punishments). So, for a baby, “this sound” (your mother’s voice) is associated with other learned connections to your mother, such as food or comfort. Likewise, “this muscle motion brings my thumb closer to my mouth.” Over time and trial and error, the brain’s neural network reinforces those connections. Meanwhile “this muscle motion does not bring my thumb close to my mouth” is a negative correlation, and the brain will weaken those connections.

However, this is too simplistic. The limits of gradient descent constitute the so-called local-minima problem (or local-maxima problem, if you’re doing a gradient ascent). If you are walking in a mountainous region and want to get home, always walking downhill will most likely get you to the next valley but not necessarily over the other mountains that lie around it and between you and home. For that, you need either a mental model (i.e., a map) of the topology so you know where to ascend to get out of the valley, or you need to switch between gradient descent and random walks so you can bounce your way out of the region. Which is, in fact, exactly what the mosquito does in following my scent: It descends when it’s in my plume and random-walks when it has lost the trail or hit an obstacle.

AI

So that’s nature. What about computers? Traditional software doesn’t work that way—it follows deterministic trees of hard logic: “If this, do that.” But software that interacts with the physical world tends to work more like the physical world. That means dealing with noisy inputs (sensors or human behavior) and providing probabilistic, not deterministic, results. And that, in turn, means more gradient descent. AI software is the best example of this, especially the kinds of AI that use artificial neural-network models (including convolutional, or “deep,” neural networks of many layers). In these, a typical process consists of “training” them by showing them
lots of examples of something you want them to learn (pictures of cats labeled “cat,” for example), along with examples of other random data (pictures of other things). This is called “supervised learning,” because the neural network is being taught by example, including the use of “adversarial training” with data that is not correlated to the desired result. These neural networks, like their biological models, consist of layers of thousands of nodes (“neurons,” in the analogy), each of which is connected to all the nodes in the layers above and below by connections that initially have random strength. The top layer is presented with data, and the bottom layer is given the correct answer. Any series of connections that happened to land on the right answer is made stronger (“rewarded”), and those that were wrong are made weaker (“punished”). Repeat tens of thousands of times and eventually you have a fully trained network for that kind of data.

You can think of all the possible combinations of connections as like the surface of a planet, with hills and valleys. (Ignore for the moment that the surface is just 3D and the actual topology is many-dimensional.) The optimization that the network goes through as it learns is just a process of finding the deepest valley on the planet. This consists of the following steps:

1. Define a “cost function” that determines how well the network solved the problem.
2. Run the network once and see how it did at that cost function.
3. Change the values of the connections and do it again. The difference between those two results is the direction, or “slope,” in which the network moved between the two trials.
4. If the slope is pointed “downhill,” change the connections more in that direction. If it’s “uphill,” change them in the opposite direction.
5. Repeat until there is no improvement in any direction. That means that you’re in a minimum. Congrats!

But it’s probably a local minimum, or a little dip in the mountains, so you’re going to have to keep going if you want to do better. You can’t keep going downhill, and you don’t know where the absolute lowest point is, so you’re going to have to somehow find it. There are many ways to do that, but here are a few (sketched in code at the end of this essay):

1. Try lots of times with different random settings and share learning from each trial; essentially, you are shaking the system to see if it settles in a lower state. If one of the other trials found a lower valley, start with those settings.
2. Don’t just go downhill but stumble around a bit like a drunk, too (this is called “stochastic gradient descent”). If you do this long enough, you’ll eventually find rock bottom. There’s a metaphor for life in that.
3. Just look for “interesting” features, which are defined by diversity (edges or color changes, for example). Warning: This way can lead to madness—too much “interestingness” draws the network to optical illusions. So keep it sane, and emphasize the kinds of features that are likely to be real in nature, as opposed to artifacts or errors. This is called “regularization,” and there are lots of techniques for this, such as whether those kinds of features have been seen before (learned),
or are too “high frequency” (like static) rather than “low frequency” (more continuous, like actual real-world features).

Just because AI systems sometimes end up in local minima, don’t conclude that this makes them any less like life. Humans—indeed, probably all life-forms—are often stuck in local minima. Take our understanding of the game of Go, which was taught and learned and optimized by humans for thousands of years. It took AIs less than three years to find out that we’d been playing it wrong all along and that there were better, almost alien, solutions to the game which we’d never considered—mostly because our brains don’t have the processing power to consider so many moves ahead. Even in chess, which is ten times easier and was thought to be understood, brute-force machines could beat us at our own strategies. Chess, too, turned out, when explored by superior neural-network AI systems, to have weird but superior strategies we’d never considered, like sacrificing queens early to gain an obscure long-term advantage. It’s as if we had been playing 2D versions of games that actually existed in higher dimensions.

If any of this sounds familiar, it’s because physics has been wrestling with these sorts of topological problems for decades. The notion of space being many-dimensional, and math reducing to understanding the geometries and interactions of “membranes” beyond the reach of our senses, is where Grand Unified Theorists go to die. But unlike multidimensional theoretical physics, AI is something we can actually experiment with and measure. So that’s what we’re going to do. The next few decades will be an explosive exploration of ways to think that 7 million years of evolution never found. We’re going to rock ourselves out of local minima and find deeper minima, maybe even global minima. And when we’re done, we may even have taught machines to seem as smart as a mosquito, forever descending the cosmic gradients to an ultimate goal, whatever that may be.
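As promised above, here is a minimal sketch of escape strategies 1 and 2: random restarts plus a noisy, drunkard’s-walk descent, run on an invented one-dimensional cost surface with many dips. The function and all parameters are illustrative, not a real training setup.

```python
import math
import random

def cost(x: float) -> float:
    """Invented bumpy landscape: many local dips, deepest valley at x = 0."""
    return 0.1 * x * x + 10.0 * (1.0 - math.cos(x))

def slope(x: float) -> float:
    """Analytic derivative of the cost function above."""
    return 0.2 * x + 10.0 * math.sin(x)

def noisy_descent(x: float, steps: int = 5000, lr: float = 0.01,
                  stumble: float = 0.05) -> float:
    """Go downhill, but stumble a little at each step (strategy 2)."""
    for _ in range(steps):
        x -= lr * slope(x) + random.gauss(0.0, stumble)
    return x

# Strategy 1: shake the system with many random starting points,
# then keep whichever trial settled into the lowest valley.
best = min((noisy_descent(random.uniform(-15.0, 15.0)) for _ in range(25)), key=cost)
print(f"best x found: {best:.2f}, cost: {cost(best):.2f}")  # usually near x = 0
```

Shallow dips are escaped by the stumbling; deep ones by the restarts.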
David Kaiser is a physicist atypically interested in the intersection of his science with politics and culture, about which he has written widely. In the first meeting (in Washington, Connecticut) that preceded the crafting of this book, he commented on the change in how “information” is viewed since Wiener’s time: the military-industrial, Cold War era. Back then, Wiener compared information, metaphorically, to entropy, in that it could not be conserved—i.e., monopolized; thus, he argued, our atomic secrets and other such classified matters would not remain secrets for long. Today, whereas (as Wiener might have expected) information, fake or not, is leaking all over the other Washington, information in the economic world has indeed been stockpiled, commodified, and monetized. This lockdown, David said, was “not all good, not all bad”—depending, I guess, on whether you’re sick of being pestered by ads for socks or European river cruises popping up in your browser minutes after you’ve bought them.

To say nothing of information’s proliferation. David complained to the rest of us attending the meeting that in Wiener’s time, physicists could “take the entire Physical Review. It would sit comfortably in front of us in a manageable pile. Now we’re awash in fifty thousand open-source journals per minute,” full of god-knows-what. Neither of these developments would Wiener have anticipated, said David, prompting him to ask, “Do we need a new set of guiding metaphors?”
“INFORMATION” FOR WIENER, FOR SHANNON, AND FOR US
David Kaiser

David Kaiser is Germeshausen Professor of the History of Science and professor of Physics at MIT, and head of its Program in Science, Technology & Society. He is the author of How the Hippies Saved Physics: Science, Counterculture, and the Quantum Revival and American Physics and the Cold War Bubble (forthcoming).

In The Sleepwalkers, a sweeping history of scientific thought from ancient times through the Renaissance, Arthur Koestler identified a tension that has marked the most dramatic leaps of our cosmological imagination. In reading the great works of Nicolaus Copernicus and Johannes Kepler today, Koestler argued, we are struck as much by their strange unfamiliarity—their embeddedness in the magic or mysticism of an earlier age—as by their modern-sounding insights. I detect that same doubleness—the zig-zag origami folds of old and new—in Norbert Wiener’s classic The Human Use of Human Beings.

First published in 1950 and revised in 1954, the book is in many ways extraordinarily prescient. Wiener, the MIT polymath, recognized before most observers that “society can only be understood through a study of the messages and the communication facilities which belong to it.” Wiener argued that feedback loops, the central feature of his theory of cybernetics, would play a determining role in social dynamics. Those loops would not only connect people with one another but connect people with machines, and—crucially—machines with machines. Wiener glimpsed a world in which information could be separated from its medium. People, or machines, could communicate patterns across vast distances and use them to fashion new items at the endpoints, without “moving a...particle of matter from one end of the line to the other,” a vision now realized in our world of networked 3D printers. Wiener also imagined machine-to-machine feedback loops driving huge advances in automation, even for tasks that had previously relied on human judgment. “The machine plays no favorites between manual labor and white-collar labor,” he observed.

For all that, many of the central arguments in The Human Use of Human Beings seem closer to the 19th century than the 21st. In particular, although Wiener made reference throughout to Claude Shannon’s then-new work on information theory, he seems not to have fully embraced Shannon’s notion of information as consisting of irreducible, meaning-free bits. Since Wiener’s day, Shannon’s theory has come to undergird recent advances in “Big Data” and “deep learning,” which makes it all the more interesting to revisit Wiener’s cybernetic imagination. How might tomorrow’s artificial intelligence be different if practitioners were to re-invest in Wiener’s guiding vision of “information”?

When Wiener wrote The Human Use of Human Beings, his experiences of war-related research, and of what struck him as the moral ambiguities of intellectual life amid the military-industrial complex, were still fresh. Just a few years earlier, he had announced
in the pages of The Atlantic Monthly that he would not “publish any future work of mine which may do damage in the hands of irresponsible militarists.”[30] He remained ambivalent about the transformative power of new technologies, indulging in neither the boundless hype nor the digital utopianism of later pundits. “Progress imposes not only new possibilities for the future but new restrictions,” he wrote, in Human Use. He was concerned about human-made restrictions as well as technological ones, especially Cold War restrictions that threatened the flow of information so critical to cybernetic systems: “Under the impetus of Senator [Joseph] McCarthy and his imitators, the blind and excessive classification of military information” was driving political leaders in the United States to adopt a “secretive frame of mind paralleled in history only in the Venice of the Renaissance.”

Wiener, echoing many outspoken veterans of the Manhattan Project, argued that the postwar obsession with secrecy—especially around nuclear weapons—stemmed from a misunderstanding of the scientific process. The only genuine secret about the production of nuclear weapons, he wrote, was whether such bombs could be built. Once that secret had been revealed, with the bombings of Hiroshima and Nagasaki, no amount of state-imposed secrecy would stop others from puzzling through chains of reasoning like those the Manhattan Project researchers had followed. As Wiener memorably put it, “There is no Maginot Line of the brain.”

To drive this point home, Wiener borrowed Shannon’s fresh ideas about information theory. In 1948, Shannon, a mathematician and engineer working at Bell Labs, had published a pair of lengthy articles in the Bell System Technical Journal. Introducing the new work to a broad readership in 1949, mathematician Warren Weaver explained that in Shannon’s formulation, “the word information...is used in a special sense that must not be confused with its ordinary usage. In particular, information must not be confused with meaning.”[31] Linguists and poets might be concerned about the “semantic” aspects of communication, Weaver continued, but not engineers like Shannon. Rather, “this word ‘information’ in communication theory relates not so much to what you do say, as to what you could say.” In Shannon’s now-famous formulation, the information content of a string of symbols was given by the logarithm of the number of possible symbols from which a given string was chosen. Shannon’s key insight was that the information of a message was just like the entropy of a gas: a measure of the system’s disorder.

Wiener borrowed this insight when composing Human Use. If information was like entropy, then it could not be conserved—or contained. Physicists in the 19th century had demonstrated that the total energy of a physical system must always remain the same, a perfect balance between the start and the end of a process. Not so for entropy, which would inexorably increase over time, an imperative that came to be known as the second law of thermodynamics. From that stark distinction—energy is conserved, whereas entropy must grow—followed enormous cosmic consequences. Time must flow forward;

30. Norbert Wiener, “A Scientist Rebels,” The Atlantic Monthly, January 1947.
31. Warren Weaver, “Recent Contributions to the Mathematical Theory of Communication,” in Claude Shannon & Warren Weaver, The Mathematical Theory of Communication (Urbana, IL: University of Illinois Press, 1949), p. 8 (emphasis in original).
Shannon’s 1948 papers were republished in the same volume.
the future cannot be the same as the past. The universe could even be careening toward a “heat death,” some far-off time when the total stock of energy had uniformly dispersed, achieving a state of maximum entropy, after which no further change could occur.

If information qua entropy could not be conserved, then Wiener concluded it was folly for military leaders to try to stockpile the “scientific know-how of the nation in static libraries and laboratories.” Indeed, “no amount of scientific research, carefully recorded in books and papers, and then put into our libraries with labels of secrecy, will be adequate to protect us for any length of time in a world where the effective level of information is perpetually advancing.” Any such efforts at secrecy, classification, or the containment of information would fail, Wiener argued, just as surely as hucksters’ schemes for perpetual-motion machines faltered in the face of the second law of thermodynamics.

Wiener criticized the American “orthodoxy” of free-market fundamentalism in much the same way. For most Americans, “questions of information will be evaluated according to a standard American criterion: a thing is valuable as a commodity for what it will bring in the open market.” Indeed, “the fate of information in the typically American world is to become something which can be bought or sold”; most people, he observed, “cannot conceive of a piece of information without an owner.” Wiener considered this view to be as wrong-headed as rampant military classification. Again he invoked Shannon’s insight: Since “information and entropy are not conserved,” they are “equally unsuited to being commodities.”

Information cannot be conserved—so far, so good. But did Wiener really have Shannon’s “information” in mind? The crux of Shannon’s argument, as Weaver had emphasized, was to distinguish a colloquial sense of “information,” as message with meaning, from an abstracted, rarefied notion of strings of symbols arrayed with some probability and selected from an enormous universe of gibberish. For Shannon, “information” could be quantified because its fundamental unit, the bit, was a unit of conveyance rather than understanding. When Wiener characterized “information” throughout Human Use, on the other hand, he tilted time and again to a classical, humanistic sense of the term. “A piece of information,” he wrote—tellingly, not a “bit” of information—“in order to contribute to the general information of the community, must say something substantially different from the community’s previous common stock of information.” This was why “schoolboys do not like Shakespeare,” he concluded: The Bard’s couplets may depart starkly from random bitstreams, but they had nonetheless become all too familiar to the sense-making public and “absorbed into the superficial clichés of the time.” At least the information content of Shakespeare had once seemed fresh.

During the postwar boom years, Wiener fretted, the “enormous per capita bulk of communication”—ranging across newspapers and movies to radio, television, and books—had bred mediocrity, an informational reversion to the mean. “More and more we must accept a standardized inoffensive and insignificant product which, like the white bread of the bakeries, is made rather for its keeping and selling properties than for its food value.” “Heaven save us,” he pleaded, “from the first novels which are written because a young man desires the prestige of being a novelist rather than because he has
something to say! Heaven save us likewise from the mathematical papers which are correct and elegant but without body or spirit.” Wiener’s treatment of “information” sounded more like Matthew Arnold in 1869[32] than Claude Shannon in 1948—more “body and spirit” than “bit.”

Wiener shared Arnold’s Romantic view of the “content producer” as well. “Properly speaking the artist, the writer, and the scientist should be moved by such an irresistible impulse to create that, even if they were not being paid for their work, they would be willing to pay to get the chance to do it.” L’art pour l’art, that 19th-century cry: Artists should suffer for their work; the quest for meaningful expression should always trump lucre. To Wiener, this was the proper measure of “information”: body, spirit, aspiration, expression. Yet to argue against its commodification, Wiener reverted again to Shannon’s mathematics of information-as-entropy.

Flash forward to our day. In many ways, Wiener has been proved right. His vision of networked feedback loops driven by machine-to-machine communication has become a mundane feature of everyday life. From the earliest stirrings of the Internet Age, moreover, digital piracy has upended the view that “information”—in the form of songs, movies, books, or code—could remain contained. Put up a paywall here, and the content will diffuse over there, all so much informational entropy that cannot be conserved.

On the other hand, enormous multinational corporations—some of the largest and most profitable in the world—now routinely disprove Wiener’s contention that “information” cannot be stockpiled or monetized. Ironically, the “information” they trade in is closer to Shannon’s definition than Wiener’s, Shannon’s mathematical proofs notwithstanding. While Google Books may help circulate hundreds of thousands of works of literature for free, Google itself—like Facebook, Amazon, Twitter, and their many imitators—has commandeered a baser form of “information” and exploited it for extraordinary profit. Petabytes of Shannon-like information—a seemingly meaningless stream of clicks, “likes,” and retweets, collected from virtually every person who has ever touched a networked computer—are sifted through proprietary “deep-learning” algorithms to micro-target everything from the advertisements we see to the news stories (fake or otherwise) we encounter while browsing the Web.

Back in the early 1950s, Wiener had proposed that researchers study the structures and limitations of ants—in contrast to humans—so that machines might one day achieve the “almost indefinite intellectual expansion” that people (rather than insects) can attain. He found solace in the notion that machines could come to dominate us only “in the last stages of increasing entropy,” when “the statistical differences among individuals are nil.” Today’s data-mining algorithms turn Wiener’s approach on its head. They produce profit by exploiting our reptilian brains rather than imitating our cerebral cortexes, harvesting information from all our late-night, blog-addled, pleasure-seeking clickstreams—leveraging precisely the tiny, residual “statistical differences among individuals.”

32. Matthew Arnold, Culture and Anarchy, Jane Garnett, ed. (Oxford, U.K.: Oxford University Press, 2006).
To be sure, some recent achievements in artificial intelligence have been remarkably impressive. Computers can now produce visual artworks and musical compositions akin to those of recognized masters, creating just the sort of “information” that Wiener most prized. But by far the largest impact on society to date has come from the collection and manipulation of Shannon-like information, which has reshaped our shopping habits, political participation, personal relationships, expectations of privacy, and more.

What might “deep learning” evolve into, if the fundamental currency becomes “information” as Wiener defined it? How might the field shift if re-animated by Wiener’s deep moral convictions, informed as they were by his prescient concerns about rampant militarism, runaway corporate profit-seeking, the self-limiting features of secrecy, and the reduction of human expression to interchangeable commodities? Perhaps “deep learning” might then become the cultivation of meaningful information rather than the relentless pursuit of potent, if meaningless, bits.
In the aforementioned Connecticut discussion on The Human Use of Human Beings, Neil Gershenfeld provided some fresh air, of a kind, by professing that he hated the book, which remark was met by universal laughter—as was his observation that computer science was one of the worst things to happen to computers, or science. His overall contention was that Wiener missed the implications of the digital revolution that was happening around him—although some would say this charge can’t be leveled at someone on the ground floor and lacking clairvoyance. “The tail wagging the dog of my life,” he told us, “has been Fab Labs and the maker movement, and [when] Wiener talks about the threat of automation he misses the inverse, which is that access to the means for automation can empower people, and in Fab Labs, the corner I’ve been involved in, that’s an exponential.”

In 2003, I visited Neil at MIT, where he runs the Center for Bits and Atoms. Hours later, I emerged from what had been an exuberant display of very weird stuff. He showed me the work of one student in his popular rapid-prototyping class (“How to Make Almost Anything”), a sculptor with no engineering background, who had made a portable personal space for screaming that saves up your screams and plays them back later. Another student in the class had made a Web browser that lets parrots navigate the Net. Neil himself was doing fundamental research on the roadmap to that sci-fi staple, a “universal replicator.” It was a visit that took me a couple of years to get my head around.

Neil manages a global network of Fab Labs—small-scale manufacturing systems, enabled by digital technologies, which give people the wherewithal to build whatever they’d like. As guru of the maker movement, which merges digital communication and computation with fabrication, he sometimes feels outside the current heated debate on AI safety. “My ability to do research rests on tools that augment my capabilities,” he says. “Asking whether or not they are intelligent is as fruitful as asking how I know I exist—amusing philosophically, but not testable empirically.” What interests him is “how bits and atoms relate—the boundary between digital and physical. Scientifically, it’s the most exciting thing I know.”
SCALING
Neil Gershenfeld

Neil Gershenfeld is a physicist and director of MIT’s Center for Bits and Atoms. He is the author of FAB, co-author (with Alan Gershenfeld & Joel Cutcher-Gershenfeld) of Designing Reality, and founder of the global fab lab network.

Discussions about artificial intelligence have been oddly ahistorical. They could better be described as manic-depressive; depending on how you count, we’re now in the fifth boom-bust cycle. Those swings mask the continuity in the underlying progress and the implications for where it’s headed.

The cycles have come in roughly decade-long waves. First there were mainframes, which by their very existence were going to automate away work. That ran into the reality that it was hard to write programs to do tasks that were simple for people to do. Then came expert systems, which were going to codify and then replace the knowledge of experts. These ran into difficulty in assembling that knowledge and reasoning about cases not already covered. Perceptrons sought to get around these problems by modeling how the brain learns, but they were unable to do much of anything. Multilayer perceptrons could handle test problems that had tripped up those simpler networks, but their demonstrations did poorly on unstructured, real-world problems. We’re now in the deep-learning era, which is delivering on many of the early AI promises but in a way that’s considered hard to understand, with consequences ranging from intellectual to existential threats.

Each of these stages was heralded as a revolutionary advance over the limitations of its predecessors, yet all effectively do the same thing: They make inferences from observations. How these approaches relate can be understood by how they scale—that is, how their performance depends on the difficulty of the problem they’re addressing. Both a light switch and a self-driving car must determine their operator’s intentions, but the former has just two options to choose from, whereas the latter has many more. The AI-boom phases have started with promising examples in limited domains; the bust phases came with the failure of those demonstrations to handle the complexity of less-structured, practical problems.

Less apparent is the steady progress we’ve made in mastering scaling. This progress rests on the technological distinction between linear and exponential functions—a distinction that was becoming evident at the dawn of AI but with implications for AI that weren’t appreciated until many years later. In one of the founding documents of the study of intelligent machines, The Human Use of Human Beings, Norbert Wiener does a remarkable job of identifying many of the most significant trends to arise since he wrote it, along with noting the people responsible for them and then consistently failing to recognize why these people’s work proved to be so important. Wiener is credited with creating the field of cybernetics; I’ve never understood what that is, but what’s missing from the book is at the heart of how AI has progressed. This history matters because of the echoes of it that persist to this day.

Claude Shannon makes a cameo appearance in the book, in the context of his thoughts about the prospects for a chess-playing computer. Shannon was doing
something much more significant than speculating at the time: He was laying the foundations for the digital revolution. As a graduate student at MIT, he worked for Vannevar Bush on the Differential Analyzer. This was one of the last great analog computers, a room full of gears and shafts. Shannon’s frustration with the difficulty of solving problems this way led him in 1937 to write what might be the best master’s thesis ever. In it, he showed how electrical circuits could be designed to evaluate arbitrary logical expressions, introducing the basis for universal digital logic.

After MIT, Shannon studied communications at Bell Labs. Analog telephone calls degraded with distance; the farther they traveled, the worse they sounded. Rather than continue to improve them incrementally, Shannon showed in 1948 that by communicating with symbols rather than continuous quantities, the behavior is very different. Converting speech waveforms to the binary values of 1 and 0 is an example, but many other sets of symbols can be (and are) used in digital communications. What matters is not the particular symbols but rather the ability to detect and correct errors. Shannon found that if the noise is above a threshold (which depends on the system design), then there are certain to be errors. But if the noise is below a threshold, then a linear increase in the physical resources representing the symbol results in an exponential decrease in the likelihood of making an error in correctly receiving the symbol. This relationship was the first of what we’d now call a threshold theorem. Such scaling falls off so quickly that the probability of an error can be so small as to effectively never happen. Each symbol sent multiplies rather than adds to the certainty, so that the probability of a mistake can go from 0.1 to 0.01 to 0.001, and so forth.

This exponential decrease in communication errors made possible an exponential increase in the capacity of communication networks. And that eventually solved the problem of where the knowledge in an AI system came from. For many years, the fastest way to speed up a computation was to do nothing—just wait for computers to get faster. In the same way, there were years of AI projects that aimed to accumulate everyday knowledge by laboriously entering pieces of information. That didn’t scale; it could progress only as fast as the number of people doing the entering. But when phone calls, newspaper stories, and mail messages all moved onto the Internet, everyone doing any of those things became a data generator. The result was an exponential rather than a linear rate of knowledge accumulation.

John von Neumann also has a cameo in The Human Use of Human Beings, for game theory. What Wiener missed here was von Neumann’s seminal role in digitizing computation. Whereas analog communication degraded with distance, analog computing (like the Differential Analyzer) degraded with time, accumulating errors as it progressed. Von Neumann presented in 1952 a result corresponding to Shannon’s for computation (they had met at the Institute for Advanced Study, in Princeton), showing that it was possible to compute reliably with an unreliable computing device by using symbols rather than continuous quantities. This was, again, a scaling argument, with a linear increase in the physical resources representing the symbol resulting in an exponential reduction in the error rate as long as the noise was below a threshold.
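The simplest way to see this threshold behavior is a repetition code with majority voting (far simpler than the codes Shannon and von Neumann actually analyzed, but it shows the same scaling). A minimal sketch, assuming each copy of a symbol is flipped independently with probability p:

```python
from math import comb

def majority_error(p: float, n: int) -> float:
    """Probability that a majority of n independent copies are corrupted,
    i.e., that majority voting decodes the symbol incorrectly."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n // 2 + 1, n + 1))

# Below threshold (p < 0.5), a linear increase in copies gives an
# exponential decrease in error; above it, more copies make things worse.
for p in (0.1, 0.6):
    print(f"p = {p}:", [f"{majority_error(p, n):.5f}" for n in (1, 3, 5, 7, 9)])
```

With p = 0.1, each pair of extra copies shrinks the error by roughly a constant factor (0.1, 0.028, 0.0086, 0.0027, ...), the multiplicative behavior described above; with p = 0.6 the errors compound instead, which is the threshold at work.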
That’s what makes it possible to have a billion transistors in a computer chip, with the last one as useful as the first one. This relationship led to an exponential increase in computing performance, which solved a second problem in AI: how to process exponentially increasing amounts of data.

The third problem that scaling solved for AI was coming up with the rules for
reasoning without having to hire a programmer for each problem. Wiener recognized the role of feedback in machine learning, but he missed the key role of representation. It’s not possible to store all possible images in a self-driving car, or all possible sounds in a conversational computer; they have to be able to generalize from experience. The “deep” part of deep learning refers not to the (hoped-for) depth of insight but to the depth of the mathematical network layers used to make predictions. It turned out that a linear increase in network complexity led to an exponential increase in the expressive power of the network.

If you lose your keys in a room, you can search for them. If you’re not sure which room they’re in, you have to search all the rooms in a building. If you’re not sure which building they’re in, you have to search all the rooms in all the buildings in a city. If you’re not sure which city they’re in, you have to search all the rooms in all the buildings in all the cities. In AI, finding the keys corresponds to things like a car safely following the road, or a computer correctly interpreting a spoken command, and the rooms and buildings and cities correspond to all of the options that have to be considered. This is called the curse of dimensionality. The solution to the curse of dimensionality came in using information about the problem to constrain the search. The search algorithms themselves are not new. But when applied to a deep-learning network, they adaptively build up representations of where to search. The price of this is that it’s no longer possible to exactly solve for the best answer to a problem, but typically all that’s needed is an answer that’s good enough.

Taken together, it shouldn’t be surprising that these scaling laws have allowed machines to become effectively as capable as the corresponding stages of biological complexity. Neural networks started out with a goal of modeling how the brain works. That goal was abandoned as they evolved into mathematical abstractions unrelated to how neurons actually function. But now there’s a kind of convergence that can be thought of as forward- rather than reverse-engineering biology, as the results of deep learning echo brain layers and regions.

One of the most difficult research projects I’ve managed paired what we’d now call data scientists with AI pioneers. It was a miserable experience in moving goalposts. As the former progressed in solving long-standing problems posed by the latter, this was deemed not to count because it wasn’t accompanied by corresponding leaps in understanding the solutions. What’s the value of a chess-playing computer if you can’t explain how it plays chess? The answer, of course, is that it can play chess. There is interesting emerging research that is applying AI to AI—that is, training networks to explain how they operate. But both brains and computer chips are hard to understand by watching their inner workings; they’re easily interpreted only by observing their external interfaces. We come to trust (or not) brains and computer chips alike based on experience that tests them rather than on explanations for how they work.

Many branches of engineering are making a transition from what’s called imperative to declarative or generative design.
This means that instead of explicitly designing a system with tools like CAD files, circuit schematics, and computer code, you describe what you want the system to do and then an automated search is done for designs that satisfy your goals and restrictions. This approach becomes necessary as design complexity exceeds what can be understood by a human designer. While that
might sound like a risk, human understanding comes with its own limits; engineering design is littered with what appeared to be good insights that have had bad consequences. Declarative design rests on all the advances in AI, plus the improving fidelity of simulations to virtually test designs.

The mother of all design problems is the one that resulted in us. The way we’re designed resides in one of the oldest and most conserved parts of the genome, called the Hox genes. These are genes that regulate genes, in what are called developmental programs. Nothing in your genome stores the design of your body; your genome stores, rather, a series of steps to follow that results in your body. This is an exact parallel to how search is done in AI. There are too many possible body plans to search over, and most modifications would be either inconsequential or fatal. The Hox genes are a representation of a productive place for evolutionary search. It’s a kind of natural intelligence at the molecular level.

AI has a mind-body problem, in that it has no body. Most work on AI is done in the cloud, running on virtual machines in computer centers where data are funneled. Our own intelligence is the result of a search algorithm (evolution) that was able to change our physical form as well as our programming—those are inextricably linked. If the history of AI can be understood as the working of scaling laws rather than a succession of fashions, then its future can be seen in the same way. What’s now being digitized, after communication and computation, is fabrication, bringing the programmability of bits to the world of atoms. By digitizing not just designs but the construction of materials, the same lessons that von Neumann and Shannon taught us apply to exponentially increasing fabricational complexity.

I’ve defined digital materials to be those constructed from a discrete set of parts reversibly joined with a discrete set of relative positions and orientations. These attributes allow the global geometry to be determined from local constraints, assembly errors to be detected and corrected, heterogeneous materials to be joined, and structures to be disassembled rather than disposed of when they’re no longer needed. The amino acids that are the foundation of life and the Lego bricks that are the foundation of play share these properties. What’s interesting about amino acids is that they’re not interesting. They have attributes that are typical but not unusual, such as attracting or repelling water. But just twenty types of them are enough to make you. In the same way, twenty or so digital-material part types—conducting, insulating, rigid, flexible, magnetic, etc.—are enough to assemble the range of functions that go into making modern technologies like robots and computers.

The connection between computation and fabrication was foreshadowed by the very pioneers whose work the edifice of computing is based on. Wiener hinted at this by linking material transportation with message transportation. John von Neumann is credited with modern computer architecture, something he actually wrote very little about; the final thing he studied, and wrote about beautifully and at length, was self-reproducing systems. As an abstraction of life, he modeled a machine that can communicate a computation that constructs itself. And the final thing Alan Turing, who is credited with the theoretical framework for computer science, studied was how the instructions in genes can give rise to physical forms.
These questions address a topic absent from a typical computer-science education: the physical configuration of a
computation. Von Neumann and Turing posed their questions as theoretical studies, because it was beyond the technology of their day to realize them. But with the convergence of communication and computation with fabrication, these investigations are now becoming accessible experimentally. Making an assembler that can assemble itself from the parts that it’s assembling is a focus of my lab, along with collaborations to develop synthetic cells.

The prospect of physically self-reproducing automata is potentially much scarier than fears of out-of-control AI, because it moves the intelligence out here to where we live. It could be a roadmap leading to Terminator’s Skynet robotic overlords. But it’s also a more hopeful prospect, because an ability to program atoms as well as bits enables designs to be shared globally while locally producing things like energy, food, and shelter—all of these are emerging as exciting early applications of digital fabrication. Wiener worried about the future of work, but he didn’t question implicit assumptions about the nature of work, which are challenged when consumption can be replaced by creation.

History suggests that neither utopian nor dystopian scenarios prevail; we generally end up muddling along somewhere in between. But history also suggests that we don’t have to wait on history. Gordon Moore in 1965 was able to use five years of the doubling of the specifications of integrated circuits to project what turned out to be fifty years of exponential improvements in digital technologies. We’ve spent many of those years responding to, rather than anticipating, its implications. We have more data available now than Gordon Moore did to project fifty years of doubling the performance of digital fabrication. With the benefit of hindsight, it should be possible to avoid the excesses of digital computing and communications this time around, and, from the outset, address issues like access and literacy.

If the maker movement is the harbinger of a third digital revolution, the success of AI in meeting many of its own early goals can be seen as the crowning achievement of the first two digital revolutions. Although machine making and machine thinking might appear to be unrelated trends, they lie in each other’s futures. The same scaling trends that have made AI possible suggest that the current mania is a phase that will pass, to be followed by something even more significant: the merging of artificial and natural intelligence. It was an advance for atoms to form molecules, molecules to form organelles, organelles to form cells, cells to form organs, organs to form organisms, organisms to form families, families to form societies, and societies to form civilizations. This grand evolutionary loop can now be closed, with atoms arranging bits arranging atoms.
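For a sense of the scale behind such projections: assuming one doubling every two years, the cadence commonly quoted for Moore’s law, fifty years compounds to 2^(50/2) = 2^25, roughly 33 million, which is why a few years of measured doublings can anchor a decades-long exponential forecast.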
While Danny Hillis was an undergraduate at MIT, he built a computer out of Tinkertoys. It has around 10,000 wooden parts, plays tic-tac-toe, and never loses; it’s now in the Computer History Museum, in Mountain View, California. As a graduate student at the MIT Computer Science and Artificial Intelligence Laboratory in the early 1980s, Danny designed a massively parallel computer with 64,000 processors. He named it the Connection Machine and founded what may have been the first AI company—Thinking Machines Corporation—to produce and market it. This was despite a lunch he had with Richard Feynman, at which the celebrated physicist remarked, “That is positively the dopiest idea I ever heard.” Maybe “despite” is the wrong word, since Feynman had a well-known predilection for playing with dopey ideas. In the event, he showed up on the day the company was incorporated and stayed on, for summer jobs and special assignments, to make invaluable contributions to its work.

Danny has since established a number of technology companies, of which the latest is Applied Invention, which partners with commercial enterprises to develop technological solutions to their most intractable problems. He holds hundreds of U.S. patents, covering parallel computers, touch interfaces, disk arrays, forgery prevention methods, and a slew of electronic and mechanical devices. His imagination is apparently boundless, and here he sketches some possible scenarios that will result from our pursuit of a better and better AI. “Our thinking machines are more than metaphors,” he says. “The question is not, ‘Will they be powerful enough to hurt us?’ (they will), or whether they will always act in our best interests (they won’t), but whether over the long term they can help us find our way—where we come out on the Panacea/Apocalypse continuum.”
THE FIRST MACHINE INTELLIGENCES
W. Daniel Hillis

W. Daniel “Danny” Hillis is an inventor, entrepreneur, and computer scientist, Judge Widney Professor of Engineering and Medicine at USC, and author of The Pattern on the Stone: The Simple Ideas That Make Computers Work.

I have spoken of machines, but not only of machines having brains of brass and thews of iron. When human atoms are knit into an organization in which they are used, not in their full right as responsible human beings, but as cogs and levers and rods, it matters little that their raw material is flesh and blood. What is used as an element in a machine, is in fact an element in the machine. Whether we entrust our decisions to machines of metal, or to those machines of flesh and blood which are bureaus and vast laboratories and armies and corporations, we shall never receive the right answers to our questions unless we ask the right questions.... The hour is very late, and the choice of good and evil knocks at our door.
—Norbert Wiener, The Human Use of Human Beings

Norbert Wiener was ahead of his time in recognizing the potential danger of emergent intelligent machines. I believe he was even further ahead in recognizing that the first artificial intelligences had already begun to emerge. He was correct in identifying the corporations and bureaus that he called “machines of flesh and blood” as the first intelligent machines. He anticipated the dangers of creating artificial superintelligences with goals not necessarily aligned with our own.

What is now clear, whether or not it was apparent to Wiener, is that these organizational superintelligences are not just made of humans; they are hybrids of humans and the information technologies that allow them to coordinate. Even in Wiener’s time, the “bureaus and vast laboratories and armies and corporations” could not operate without telephones, telegraphs, radios, and tabulating machines. Today they could not operate without networks of computers, databases, and decision support systems. These hybrid intelligences are technologically augmented networks of humans.

These artificial intelligences have superhuman powers. They can know more than individual humans; they can sense more; they can make more complicated analyses and more complex plans. They can have vastly more resources and power than any single individual. Although we do not always perceive it, hybrid superintelligences such as nation states and corporations have their own emergent goals. Although they are built by and for humans, they often act like independent intelligent entities, and their actions are not always aligned to the interests of the people who created them. The state is not always for the citizen, nor the company for the shareholder. Nor do not-for-profits, religious orders, or political parties always act in furtherance of their founding principles.

Intuitively, we recognize that their actions are guided by internal goals, which is why we personify them, both legally and in our habits of thought. When talking about “what China wants,” or “what General Motors is trying to do,” we are not speaking in metaphors. These organizations act as intelligences that perceive, decide, and act. Like the goals of individual humans, the goals of organizations are complex and often self-contradictory, but they are true goals in the sense that they direct action. Those goals
depend somewhat on the goals of the people within the organization, but they are not identical. Any American knows how loose the tie is between the actions of the U.S. government and the diverse and often contradictory aims of its citizens. That is also true of corporations. For-profit corporations nominally serve multiple constituencies, including shareholders, senior executives, employees, and customers. These corporations differ in how they balance their loyalties and often behave in ways that serve none of their constituents. The “neurons” that carry their corporate thought are not just the human employees or the technologies that connect them; they are also coded into the policies, incentive structures, culture, and procedural habits of the corporation. The emergent corporate goals do not always reflect the values of the people who implement them. For instance, an oil company led and staffed by people who care about the environment may have incentive structures or policies that cause it to compromise environmental safety for the sake of corporate earnings. The components’ good intentions are not a guarantee of the emergent system’s good behavior.

Governments and corporations, both built partly of humans, are naturally motivated to at least appear to share the goals of the humans they depend upon. They could not function without the people, so they need to keep them cooperative. When such organizations appear to behave altruistically, this need to keep people cooperative is often the real motive. I once complimented the CEO of a large corporation on the contribution his company made toward a humanitarian relief effort. The CEO responded, without a trace of irony, “Yes. We have decided to do more things like that to make our brand more likeable.” Individuals who compose a hybrid superintelligence may occasionally exert a “humanizing” influence—for example, an employee may break company policies to accommodate the needs of another human. The employee may act out of true human empathy, but we should not attribute any such empathy to the superintelligence itself. These hybrid machines have goals, and their citizens/customers/employees are some of the resources they use to accomplish them.

We are close to being able to build superintelligences out of pure information technology, without human components. This is what people normally refer to as “artificial intelligence,” or AI. It is reasonable to ask what the attitudes of the hypothetical machine superintelligences will be toward humans. Will they, too, see humans as useful resources and a good relationship with us as worth preserving? Will they be constructed to have goals that are aligned with our own? Will a superintelligence even see these questions as important? What are the “right questions” that we should be asking? I believe that one of the most important is this: What relationship will various superintelligences have to one another?

It is interesting to consider how the hybrid superintelligences currently deal with conflicts among themselves. Today, much of the ultimate power rests in the nation states, which claim authority over a patch of ground. Whether they are optimized to act in the interests of their citizens or those of a despotic ruler, nation states assert priority over other intelligences’ desires or goals within their geographic dominion. They claim a monopoly on the use of force and recognize only other nation states as peers.
They are willing, if necessary, to demand great sacrifices of their citizens to enforce their authority, even to the point of sacrificing their citizens’ lives.
This geographical division of authority made logical sense when most of the actors were humans who spent their lives within a single nation state, but now that the actors of importance include geographically distributed hybrid intelligences such as multinational corporations, that logic is less obvious. Today we live in a complex transitional period, when distributed superintelligences still largely rely on the nation states to settle the arguments arising among them. Often, those arguments are resolved differently in different jurisdictions. It is becoming more difficult even to assign individual humans to nation states: International travelers living and working outside their native country, refugees, and immigrants (documented and not) are still dealt with as awkward exceptions. Superintelligences built purely of information technology will prove even more awkward for the territorial system of authority, since there is no reason why they need to be tied to physical resources in a single country—or even to any particular physical resources at all. An artificial intelligence might well exist “in the cloud” rather than at any physical location.

I can imagine at least four scenarios for how machine superintelligences will relate to hybrid superintelligences. In one obvious scenario, multiple machine intelligences will ultimately be controlled by, and allied with, individual nation states. In this state/AI scenario, one can envision American and Chinese super-AIs wrestling each other for resources on behalf of their state. In some sense, these AIs would be citizens of their nation state in the way that many commercial corporations often act as “corporate citizens” today. In this scenario, the host nation states would presumably give the machine superintelligences the resources they needed to work for the state’s advantage. Or, to the degree that the superintelligences can influence their state governments, they will presumably do so to enhance their own power, for instance by garnering a larger share of the state’s resources. Nation states’ AIs might not want competing AIs to grow up within their jurisdiction. In this scenario, the superintelligences become an extension of the state, and vice versa.

The state/AI scenario seems plausible, but it is not our current course. Our most powerful and rapidly improving artificial intelligences are controlled by for-profit corporations. This is the corporate/AI scenario, in which the balance of power between nation states and corporations becomes inverted. Today, the most powerful and intelligent collections of machines are probably owned by Google, but companies like Amazon, Baidu, Microsoft, Facebook, Apple, and IBM may not be far behind. These companies all see a business imperative to build artificial intelligences of their own. It is easy to imagine a future in which corporations independently build their own machine intelligences, protected within firewalls preventing the machines from taking advantage of one another’s knowledge. These machines will be designed to have goals aligned with those of the corporation. If this alignment is effective, nation states may continue to lag behind in developing their own artificial-intelligence capability and instead depend on their “corporate citizens” to do it for them. To the extent that corporations successfully control their machines’ goals, those corporations will become more powerful and autonomous than nation states.
Another scenario, perhaps the one people fear the most, is that artificial intelligences will not be aligned with either humans or hybrid superintelligences but will act solely in their own interest. They might even merge into a single machine superintelligence, since there may be no technical requirement for machine intelligences to maintain distinct identities. The attitude of a self-interested super-AI toward hybrid
superintelligences is likely to be competitive. Humans might be seen as minor annoyances, like ants at a picnic, but hybrid superintelligences—like corporations, organized religions, and nation states—could be existential threats. Like hybrid superintelligences, AIs might see humans mostly as useful tools to accomplish their goals, as pawns in their competition with the other superintelligences. Or we might simply be irrelevant. It is not impossible that a machine intelligence has already emerged and we simply do not recognize it as such. It may not wish to be noticed, or it may be so alien to us that we are incapable of perceiving it. This makes the self-interested AI scenario the most difficult to imagine. I believe the easy-to-imagine versions, like the humanoid intelligent robots of science fiction, are the least likely. Our most complex machines, like the Internet, have already grown beyond the detailed understanding of a single human, and their emergent behaviors may be well beyond our ken.

The final scenario is that machine intelligences will not be allied with one another but instead will work to further the goals of humanity as a whole. In this optimistic scenario, AI could help us restore the balance of power between the individual and the corporation, between the citizen and the state. It could help us solve the problems that have been created by hybrid superintelligences that subvert the goals of humans. In this scenario, AIs will empower us by giving us access to processing capacity and knowledge currently available only to corporations and states. In effect, they could become extensions of our own individual intelligences, in furtherance of our human goals. They could make our weak individual intelligences strong.

This prospect is both exciting and plausible. It is plausible because we have some choice in what we build, and we have a history of using technology to expand and augment our human capacities. As airplanes have given us wings and engines have given us muscles to move mountains, so our network of computers may amplify and extend our minds. We may not fully understand or control our destiny, but we have a chance to bend it in the direction of our values. The future is not something that will happen to us; it is something that we will build.

Why Wiener Saw What Others Missed

There is in electrical engineering a split which is known in Germany as the split between the technique of strong currents and the technique of weak currents, and which we know as the distinction between power and communication engineering. It is this split which separates the age just past from that in which we are now living.

—Norbert Wiener, Cybernetics, or Control and Communication in the Animal and the Machine

Cybernetics is the study of how the weak can control the strong. Consider the defining metaphor of the field: the helmsman guiding a ship with a tiller. The helmsman’s goal is to control the heading of the ship, to keep it on the right course. The information, the message that is sent to the helmsman, comes from the compass or the stars, and the helmsman closes the feedback loop by sending the steering messages through the gentle force of his hand on the tiller. In this picture, we see the ship tossing in powerful wind and waves in the real world, controlled by the communication system of messages in the world of information. Yet the distinction between “real” and “information” is mostly a difference in perspective.
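To make the helmsman’s loop concrete, here is a minimal sketch in Python (not from the essay; the gain, the disturbance range, and the headings are all invented for illustration). A weak corrective signal, applied repeatedly, holds the heading near its target despite strong random disturbances:

    import random

    def steer(target=90.0, steps=50, gain=0.5, seed=1):
        """Toy helmsman: a weak correction repeatedly cancels strong disturbances."""
        random.seed(seed)
        heading = 70.0  # start well off course
        for _ in range(steps):
            heading += random.uniform(-5.0, 5.0)  # wind and waves: strong, uncontrolled
            error = target - heading              # the difference that makes a difference
            heading += gain * error               # gentle hand on the tiller, amplified by the rudder
        return heading

    print(round(steer(), 1))  # ends near 90.0 despite continual buffeting

The controller never overpowers the sea; it only keeps feeding back a small signal proportional to the error, which is the whole cybernetic point.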
The signals that carry messages, like the light of the stars and pressure of the
hand on the tiller, exist in a world of energy and forces, as does the helmsman. The weak forces that control the rudder are as real and physical as the strong forces that toss the ship. If we shift our cybernetics perspective from the ship to the helmsman, the pressures on the rudder become a strong force of muscles controlled by the weak signals in the mind of the helmsman. These messages in the helmsman’s mind are amplified into a physical force strong enough to steer the ship. Or instead, we can zoom out and take a large cybernetics perspective. We might see the ship itself as part of a vast trade network, part of a feedback loop that regulates the price of commodities through the flow of goods. In this perspective, the tiny ship is merely a messenger. So, the distinction between the physical world and the information world is a way to describe the relationship between the weak and the strong.

Wiener chose to view the world from the vantage point and scale of the individual human. As a cyberneticist, he took the perspective of the weak protagonist embedded within a strong system, trying to make the best of limited powers. He incorporated this perspective in his very definition of information. “Information,” he said, “is a name for the content of what is exchanged with the outer world as we adjust to it, and make our adjustment felt upon it.” In his words, information is what we use to “live effectively within that environment.”[33] For Wiener, information is a way for the weak to effectively cope with the strong. This viewpoint is also reflected in Gregory Bateson’s definition of information as “a difference that makes a difference,” by which he meant the small difference that makes a big difference.

The goal of cybernetics was to create a tiny model of the system using “weak currents” to amplify and control “strong currents” of the real world. The central insight was that a control problem could be solved by building an analogous system in the information space of messages and then amplifying solutions into the larger world of reality. Inherent in the notion of a control system is the concept of amplification, which makes the small big and the weak strong. Amplification allows the difference that makes a difference to make a difference.

In this way of looking at the world, a control system needed to be as complex as the system it controlled. Cyberneticist W. Ross Ashby proved that this was true in a precise mathematical sense, in what is now called Ashby’s Law of Requisite Variety, or sometimes the First Law of Cybernetics. The law tells us that to control a system completely, the controller must be as complex as the controlled. Thus cyberneticists tended to see control systems as a kind of analog of the systems they governed, like the homunculus—the hypothetical little person inside the brain who controls the actual person.

This notion of analogous structure is sometimes confused with the notion of analog encoding of messages, but the two are logically distinct. Norbert Wiener was much impressed with Vannevar Bush’s Digital Differential Analyzer, which could be reconfigured to match the structure of whatever problem it was given to solve but used digital signal encoding. Signals could be simplified to represent only the relevant distinctions, allowing them to be more accurately communicated and stored. In digital signals, one needed only to preserve the difference in signals that made a difference.
It is this distinction in signal coding that we commonly use to distinguish “analog” from “digital.” Digital signal encoding was entirely compatible with cybernetic thinking—in fact, an enabler of it.
What was constraining to cybernetics was the presumption of an analogy of structure between the controller and the controlled. By the 1930s, Kurt Gödel, Alonzo Church, and Alan Turing had all described universal systems of computation, in which the computation required no structural analogy to the functions that were computed. These universal computers could also compute the functions of control.

The analogy of structure between the controller and the controlled was central to the cybernetic perspective. Just as digital coding collapses the space of possible messages into a simplified version that represents only the difference that makes a difference, so the control system collapses the state space of a controlled system into a simplified model that reflects only the goals of the controller. Ashby’s Law does not imply that every controller must model every state of the system but only those states that matter for advancing the controller’s goals. Thus, in cybernetics, the goal of the controller becomes the perspective from which the world is viewed.

Norbert Wiener adopted the perspective of the individual human relating to vast organizations and trying to “live effectively within that environment.” He took the perspective of the weak trying to influence the strong. Perhaps this is why he was able to notice the emergent goals of the “machines of flesh and blood” and anticipate some of the human challenges posed by these new intelligences, hybrid machine intelligences with goals of their own.

[33] The Human Use of Human Beings (Boston: Houghton Mifflin, 1954), pp. 17–18.
Venki Ramakrishnan is a Nobel Prize-winning biologist whose many scientific contributions include his work on the atomic structure of the ribosome—in effect, a huge molecular machine that reads our genes and makes proteins. His work would have been impossible without powerful computers. The Internet made his own work a lot easier and, he notes, acted as a leveler internationally: “When I grew up in India, if you wanted to get a book, it would show up six months or a year after it had already come out in the West. ... Journals would arrive by surface mail a few months later. I didn’t have to deal with it, because I left India when I was nineteen, but I know Indian scientists had to deal with it. Today they have access to information at the click of a button. More important, they have access to lectures. They can listen to Richard Feynman. That would have been a dream of mine when I was growing up. They can just watch Richard Feynman on the Web. That’s a big leveling in the field.”

And yet... “Along with the benefits [of the Web], there is now a huge amount of noise. You have all of these people spouting pseudoscientific jargon and pushing their own ideas as if they were science.”

As president of the Royal Society, Venki worries, too, about the broader issue of trust: public trust in evidence-based scientific findings, but also trust among scientists, bolstered by rigorous checking of one another’s conclusions—trust that is in danger of eroding because of the “black box” character of deep-learning computers. “This [erosion] is going to happen more and more, as data sets get bigger, as we have genome-wide studies, population studies, and all sorts of things,” he says. “How do we, as a science community, grapple with this and communicate to the public a sense of what science is about, what is reliable in science, what is uncertain in science, and what is just plain wrong in science?”
WILL COMPUTERS BECOME OUR OVERLORDS?

Venki Ramakrishnan

Venki Ramakrishnan is a scientist at the Medical Research Council Laboratory of Molecular Biology, Cambridge University; recipient of the Nobel Prize in Chemistry (2009); current president of the Royal Society; and the author of Gene Machine: The Race to Discover the Secrets of the Ribosome.

A former colleague of mine, Gérard Bricogne, used to joke that carbon-based intelligence was simply a catalyst for the evolution of silicon-based intelligence. For quite a long time, both Hollywood movies and scientific Jeremiahs have been predicting our eventual capitulation to our computer overlords. We all await the singularity, which always seems to be just over the horizon.

In a sense, computers have already taken over, facilitating virtually every aspect of our lives—from banking, travel, and utilities to the most intimate personal communication. I can see and talk to my grandson in New York for free. I remember when I first saw the 1968 movie 2001: A Space Odyssey, the audience laughed at the absurdly cheap cost of a picturephone call from space: $1.70, at a time when a long-distance call within the U.S. was $3 per minute.

However, the convenience and power of computers are also something of a Faustian bargain, for they come with a loss of control. Computers prevent us from doing things we want. Try getting on a flight if you arrive at the airport and the airline computer systems are down, as happened not so long ago to British Airways at Heathrow. The planes, pilots, and passengers were all there; even air-traffic control was working. But no flights for that airline were allowed to take off. Computers also make us do things we don’t want—by generating mailing lists and printing labels to send us all millions of pieces of unwanted mail, which we humans have to sort, deliver, and dispose of.

But you ain’t seen nothing yet. In the past, we programmed computers using algorithms we understood at least in principle. So when machines did amazing things like beating world chess champion Garry Kasparov, we could say that the victorious programs were designed with algorithms based on our own understanding—using, in this instance, the experience and advice of top grandmasters. Machines were simply faster at doing brute-force calculations, had prodigious amounts of memory, and were not prone to errors. One article described Deep Blue’s victory not as that of a computer, which was just a dumb machine, but as the victory of hundreds of programmers over Kasparov, a single individual.

That way of programming is changing dramatically. After a long hiatus, the power of machine learning has taken off. Much of the change came when programmers, rather than trying to anticipate and code for every possible contingency, allowed computers to train themselves on data, using deep neural networks based on models of how our own brains learn. They use probabilistic methods to “learn” from large quantities of data; computers can recognize patterns and come up with conclusions on their own. A particularly powerful method is called reinforcement learning, by which the computer learns, without prior input, which variables are important and how much to
weight them to reach a certain goal. This method in some sense mimics how we learn as children.

The results from these new approaches are amazing. Such a deep-learning program was used to teach a computer to play Go, a game that only a few years ago was thought to be beyond the reach of AI because it was so hard to calculate how well you were doing. It seemed that top Go players relied a great deal on intuition and a feel for position, so proficiency was thought to require a particularly human kind of intelligence. But the AlphaGo program produced by DeepMind, after being trained on thousands of high-level Go games played by humans and then millions of games with itself, was able to beat the top human players in short order. Even more amazingly, the related AlphaGo Zero program, which learned from scratch by playing itself, was stronger than the version trained initially on human games! It was as though the humans had been preventing the computer from reaching its true potential. The same method has recently been generalized: Starting from scratch, within just twenty-four hours, an equivalent AlphaZero chess program was able to beat today’s top “conventional” chess programs, which in turn have beaten the best humans.

Progress has not been restricted to games. Computers are significantly better at image and voice recognition and speech synthesis than they used to be. They can detect tumors in radiographs earlier than most humans. Medical diagnostics and personalized medicine will improve substantially. Transportation by self-driving cars will keep us all safer, on average. My grandson may never have to acquire a driver’s license, because driving a car will be like riding a horse today—a hobby for the few. Dangerous activities, such as mining, and tedious repetitive work will be done by computers. Governments will offer better-targeted, more personalized, and more efficient public services. AI could revolutionize education by analyzing an individual pupil’s needs and enabling customized teaching, so that each student can advance at an optimal rate.

Along with these huge benefits, of course, will come alarming risks. With the vast amounts of personal data, computers will learn more about us than we may know about ourselves; the question of who owns data about us will be paramount. Moreover, data-based decisions will undoubtedly reflect social biases: Even an allegedly neutral intelligent system designed to predict loan risks, say, may conclude that mere membership in a particular minority group makes you more likely to default on a loan. While this is an obvious example that we could correct, the real danger is that we are not always aware of biases in the data and may simply perpetuate them.

Machine learning may also perpetuate our own biases. When Netflix or Amazon tries to tell you what you might want to watch or buy, this is an application of machine learning. Currently such suggestions are sometimes laughable, but with time and more data they will get increasingly accurate, reinforcing our prejudices and likes and dislikes. Will we miss out on the random encounter that might persuade us to change our views by exposing us to new and conflicting ideas? Social media, given its influence on elections, is a particularly striking illustration of how the divide between people on different sides of the political spectrum can be accentuated.
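To make the reinforcement-learning idea described above concrete (an agent discovering, from reward alone, which actions to favor), here is a minimal tabular Q-learning sketch in Python; the five-state corridor and all of the constants are invented for illustration and are not anything from Ramakrishnan’s text:

    import random

    # A five-state corridor: start at 0, reward only for reaching state 4.
    N_STATES, GOAL, ACTIONS = 5, 4, (-1, +1)
    Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    alpha, gamma, epsilon = 0.5, 0.9, 0.1
    random.seed(0)

    for _ in range(200):  # episodes of trial and error
        s = 0
        while s != GOAL:
            # Mostly exploit the current value estimates, sometimes explore.
            if random.random() < epsilon:
                a = random.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda act: Q[(s, act)])
            s2 = min(max(s + a, 0), N_STATES - 1)
            r = 1.0 if s2 == GOAL else 0.0  # the only feedback the agent ever gets
            Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, b)] for b in ACTIONS) - Q[(s, a)])
            s = s2

    # After training, the learned values point toward the goal from every state.
    print([max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(GOAL)])  # [1, 1, 1, 1]

No variable is labeled “important” in advance; repeated reward propagates backward through the value table until the right action is weighted most highly at every state, which is the sense in which the method learns by trial and error.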
We may have already reached the stage where most governments are powerless to resist the combined clout of a few powerful multinational companies that control us and our digital future. The fight between dominant companies today is really a fight for control over our data. They will use their enormous influence to prevent regulation of data, because their interests lie in unfettered control of it. Moreover, they have the
financial resources to hire the most talented workers in the field, enhancing their power even further. We have been giving away valuable data for the sake of freebies like Gmail and Facebook, but as the journalist and author John Lanchester has pointed out in the London Review of Books, if it is free, then you are the product. Their real customers are the ones who pay them for access to knowledge about us, so that they can persuade us to buy their products or otherwise influence us. One way around the monopolistic control of data is to split the ownership of data away from firms that use them. Individuals would instead own and control access to their personal data (a model that would encourage competition, since people would be free to move their data to a company that offered better services). Finally, abuse of data is not limited to corporations: In totalitarian states, or even nominally democratic ones, governments know things about their citizens that Orwell could not have imagined. The use they make of this information may not always be transparent or possible to counter.

The prospect of AI for military purposes is frightening. One can imagine intelligent systems designed to act autonomously based on real-time data, able to act faster than the enemy, starting catastrophic wars. Such wars may not necessarily be conventional or even nuclear wars. Given how essential computer networks are to modern society, it is much more likely that AI wars will be fought in cyberspace. The consequences could be just as dire.

Despite this loss of control, we continue to march inexorably into a world in which AI will be everywhere: Individuals won’t be able to resist its convenience and power, and corporations and governments won’t be able to resist its competitive advantages. But important questions arise about the future of work. Computers have been responsible for considerable losses in blue-collar jobs in the last few decades, but until recently many white-collar jobs—jobs that “only humans can do”—were thought to be safe. Suddenly that no longer appears to be true. Accountants, many legal and medical professionals, financial analysts and stockbrokers, travel agents—in fact, a large fraction of white-collar jobs—will disappear as a result of sophisticated machine-learning programs. We face a future in which factories churn out goods with very few employees and the movement of goods is largely automated, as are many services. What’s left for humans to do?

In 1930—long before the advent of computers, let alone AI—John Maynard Keynes wrote, in an essay called “Economic Possibilities for our Grandchildren,” that as a result of improvements in productivity, society could produce all its needs with a fifteen-hour work week. He also predicted, along with the growth of creative leisure, the end of money and wealth as a goal:

We shall be able to afford to dare to assess the money-motive at its true value. The love of money as a possession—as distinguished from the love of money as a means to the enjoyments and realities of life—will be recognised for what it is, a somewhat disgusting morbidity, one of those semi-criminal, semi-pathological propensities which one hands over with a shudder to the specialists in mental disease.
Sadly, Keynes’s predictions did not come true. Although productivity did indeed increase, the system—possibly inherent in a market economy—did not result in humans working much shorter hours. Rather, what happened is what the anthropologist and anarchist David Graeber describes as the growth of “bullshit jobs.”[34] While jobs that produce essentials like food, shelter, and goods have been largely automated away, we have seen an enormous expansion of sectors like corporate law, academic and health administration (as opposed to actual teaching, research, and the practice of medicine), “human resources,” and public relations, not to mention new industries like financial services and telemarketing and ancillary industries in the so-called gig economy which serve those who are too busy doing all that additional work.

How will societies cope with technology’s increasingly rapid destruction of entire professions, which throws large numbers of people out of work? Some argue that this concern is based on a false premise, because new jobs spring up that didn’t exist before, but as Graeber points out, these new jobs won’t necessarily be rewarding or fulfilling. During the first Industrial Revolution, it took almost a century before most people were better off. That revolution was possible only because the government of the time ruthlessly favored property rights over labor, and most people (and all women) did not have the vote. In today’s democratic societies, it is not clear that the population will tolerate such a dramatic upheaval of society based on the promise that “eventually” things will get better. Even that rosy vision will depend on a radical shake-up of education and lifelong learning. The Industrial Revolution did trigger enormous social change of this kind, including a shift to universal education. But it will not happen unless we make it happen: This is essentially about power, agency, and control. What’s next for, say, the forty-year-old taxi driver or truck driver in an era of autonomous vehicles?

One idea that has been touted is that of a universal basic income, which will allow citizens to pursue their interests, retrain for new occupations, and generally be free to live a decent life. However, market economies, which are predicated on growing consumer demand over all else, may not tolerate this innovation. There is also a feeling among many that meaningful work is essential to human dignity and fulfillment. So another possibility is that the enormous wealth generated by increased productivity due to automation could be redistributed to jobs requiring human labor and creativity in fields such as the arts, music, social work, and other worthwhile pursuits. Ultimately, which jobs are rewarding or productive and which are “bullshit” is a matter of judgment and may vary from society to society, as well as over time.

So far, I’ve focused on AI’s practical consequences. As a scientist, what bothers me is our potential loss of understanding. We are now accumulating data at an incredible rate. In my own lab, an experiment generates over a terabyte of data a day. These data are massaged, analyzed, and reduced until there is an interpretable result. But in all of this data analysis, we believe we know what’s happening. We know what the programs are
doing because we designed the algorithms at their heart. So when our computers generate a result, we feel that we intellectually grasp it. The new machine-learning programs are different. Having recognized patterns via deep neural networks, they come up with conclusions, and we have no idea exactly how. When they uncover relationships, we don’t understand them in the same way as if we had deduced those relationships ourselves using an underlying theoretical framework. As data sets become larger, we won’t be able to analyze them ourselves even with the help of computers; rather, we will rely entirely on computers to do the analysis for us. So if someone asks us how we know something, we will simply say it is because the machine analyzed the data and produced the conclusion. One day a computer may well come up with an entirely new result—e.g., a mathematical theorem whose proof, or even whose statement, no human can understand. That is philosophically different from the way we have been doing science. Or at least the way we thought we had been doing it; some might argue that we don’t know how our own brains reach conclusions either, and that these new methods are a way of mimicking learning by the human brain. Nevertheless, I find this potential loss of understanding disturbing.

Despite the remarkable advances in computing, the hype about AGI—a general-intelligence machine that will think like a human and possibly develop consciousness—smacks of science fiction to me, partly because we don’t understand the brain at that level of detail. Not only do we not understand what consciousness is, we don’t even understand a relatively simple problem like how we remember a phone number. In just that one question, there are all sorts of things to consider. How do we know it is a number? How do we associate it with a person, a name, a face, and other characteristics? Even such seemingly trivial questions involve everything from high-level cognition and memory to how a cell stores information and how neurons interact. Moreover, that’s just one task among many that the brain does effortlessly.

Whereas machines will no doubt do ever more amazing things, they’re unlikely to be a replacement for human thought and human creativity and vision. Eric Schmidt, former chairman of Google’s parent company, said in a recent interview at the London Science Museum that even designing a robot that would clear the table, wash the dishes, and put them away was a huge challenge. The calculations involved in figuring out all the movements the body has to make to throw a ball accurately or do slalom skiing are prodigious. The brain can do all these and also do mathematics and music, and invent games like chess and Go, not just play them. We tend to underestimate the complexity and creativity of the human brain and how amazingly general it is.

If AI is to become more humanlike in its abilities, the machine-learning and neuroscience communities need to interact closely, something that is happening already. Some of today’s greatest exponents of machine learning—such as Geoffrey Hinton, Zoubin Ghahramani, and Demis Hassabis—have backgrounds in cognitive neuroscience, and their success has been at least in part due to attempts to model brainlike behavior in their algorithms. At the same time, neurobiology has also flourished. All sorts of tools have been developed to watch which neurons are firing, to manipulate them genetically, and to see what’s happening in real time as inputs arrive.
Several countries have launched moon-shot neuroscience initiatives to see if we can crack the workings of the brain. Advances in AI and neuroscience seem to go hand in hand; each field can propel the other.
Many evolutionary scientists, and such philosophers as Daniel Dennett, have pointed out that the human brain is the result of billions of years of evolution.[35] Human intelligence is not the special characteristic we think it is, but just another survival mechanism not unlike our digestive or immune systems, both of which are also amazingly complex. Intelligence evolved because it allowed us to make sense of the world around us, to plan ahead, and thus cope with all sorts of unexpected things in order to survive. However, as Descartes stated, we humans define our very existence by our ability to think. So it is not surprising that, in an anthropomorphic way, our fears about AI reflect this belief that our intelligence is what makes us special.

But if we step back and look at life on Earth, we see that we are far from the most resilient species. If we’re going to be taken over at some point, it will be by some of Earth’s oldest life-forms, like bacteria, which can live anywhere from Antarctica to deep-sea thermal vents hotter than boiling water, or in acid environments that would melt you and me. So when people ask where we’re headed, we need to put the question in a broader context. I don’t know what sort of future AI will bring: whether AI will make humans subservient or obsolete or will be a useful and welcome enhancement of our abilities that will enrich our lives. But I am reasonably certain that computers will never be the overlords of bacteria.

[34] https://strikemag.org/bullshit-jobs/
[35] See, for example, Dennett’s From Bacteria to Bach and Back: The Evolution of Minds (New York: W. W. Norton, 2017).
Alex “Sandy” Pentland, an exponent of what he has termed “social physics,” is interested in building powerful human-AI ecologies. He is concerned at the same time about the potential dangers of decision-making systems in which the data in effect take over and human creativity is relegated to the background. The advent of Big Data, he believes, has given us the opportunity to reinvent our civilization: “We can now begin to actually look at the details of social interaction and how those play out, and we’re no longer limited to averages like market indices or election results. This is an astounding change. The ability to see the details of the market, of political revolutions, and to be able to predict and control them is definitely a case of Promethean fire—it could be used for good or for ill. Big Data brings us to interesting times.”

At our group meeting in Washington, Connecticut, he confessed that reading Norbert Wiener on the concept of feedback “felt like reading my own thoughts.” “After Wiener, people discovered or focused on the fact that there are genuinely chaotic systems that are just not predictable,” he said, “but if you look at human socioeconomic systems, there is a large percentage of variance you can account for and predict. ... Today there is data from all sorts of digital devices, and from all of our transactions. The fact that everything is datafied means you can measure things in real time in most aspects of human life—and increasingly in every aspect of human life. The fact that we have interesting computers and machine-learning techniques means that you can build predictive models of human systems in ways you could never do before.”
THE HUMAN STRATEGY

Alex “Sandy” Pentland

Alex “Sandy” Pentland is Toshiba Professor and professor of media arts and sciences, MIT; director of the Human Dynamics and Connection Science labs and the Media Lab Entrepreneurship Program; and the author of Social Physics.

In the last half-century, the idea of AI and intelligent robots has dominated thinking about the relationship between humans and computers. In part, this is because it’s easy to tell stories about AI and robots, and in part because of early successes (e.g., theorem provers that reproduced most of Whitehead and Russell’s Principia Mathematica) and massive military funding. The earlier and broader vision of cybernetics, which considered the artificial as part of larger systems of feedback and mutual influence, faded from public awareness.

However, in the intervening years the cybernetics vision has slowly grown and quietly taken over—to the point where it is “in the air.” State-of-the-art research in most engineering disciplines is now framed as feedback systems that are dynamic and driven by energy flows. Even AI is being recast as human/machine “advisor” systems, and the military is beginning large-scale funding in this area—something that should perhaps worry us more than drones and independent humanoid robots.

But as science and engineering have adopted a more cybernetics-like stance, it has become clear that even the vision of cybernetics is far too small. It was originally centered on the embeddedness of the individual actor but not on the emergent properties of a network of actors. This is unsurprising, because the mathematics of networks did not exist until recently, so a quantitative science of how networks behave was impossible. We now know that study of the individual does not produce understanding of the system except in certain simple cases. Recent progress in this area was foreshadowed by the understanding that “chaos,” and later “complexity,” were the typical behavior of systems, but we can now go far beyond these statistical understandings. We’re beginning to be able to analyze, predict, and even design the emergent behavior of complex heterogeneous networks. The cybernetics view of the connected individual actor can now be expanded to cover complex systems of connected individuals and machines, and the insights we obtain from this broader view are fundamentally different from those obtained from the cybernetics view.

Thinking about the network is analogous to thinking about entire ecosystems. How would you guide ecosystems to grow in a good direction? What do you even mean by “a good direction”? Questions like this are beyond the boundary of traditional cybernetic thinking. Perhaps the most stunning realization is that humans are already beginning to use AI and machine learning to guide entire ecosystems, including ecosystems of people, thus creating human-AI ecologies. Now that everything is becoming “datafied,” we can measure most aspects of human life and, increasingly, aspects of all life. This, together with new, powerful machine-learning techniques, means that we can build models of these ecologies in ways we couldn’t before. Well-known examples are weather- and traffic-prediction models, which are being extended to predict the global climate and plan city growth and renewal. AI-aided engineering of the ecologies is already here.
Development of human-AI ecosystems is perhaps inevitable for a social species such as ourselves. We became social early in our evolution, millions of years ago. We began exchanging information with one another to stay alive, to increase our fitness. We developed writing to share abstract and complex ideas, and most recently we’ve developed computers to enhance our communication abilities. Now we’re developing AI and machine-learning models of ecosystems and sharing the predictions of those models to jointly shape our world through new laws and international agreements.

We live in an unprecedented historic moment, in which the availability of vast amounts of human behavioral data and advances in machine learning enable us to tackle complex social problems through algorithmic decision making. The opportunities for such a human-AI ecology to have positive social impact through fairer and more transparent decisions are obvious. But there are also risks of a “tyranny of algorithms,” where unelected data experts are running the world. The choices we make now are perhaps even more momentous than those we faced in the 1950s, when AI and cybernetics were created. The issues look similar, but they’re not. We have moved down the road, and now the scope is larger. It’s not just AI robots versus individuals. It’s AI guiding entire ecologies.

How can we make a good human-artificial ecosystem, something that’s not a machine society but a cyberculture in which we can all live as humans—a culture with a human feel to it? We don’t want to think small—for example, to talk only of robots and self-driving cars. We want this to be a global ecology. Think Skynet-size. But how would you make Skynet something that’s about the human fabric?

The first thing to ask is: What’s the magic that makes the current AI work? Where is it wrong and where is it right? The good magic is that it has something called the credit-assignment function. What that lets you do is take “stupid neurons”—little linear functions—and figure out, in a big network, which ones are doing the work and strengthen them. It’s a way of taking a random bunch of switches all hooked together in a network and making them smart by giving them feedback about what works and what doesn’t. This sounds simple, but there’s some complicated math around it. That’s the magic that makes current AI work.

The bad part of it is, because those little neurons are stupid, the things they learn don’t generalize very well. If an AI sees something it hasn’t seen before, or if the world changes a little bit, the AI is likely to make a horrible mistake. It has absolutely no sense of context. In some ways, it’s as far from Norbert Wiener’s original notion of cybernetics as you can get, because it isn’t contextualized; it’s a little idiot savant.

But imagine that you took away those limitations: Imagine that instead of using dumb neurons, you used neurons in which real-world knowledge was embedded. Maybe instead of linear neurons, you used neurons that were functions in physics, and then you tried to fit physics data. Or maybe you put in a lot of knowledge about humans and how they interact with one another—the statistics and characteristics of humans. When you add this background knowledge and surround it with a good credit-assignment function, then you can take observational data and use the credit-assignment function to reinforce the functions that are producing good answers. The result is an AI that works extremely well and can generalize.
For instance, in solving physical
problems, it often takes only a couple of noisy data points to get something that’s a beautiful description of a phenomenon, because you’re putting in knowledge about how physics works. That’s in huge contrast to normal AI, which requires millions of training examples and is very sensitive to noise. By adding the appropriate background knowledge, you get much more intelligence.

Similar to the physical-systems case, if we make neurons that know a lot about how humans learn from each other, then we can detect human fads and predict human behavior trends in surprisingly accurate and efficient ways. This “social physics” works because human behavior is determined as much by the patterns of our culture as by rational, individual thinking. These patterns can be described mathematically and employed to make accurate predictions.

This idea of a credit-assignment function reinforcing connections between neurons that are doing the best work is the core of current AI. If you make those little neurons smarter, the AI gets smarter. So, what would happen if we replaced the neurons with people? People have lots of capabilities. They know lots of things about the world; they can perceive things in a broadly competent, human way. What would happen if you had a network of people in which you could reinforce the connections that were helping and minimize the connections that weren’t?

That begins to sound like a society, or a company. We all live in a human social network. We’re reinforced for doing things that seem to help everybody and discouraged from doing things that are not appreciated. Culture is the result of this sort of human AI as applied to human problems; it is the process of building social structures by reinforcing the good connections and penalizing the bad.

Once you’ve realized you can take this general AI framework and create a human AI, the question becomes, What’s the right way to do that? Is it a safe idea? Is it completely crazy? My students and I are looking at how people make decisions, on huge databases of financial decisions, business decisions, and many other sorts of decisions. What we’ve found is that humans often make decisions in a way that mimics AI credit-assignment algorithms and works to make the community smarter.

A particularly interesting feature of this work is that it addresses a classic problem in evolution known as the group-selection problem. The core of this problem is: How can we select for culture in evolution, when it’s the individuals that reproduce? What you need is something that selects for the best cultures and the best groups but also selects for the best individuals, because they’re the units that transmit the genes. When you frame the question this way and go through the mathematical literature, you discover that there’s one generally best way to do this. It’s called “distributed Thompson sampling,” a mathematical algorithm for choosing, out of a set of possible actions with unknown payoffs, the action that maximizes the expected reward. The key is social sampling, a way of combining evidence, of exploring and exploiting at the same time. It has the unusual property of simultaneously being the best strategy both for the individual and for the group. If you use the group as the basis of selection, and then the group either gets wiped out or reinforced, you’re also selecting for successful individuals.
If you select for individuals, and each individual does what’s good for him or her, then that’s automatically the best thing for the group. It’s an amazing alignment of interests and utilities, and it provides real insight into the question of how culture fits into natural selection.
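Since Pentland names the algorithm, a minimal sketch of (non-distributed) Thompson sampling may help; the three actions, their payoff probabilities, and the Beta prior here are invented for illustration:

    import random

    # Three actions with unknown payoff probabilities (the learner never sees these).
    true_payoff = [0.2, 0.5, 0.8]
    wins = [1, 1, 1]    # Beta prior pseudo-counts: one success...
    losses = [1, 1, 1]  # ...and one failure per action (a uniform prior)
    random.seed(0)

    for _ in range(2000):
        # Sample a plausible payoff for each action from its current Beta posterior,
        # then act on the best sample: exploring and exploiting at the same time.
        samples = [random.betavariate(wins[i], losses[i]) for i in range(3)]
        i = samples.index(max(samples))
        if random.random() < true_payoff[i]:
            wins[i] += 1    # credit the action that worked...
        else:
            losses[i] += 1  # ...and debit the one that didn't

    print(wins, losses)  # trials concentrate overwhelmingly on the best action (index 2)

In the distributed variant Pentland refers to, each individual would update counts like these not only from its own outcomes but also from the observed outcomes of similar others, which is the social-sampling step described next.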
Social sampling, very simply, is looking around you at the actions of people who are like you, finding what’s popular, and then copying it if it seems like a good idea to you. Idea propagation has this popularity function driving it, but individual adoption also is about figuring out how the idea works for the individual—a reflective attitude. When you combine social sampling and personal judgment, you get superior decision making. That’s amazing, because now we have a mathematical recipe for doing with humans what all those AI techniques are doing with dumb computer neurons. We have a way of putting people together to make better decisions, given more and more experience.

So, what happens in the real world? Why don’t we do this all the time? Well, people are good at it, but there are ways it can run amok. One of these is through advertising, propaganda, or “fake news.” There are many ways to get people to think something is popular when it’s not, and this destroys the usefulness of social sampling. The way you can make groups of people smarter, the way you can make human AI, will work only if you can get feedback to them that’s truthful. It must be grounded on whether each person’s actions worked for them or not. That’s the key to AI mechanisms, too. What they do is analyze whether they performed correctly. If so, plus one; if not, minus one. We need that truthful feedback to make this human mechanism work well, and we need good ways of knowing about what other people are doing so that we can correctly assess popularity and the likelihood of this being a good choice.

The next step is to build this credit-assignment function, this feedback function, for people, so that we can make a good human-artificial ecosystem—a smart organization and a smart culture. In a way, we need to duplicate some of the early insights that resulted in, for instance, the U.S. census—trying to find basic facts that everybody can agree on and understand so that the transmission of knowledge and culture can happen in a way that’s truthful and social sampling can function efficiently.

We can address the problem of building an accurate credit-assignment function in many different settings. In companies, for instance, it can be done with digital ID badges that reveal who’s connected to whom, so that we can assess the pattern of connections in relation to the company’s results on a daily or weekly basis. The credit-assignment function asks whether those connections helped solve problems, or helped invent new solutions, and reinforces the helpful connections. When you can get that feedback quantitatively—which is difficult, because most things aren’t measured quantitatively—both the productivity and the innovation rate within the organization can be significantly improved. This is, for instance, the basis of Toyota’s “continuous improvement” method.

A next step is to try to do the same thing but at scale, something I refer to as building a trust network for data. It can be thought of as a distributed system like the Internet, but with the ability to quantitatively measure and communicate the qualities of human society, in the same way that the U.S. census does a pretty good job of telling us about population and life expectancy. We are already deploying prototype examples of trust networks at scale in several countries, based on the data and measurement standards laid out in the U.N. Sustainable Development Goals. On the horizon is a vision of how we can make humanity more intelligent by building a human AI.
It’s a vision composed of two threads. One is data that we can all trust—data that have been vetted by a broad community, data where the algorithms are known and monitored, much like the census data we all automatically rely on as at least
approximately correct. The other is a fair, data-driven assessment of public norms, policy, and government, based on trusted data about current conditions. This second thread depends on availability of trusted data and so is just beginning to be developed. Trusted data and data-driven assessment of norms, policy, and government together create a credit-assignment function that improves societies’ overall fitness and intelligence.

It is precisely at the point of creating greater societal intelligence that fake news, propaganda, and advertising all get in the way. Fortunately, trust networks give us a path forward to building a society more resistant to echo-chamber problems, these fads, these exercises in madness. We have begun to develop a new way of establishing social measurements, in aid of curing some of the ills we see in society today. We’re using open data from all sources, encouraging a fair representation of the things people are choosing, in a curated mathematical framework that can stamp out the echoes and the attempts to manipulate us.

On Polarization and Inequality

Extreme polarization and segregation by income are almost everywhere in the world today and threaten to tear governments and civil society apart. Increasingly, the media are becoming adrenaline pushers driven by advertising clicks and failing to deliver balanced facts and reasoned discourse—and the degradation of media is causing people to lose their bearings. They don’t know what to believe, and thus they can easily be manipulated. There is a real need to ground our various cultures in trustworthy, data-driven standards that we all agree on, and to be able to know what behaviors and policies work and which don’t.

In converting to a digital society, we’ve lost touch with traditional notions of truth and justice. Justice used to be mostly informal and normative. We’ve now formalized it. At the same time, we’ve put it out of reach for most people. Our legal systems are failing us in a way they didn’t before, precisely because they’re now more formal, more digital, less embedded in society.

Ideas about justice are very different around the world. One of the core differentiators is this: Do you or your parents remember when the bad guys came with guns and took everything? If you do, your attitude about justice is different from that of the average reader of this essay. Do you come from the upper classes? Or were you somebody who saw the sewers from the inside? Your view of justice depends on your history.

A common test I have for U.S. citizens is this: Do you know anybody who owns a pickup truck? It’s the number-one-selling vehicle in the United States, and if you don’t know people like that, you’re out of touch with more than 50 percent of Americans. Physical segregation drives conceptual segregation. Most of America thinks of justice and access and fairness in terms very different from those of the typical, say, Manhattanite. If you look at patterns of mobility—where people go—in a typical city, you find that the people in the top quintile (white-collar working families) and bottom quintile (people who are sometimes on unemployment or welfare) almost never talk to each other. They don’t go to the same places; they don’t talk about the same things. They all live in
the same city, nominally, but it’s as if it were two completely different cities—and this is perhaps the most important cause of today’s plague of polarization. On Extreme Wealth Some two hundred of the world’s wealthiest people have pledged to give away more than 50 percent of their wealth either during their lifetimes or in their wills, creating a plurality of voices in the foundation space.*° Bill Gates is probably the most familiar example. He’s decided that if the government won’t do it, he’ll do it. You want mosquito nets? He’ll do it. You want antivirals? He’ll doit. We’re getting different stakeholders to take action in the form of foundations dedicated to public good, and they have different versions of what they consider the public good. This diversity of goals has created a lot of what’s wonderful about the world today. Actions from outside government by organizations like the Ford Foundation and the Sloan Foundation, who bet on things that nobody else would bet on, have changed the world for the better. Sure, these billionaires are human, with human foibles, and all is not necessarily as it should be. On the other hand, the same situation obtained when the railways were first built. Some people made huge fortunes. A lot of people went bust. We, the average people, got railways out of it. That’s good. Same thing with electric power; same thing with many new technologies. There’s a churning process that throws somebody up and later casts them or their heirs down. Bubbles of extreme wealth were a feature of the late 1800s and early 1900s when steam engines and railways and electric lights were invented. The fortunes they created were all gone within two or three generations. If the U.S. were like Europe, I would worry. What you find in Europe is that the same families have held on to wealth for hundreds of years, so they’re entrenched not just in terms of wealth but of the political system and in other ways. But so far, the U.S. has avoided this kind of hereditary class system. Extreme wealth hasn’t stuck, which is good. It shouldn’t stick. If you win the lottery, you get your billion dollars, but your grandkids ought to work for a living. On AI and Society People are scared about AI. Perhaps they should be. But they need to realize that AI feeds on data. Without data, AI is nothing. You don’t have to watch the AI; instead you should watch what it eats and what it does. The trust-network framework we’ ve set up, with the help of nations in the E.U. and elsewhere, is one where we can have our algorithms, we can have our AI, but we get to see what went in and what went out, so that we can ask, Is this a discriminatory decision? Is this the sort of thing that we want as humans? Or is this something that’s a little weird? The most revealing analogy is that regulators, bureaucracies, and parts of the government are very much like Als: They take in the rules that we call law and regulation, and they add government data, and they make decisions that affect our lives. The part that’s bad about the current system is that we have very little oversight of these departments, regulators, and bureaucracies. The only control we have is the vote—the opportunity to elect somebody different. We need to make oversight of bureaucracies a lot more fine-grained. We need to record the data that went into every single decision 36 https://givingpledge.org/About.aspx. 140 HOUSE_OVERSIGHT_016943
and have the results analyzed by the various stakeholders—trather like elected legislatures were originally intended to do. If we have the data that go into and out of each decision, we can easily ask, Is this a fair algorithm? Is this AI doing things that we as humans believe are ethical? This human-in-the-loop approach is called “open algorithms;” you get to see what the Als take as input and what they decide using that input. If you see those two things, you’ll know whether they’ re doing the right thing or the wrong thing. It turns out that’s not hard to do. If you control the data, then you control the AI. One thing people often fail to mention is that all the worries about AI are the same as the worries about today’s government. For most parts of the government—the justice system, et cetera—there’s no reliable data about what they’re doing and in what situation. How can you know whether the courts are fair or not if you don’t know the inputs and the outputs? The same problem arises with AI systems and is addressable in the same way. We need trusted data to hold current government to account in terms of what they take in and what they put out, and AI should be no different. Next-Generation AI Current AI machine-learning algorithms are, at their core, dead simple stupid. They work, but they work by brute force, so they need hundreds of millions of samples. They work because you can approximate anything with lots of little simple pieces. That’s a key insight of current AI research—that if you use reinforcement learning for credit- assignment feedback, you can get those little pieces to approximate whatever arbitrary function you want. But using the wrong functions to make decisions means the AI’s ability to make good decisions won’t generalize. If we give the AI new, different inputs, it may make completely unreasonable decisions. Or if the situation changes, then you need to retrain it. There are amusing techniques to find the “null space” in these AI systems. These are inputs that the AI thinks are valid examples of what it was trained to recognize (e.g., faces, cats, etc.), but to a human they’re crazy examples. Current AI is doing descriptive statistics in a way that’s not science and would be almost impossible to make into science. To build robust systems, we need to know the science behind data. The systems I view as next-generation Als result from this science- based approach: If you’re going to create an AI to deal with something physical, then you should build the laws of physics into it as your descriptive functions, in place of those stupid little neurons. For instance, we know that physics uses functions like polynomials, sine waves, and exponentials, so those should be your basis functions and not little linear neurons. By using those more appropriate basis functions, you need a lot less data, you can deal with a lot more noise, and you get much better results. As in the physics example, if we want to build an AI to work with human behavior, then we need to build the statistical properties of human networks into machine-learning algorithms. When you replace the stupid neurons with ones that capture the basics of human behavior, then you can identify trends with very little data, and you can deal with huge levels of noise. 
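To make the basis-function point concrete, here is a minimal sketch in Python. It is not a description of any real system mentioned in the essay, just an illustration under invented assumptions: twenty noisy samples of a damped oscillation are fit once with physics-style basis functions (a sine and cosine with a known decay rate) and once with a generic polynomial basis, and the two fits are compared on how well they extrapolate.

```python
import numpy as np

# Toy data: a damped oscillation, observed at only 20 noisy points.
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0, 10, 20))
y = 2.0 * np.sin(1.5 * t) * np.exp(-0.2 * t) + rng.normal(0, 0.1, t.size)

# Physics-informed basis: sines and exponentials, the kinds of functions
# physical laws are written in. (Simplification: frequency and decay rate
# are assumed known here; in practice they too would have to be fit.)
def physics_features(t):
    return np.column_stack([np.sin(1.5 * t) * np.exp(-0.2 * t),
                            np.cos(1.5 * t) * np.exp(-0.2 * t),
                            np.ones_like(t)])

# Generic basis: a stand-in for the "stupid little neurons" -- a raw polynomial.
def generic_features(t):
    return np.column_stack([t**k for k in range(8)])

for name, phi in [("physics", physics_features), ("generic", generic_features)]:
    w, *_ = np.linalg.lstsq(phi(t), y, rcond=None)   # least-squares fit
    t_new = np.linspace(0, 12, 50)                   # includes extrapolation region
    y_true = 2.0 * np.sin(1.5 * t_new) * np.exp(-0.2 * t_new)
    err = np.sqrt(np.mean((phi(t_new) @ w - y_true) ** 2))
    print(f"{name} basis: extrapolation RMSE = {err:.3f}")
```

With so few samples, the polynomial fit tends to blow up outside the training interval, while the physics basis stays close to the truth, which is the essay’s point about appropriate basis functions needing far less data and tolerating far more noise.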
The fact that humans have a “commonsense” understanding that they bring to most problems suggests what I call the human strategy: Human society is a network just like the neural nets trained for deep learning, but the “neurons” in human society are a lot smarter. You and I have surprisingly general descriptive powers that we use for understanding a wide range of situations, and we can recognize which connections should be reinforced. That means we can shape our social networks to work much better and potentially beat all that machine-based AI at its own game.
“URGENT! URGENT!” the cc’d copy of an email screamed, one of a dozen emails that greeted me as I turned on my phone at the baggage carousel at Malpensa Airport after the long flight from JFK. “The great American visionary thinker John Brockman arrives this morning at Grand Hotel Milan. You MUST, repeat MUST pay him a visit.” It was signed HUO.

The prior evening, waiting in the lounge at JFK, I had had the bright idea to write my friend and longtime collaborator, the London-based, peripatetic art curator Hans Ulrich Obrist (known to all as HUO), and ask if there was anyone in Milan I should know. Once I was settled at the hotel, the phone began ringing and a procession of leading Italian artists, designers, and architects called to request a meeting, including Enzo Mari, the modernist artist and furniture designer; Alberto Garutti, whose aesthetic strategies have inspired a dialogue between contemporary art, spectator, and public space; and fashion designer Miuccia Prada, who “requests your presence for tea this afternoon at Prada headquarters.” And thus, thanks to HUO, did the jet-lagged “great American visionary thinker” stumble and mumble his way through his first day in Milan, November 2011.

HUO is sui generis: He lives a twenty-four-hour day, sleeping (I guess) whenever, and employing full-time assistants who work eight-hour shifts and are available to him 24/7. Over a recent two-year period, he visited art venues in either China or India for forty weekends each year—departing London Thursday evening, back at his desk on Monday. Last year, once again, ArtReview ranked him #1 on their annual “Power 100” list.

Recently we collaborated on a panel during the “GUEST, GHOST, HOST: MACHINE!” Serpentine event that took place at London’s new City Hall. We were joined by Venki Ramakrishnan, Jaan Tallinn, and Andrew Blake, research director of The Alan Turing Institute. The event was consistent with HUO’s mission of bringing together art and science: “The curator is no longer understood simply as the person who fills a space with objects,” he says, “but also as the person who brings different cultural spheres into contact, invents new display features, and makes junctions that allow unexpected encounters and results.”
MAKING THE INVISIBLE VISIBLE: ART MEETS AI

Hans Ulrich Obrist

Hans Ulrich Obrist is artistic director of the Serpentine Gallery, London, and the author of Ways of Curating and Lives of the Artists, Lives of the Architects.

In the Introduction to the second edition of his book Understanding Media, Marshall McLuhan noted the ability of art to “anticipate future social and technological developments.” Art is “an early alarm system,” pointing us to new developments in times ahead and allowing us “to prepare to cope with them. . . . Art as a radar environment takes on the function of indispensable perceptual training. . . .” In 1964, when McLuhan’s book was first published, the artist Nam June Paik was just building his Robot K-456 to experiment with the technologies that subsequently would start to influence society. He had worked with television earlier, challenging its usual passive consumption by the viewer, and later made art with global live-satellite broadcasts, using the new media less for entertainment than to point us to their poetic and intercultural capacities (which are still mostly unused today). The Paiks of our time, of course, are now working with the Internet, digital images, and artificial intelligence. Their works and thoughts, again, are an early alarm system for the developments ahead of us.

As a curator, my daily work is to bring together different works of art and connect different cultures. Since the early 1990s, I have also been organizing conversations and meetings with practitioners from different disciplines, in order to go beyond the general reluctance to pool knowledge. Since I was interested in hearing what artists have to say about artificial intelligence, I recently organized several conversations between artists and engineers. The reason to look closely at AI is that two of the most important questions of today are “How capable will AI become?” and “What dangers may arise from it?” Its early applications already influence our everyday lives in ways that are more or less recognizable. There is an increasing impact on many aspects of our society, but whether this might be, in general, beneficial or malign is still uncertain. Many contemporary artists are following these developments closely. They are articulating various doubts about the promises of AI and reminding us not to associate the term “artificial intelligence” solely with positive outcomes. To the current discussions of AI, the artists contribute their specific perspectives and, notably, their focus on questions of image making, creativity, and the use of programming as artistic tools.

The deep connections between science and art had already been noted by the late Heinz von Foerster, one of the architects of cybernetics, who worked with Norbert Wiener from the mid-1940s and in the 1960s founded the field of second-order cybernetics, in which the observer is understood as part of the system itself and not an external entity. I knew von Foerster well, and in one of our many conversations, he offered his views on the relation between art and science:

I’ve always perceived art and science as complementary fields. One shouldn’t forget that a scientist is in some respects also an artist. He invents a new technique and he describes it. He uses language like a poet, or the author of a detective novel, and describes his findings. In my view, a scientist must work in an artistic way if he wants to communicate his research. He obviously wants to communicate and talk to others. A scientist invents new objects, and the question is how to describe them. In all of these aspects, science is not very different from art.

When I asked him how he defined cybernetics, von Foerster answered:

The substance of what we have learned from cybernetics is to think in circles: A leads to B, B to C, but C can return to A. Such kinds of arguments are not linear but circular. The significant contribution of cybernetics to our thinking is to accept circular arguments. This means that we have to look at circular processes and understand under which circumstances an equilibrium, and thus a stable structure, emerges.

Today, when AI algorithms are applied in daily tasks, one can ask how the human factor is included in these kinds of processes and what role creativity and art could play in relation to them. There are thus different levels to think about when exploring the relation between AI and art. So, what do contemporary artists have to say about artificial intelligence?

Artificial Stupidity

Hito Steyerl, an artist who works with documentary and experimental film, considers two key aspects that we should keep in mind when reflecting on the implications of AI for society. First, the expectations for so-called artificial intelligence, she says, are often overrated, and the noun “intelligence” is misleading; to counter that, she uses the term “artificial stupidity.” Second, she points out that programmers are now making invisible software algorithms visible through images, but to understand and interpret these images better, we should apply the expertise of artists. Steyerl has worked with computer technology for many years, and her recent artworks have explored surveillance techniques, robots, and computer games, as in How Not to Be Seen (2013), on digital-image technologies, or HellYeahWeFuckDie (2017), about the training of robots in the still-difficult task of keeping balance. But to explain her notion of artificial stupidity, Steyerl refers to a more general phenomenon, like the now widespread use of Twitter bots, noting in our conversation:

It was and still is a very popular tool in elections to deploy Twitter armies to sway public opinion and deflect popular hashtags and so on. This is an artificial intelligence of a very, very low grade. It’s two or maybe three lines of script. It’s nothing very sophisticated at all. Yet the social implications of this kind of artificial stupidity, as I call it, are already monumental in global politics.

As has been widely noted, this kind of technology was seen in the many automated Twitter posts before the 2016 U.S. presidential election and also shortly before the Brexit vote. If even low-grade AI technology like these bots is already influencing our politics, this raises another urgent question: “How powerful will far more advanced techniques be in the future?”
Visible / Invisible

The artist Paul Klee often talked about art as “making the invisible visible.” In computer technology, most algorithms work invisibly, in the background; they remain inaccessible in the systems we use daily. But lately there has been an interesting comeback of visuality in machine learning. The ways that the deep-learning algorithms of AI process data have been made visible through applications like Google’s DeepDream, in which the process of computerized pattern recognition is visualized in real time. The application shows how the algorithm tries to match animal forms with any given input. There are many other AI visualization programs that, in their way, also “make the invisible visible.” The difficulty in the general public perception of such images is, in Steyerl’s view, that these visual patterns are viewed uncritically as realistic and objective representations of the machine process. She says of the aesthetics of such visualizations:

For me, this proves that science has become a subgenre of art history. . . . We now have lots of abstract computer patterns that might look like a Paul Klee painting, or a Mark Rothko, or all sorts of other abstractions that we know from art history. The only difference, I think, is that in current scientific thought they’re perceived as representations of reality, almost like documentary images, whereas in art history there’s a very nuanced understanding of different kinds of abstraction.

What she seeks is a more profound understanding of computer-generated images and the different aesthetic forms they use. They are obviously not generated with the explicit goal of following a certain aesthetic tradition. The computer engineer Mike Tyka, in a conversation with Steyerl, explained the functions of these images:

Deep-learning systems, especially the visual ones, are really inspired by the need to know what’s going on in the black box. Their goal is to project these processes back into the real world. Nevertheless, these images have aesthetic implications and values which have to be taken into account.

One could say that while the programmers use these images to help us better understand the programs’ algorithms, we need the knowledge of artists to better understand the aesthetic forms of AI. As Steyerl has pointed out, such visualizations are generally understood as “true” representations of processes, but we should pay attention to their respective aesthetics, and their implications, which have to be viewed in a critical and analytical way.

In 2017, the artist Trevor Paglen created a project to make these invisible AI algorithms visible. In Sight Machine, he filmed a live performance of the Kronos Quartet and processed the resulting images with various computer software programs used for face detection, object identification, and even missile guidance. He projected the outcome of these algorithms, in real time, back to screens above the stage. By demonstrating how the various programs interpreted the musicians’ performance, Paglen showed that AI algorithms are always determined by sets of values and interests which they then manifest and reiterate, and thus must be critically questioned. The significant contrast between algorithms and music also raises the issue of relationships between technical and human perception.
Computers, as a Tool for Creativity, Can’t Replace the Artist

Rachel Rose, a video artist who thinks about the questions posed by AI, employs computer technology in the creation of her works. Her films give the viewer an experience of materiality through the moving image. She uses collaging and layering of the material to manipulate sound and image, and the editing process is perhaps the most important aspect of her work. She also talks about the importance of decision making in her work. For her, the artistic process does not follow a rational pattern. In a conversation we had, together with the engineer Kenric McDowell, at the Google Cultural Institute, she explained this by citing a story from theater director Peter Brook’s 1968 book The Empty Space. When Brook designed the set for his production of The Tempest in the late 1960s, he started by making a Japanese garden, but then the design evolved, becoming a white box, a black box, a realistic set, and so on. And in the end, he returned to his original idea. Brook writes that he was shocked at having spent a month on his labors, only to end at the beginning. But this shows that the creative artistic process is a succession in which every step builds on the one before and which eventually comes to an unpredictable conclusion. The process is not a logical or rational succession but has mostly to do with the artist’s feelings in reaction to the preceding result. Rose said, of her own artistic decision making:

It, to me, is distinctively different from machine learning, because at each decision there’s this core feeling that comes from a human being, which has to do with empathy, which has to do with communication, which has to do with questions about our own mortality that only a human could ask.

This point underlines the fundamental difference between any human artistic production and so-called computer creativity. Rose sees AI more as a possible way to create better tools for humans:

A place I can imagine machine learning working for an artist would be not in developing an independent subjectivity, like writing a poem or making an image, but actually in filling in gaps that are to do with labor, like the way that Photoshop works with different tools that you can use.

And though such tools may not seem spectacular, she says, “they might have a larger influence on art,” because they provide artists with further possibilities in their creative work. McDowell added that he, too, believes there are false expectations around AI. “I’ve observed,” he said, “that there’s a sort of magical quality to the idea of a computer that does all the things that we do.” He continued: “There’s almost this kind of demonic mirror that we look into, and we want it to write a novel, we want it to make a film—we want to give that away somehow.” He is instead working on projects wherein humans collaborate with the machine. One of the current aims of AI research is to find new means of interaction between humans and software. And art, one could say, needs to play a key role in that enterprise, since it focuses on our subjectivity and on essential human aspects like empathy and mortality.
Cybernetics / Art

Suzanne Treister is an artist whose work from 2009 to 2011 serves as an example of what is happening at the intersection of our current technologies, the arts, and cybernetics. Treister has been a pioneer in digital art since the 1990s, inventing, for example, imaginary video games and painting screen shots from them. In her project Hexen 2.0 she looked back at the famous Macy conferences on cybernetics, which were organized in New York between 1946 and 1953 by engineers and social scientists to unite the sciences and to develop a universal theory of the workings of the mind. In her project, she created thirty photo-text works about the conference attendees (who included Wiener and von Foerster), she invented tarot cards, and she made a video based on a photomontage of a “cybernetic séance.” In the “séance,” the conference participants are seen sitting at a round table, as in spiritualist séances, while certain of their statements on cybernetics are heard in an audio-collage—rational knowledge and superstition combined. She also noted that some of the participating scientists worked for the military; thus the application of cybernetics could be seen in an ambivalent way, even back then, as a tussle between pure knowledge and its use in state control.

If one looks at Treister’s work about the Macy conference participants, one sees that no visual artist was included. A dialogue between artists and scientists would be fruitful in future discussions, and it is a bit astonishing that this wasn’t realized at the time, given von Foerster’s keen interest in art. He recounted in one of our conversations how his relation to the field dated back to his childhood:

I grew up as a child in an artistic family. We often had visits from poets, philosophers, painters, and sculptors. Art was a part of my life. Later, I got into physics, as I was talented in this subject. But I always remained conscious of the importance of art for science. There wasn’t a great difference for me. For me, both aspects of life have always been very much alike—and accessible, too. We should see them as one. An artist also has to reflect on his work. He has to think about his grammar and his language. A painter must know how to handle his colors. Just think of how intensively oil colors were researched during the Renaissance. They wanted to know how a certain pigment could be mixed with others to get a certain tone of red or blue. Chemists and painters collaborated very closely. I think the artificial division between science and art is wrong.

Though for von Foerster the relation between art and science was always clear, for our own time this connection remains to be made. There are many reasons to multiply the links. The critical thinking of artists would be beneficial in respect to the dangers of AI, since they draw our attention to questions they consider essential from their perspective. With the advent of machine learning, new tools are available to artists for their work. And as the algorithms of AI are made visible through artificial images in new ways, artists’ critical visual knowledge and expertise will be harnessed. Many of the key questions of AI are philosophical in nature and can be answered only from a holistic point of view. The way they play out among adventurous artists will be worth following.

Simulating Worlds

For the most part, the works of contemporary artists have been embodied ruminations on AI’s impact on existential questions of the self and our future interaction with nonhuman entities. Few, though, have taken the technologies and innovations of AI as the underlying materials of their work and sculpted them to their own vision. An exception is the artist Ian Cheng, who has gone as far as to construct entire worlds of artificial beings with varying degrees of sentience and intelligence. He refers to these worlds as Live Simulations. His Emissaries trilogy (2015-2017) is set in a fictional postapocalyptic world of flora and fauna, in which AI-driven animals and creatures explore the landscape and interact with each other. Cheng uses advanced graphics but has them programmed with a lot of glitches and imperfections, which imparts an atmosphere at once futuristic and anachronistic. Through his trilogy, which charts a history of consciousness, he asks the question “What is a simulation?”

While the majority of artistic works that utilize recent developments in AI draw specifically from the field of machine learning, Cheng’s Live Simulations take a separate route. The protagonists and plot lines that are interlaced in each episodic simulation of Emissaries use the complex logic systems and rules of AI. What is profound about his continually evolving scenes is that complexity arises not through the desires or actions of any single actor or artificial godhead but instead through their constellation, collision, and constant evolution in symbiosis with one another. This gives rise to unexpected outcomes and unending, unknowable situations—you can never experience the exact same moment in successive viewings of his work.

Cheng had a discussion at the Serpentine Marathon “GUEST, GHOST, HOST: MACHINE!” with the programmer Richard Evans, who recently designed Versu, an AI-based platform for interactive storytelling games. Evans’ work emphasizes the social interaction of the games’ characters, who react in a spectrum of possible behaviors to the choices made by the human players. In their conversation, Evans said that a starting point for the project was that most earlier simulation video games, such as The Sims, did not sufficiently take into account the importance of social practices. Simulated protagonists in games would often act in ways that did not correspond well with real human behavior. Knowledge of social practices limits the possibilities of action but is necessary to understand the meaning of our actions—which is what interests Cheng for his own simulations. The more parameters of actions in certain circumstances are determined in a computer simulation, the more interesting it is for Cheng to experiment with individual and specific changes. He told Evans, “I gather that if we had AI with more ability to respond to social contexts, tweaking one thing, you would get something quite artistic and beautiful.” Cheng also sees the work of programmers and AI simulations as creating new and sophisticated tools for experimenting with the parameters of our daily social practices. In this way, the involvement of artists in AI will lead to new kinds of open experiments in art. Such possibilities are—like increased AI capabilities in general—still in the future. Recognizing that this is an experimental technology in its infancy, very far from apocalyptic visions of a superintelligent AI takeover, Cheng fills his simulations with prosaic avatars such as strange microbial globules, dogs, and the undead.

Discussions like these, between artists and engineers, of course are not totally new.
In the 1960s, the engineer Billy Klüver brought artists together with engineers in a series of events, and in 1967 he founded the Experiments in Art and Technology program with Robert Rauschenberg and others. In London, at around the same time, Barbara Steveni and John Latham, of the Artist Placement Group, took things a step further by asserting that there should be artists in residence in every company and every government. Today, these inspiring historical models can be applied to the field of AI. As AI comes to inhabit more and more of our everyday lives, the creation of a space that is nondeterministic and non-utilitarian in its plurality of perspectives and diversity of understandings will undoubtedly be essential.
Alison Gopnik is an international leader in the field of children’s learning and development and was one of the founders of the field of “theory of mind.” She has spoken of the child brain as a “powerful learning computer,” perhaps from personal experience. Her own Philadelphia childhood was an exercise in intellectual development. “Other families took their kids to see The Sound of Music or Carousel; we saw Racine’s Phaedra and Samuel Beckett’s Endgame,” she has recalled. “Our family read Henry Fielding’s 18th-century novel Joseph Andrews out loud to each other around the fire on camping trips.” Lately she has invoked Bayesian models of machine learning to explain the remarkable ability of preschoolers to draw conclusions about the world around them without benefit of enormous data sets. “I think babies and children are actually more conscious than we are as adults,” she has said. “They’re very good at taking in lots of information from lots of different sources at once.” She has referred to babies and young children as “the research and development division of the human species.” Not that she treats them coldly, as if they were mere laboratory animals. They appear to revel in her company, and in the blinking, thrumming toys in her Berkeley lab. For years after her own children had outgrown it, she kept a playpen in her office. Her investigations into just how we learn, and the parallels to the deep-learning methods of AI, continue. “It turns out to be much easier to simulate the reasoning of a highly trained adult expert than to mimic the ordinary learning of every baby,” she says. “Computation is still the best—indeed, the only—scientific explanation we have of how a physical object like a brain can act intelligently. But, at least for now, we have almost no idea at all how the sort of creativity we see in children is possible.”
AIs VERSUS FOUR-YEAR-OLDS

Alison Gopnik

Alison Gopnik is a developmental psychologist at UC Berkeley; her books include The Philosophical Baby and, most recently, The Gardener and the Carpenter: What the New Science of Child Development Tells Us About the Relationship Between Parents and Children.

Everyone’s heard about the new advances in artificial intelligence, and especially machine learning. You’ve also heard utopian or apocalyptic predictions about what those advances mean. They have been taken to presage either immortality or the end of the world, and a lot has been written about both those possibilities. But the most sophisticated AIs are still far from being able to solve problems that human four-year-olds accomplish with ease. In spite of the impressive name, artificial intelligence largely consists of techniques to detect statistical patterns in large data sets. There is much more to human learning.

How can we possibly know so much about the world around us? We learn an enormous amount even when we are small children; four-year-olds already know about plants and animals and machines; desires, beliefs, and emotions; even dinosaurs and spaceships. Science has extended our knowledge about the world to the unimaginably large and the infinitesimally small, to the edge of the universe and the beginning of time. And we use that knowledge to make new classifications and predictions, imagine new possibilities, and make new things happen in the world. But all that reaches any of us from the world is a stream of photons hitting our retinas and disturbances of air at our eardrums. How do we learn so much about the world when the evidence we have is so limited? And how do we do all this with the few pounds of grey goo that sits behind our eyes?

The best answer so far is that our brains perform computations on the concrete, particular, messy data arriving at our senses, and those computations yield accurate representations of the world. The representations seem to be structured, abstract, and hierarchical; they include the perception of three-dimensional objects, the grammars that underlie language, and mental capacities like “theory of mind,” which lets us understand what other people think. Those representations allow us to make a wide range of new predictions and imagine many new possibilities in a distinctively creative human way. This kind of learning isn’t the only kind of intelligence, but it’s a particularly important one for human beings. And it’s the kind of intelligence that is a specialty of young children. Although children are dramatically bad at planning and decision making, they are the best learners in the universe. Much of the process of turning data into theories happens before we are five.

Since Aristotle and Plato, there have been two basic ways of addressing the problem of how we know what we know, and they are still the main approaches in machine learning. Aristotle approached the problem from the bottom up: Start with the senses—the stream of photons and air vibrations (or the pixels or sound samples of a digital image or recording)—and see if you can extract patterns from them. This approach was carried further by such classic associationists as the philosophers David Hume and J. S. Mill and later by behavioral psychologists like Pavlov and B. F. Skinner. On this view, the abstractness and hierarchical structure of representations are something of an illusion, or at least an epiphenomenon. All the work can be done by association and pattern detection—especially if there are enough data.

Over time, there has been a seesaw between this bottom-up approach to the mystery of learning and Plato’s alternative, top-down one. Maybe we get abstract knowledge from concrete data because we already know a lot, and especially because we already have an array of basic abstract concepts, thanks to evolution. Like scientists, we can use those concepts to formulate hypotheses about the world. Then, instead of trying to extract patterns from the raw data, we can make predictions about what the data should look like if those hypotheses are right. Along with Plato, such “rationalist” philosophers and psychologists as Descartes and Noam Chomsky took this approach.

Here’s an everyday example that illustrates the difference between the two methods: solving the spam plague. The data consist of a long unsorted list of messages in your in-box. The reality is that some of these messages are genuine and some are spam. How can you use the data to discriminate between them?

Consider the bottom-up technique first. You notice that the spam messages tend to have particular features: a long list of addressees, origins in Nigeria, references to million-dollar prizes or Viagra. The trouble is that perfectly useful messages might have these features, too. If you looked at enough examples of spam and non-spam emails, you might see not only that spam emails tend to have those features but that the features tend to go together in particular ways (Nigeria plus a million dollars spells trouble). In fact, there might be some subtle higher-level correlations that discriminate the spam messages from the useful ones—a particular pattern of misspellings and IP addresses, say. If you detect those patterns, you can filter out the spam. The bottom-up machine-learning techniques do just this. The learner gets millions of examples, each with some set of features and each labeled as spam (or some other category) or not. The computer can extract the pattern of features that distinguishes the two, even if it’s quite subtle.

How about the top-down approach? I get an email from the editor of the Journal of Clinical Biology. It refers to one of my papers and says that they would like to publish an article by me. No Nigeria, no Viagra, no million dollars; the email doesn’t have any of the features of spam. But by using what I already know, and thinking in an abstract way about the process that produces spam, I can figure out that this email is suspicious. (1) I know that spammers try to extract money from people by appealing to human greed. (2) I also know that legitimate “open access” journals have started covering their costs by charging authors instead of subscribers, and that I don’t practice anything like clinical biology. Put all that together and I can produce a good new hypothesis about where that email came from. It’s designed to sucker academics into paying to “publish” an article in a fake journal. The email was a result of the same dubious process as the other spam emails, even though it looked nothing like them. I can draw this conclusion from just one example, and I can go on to test my hypothesis further, beyond anything in the email itself, by googling the “editor.”
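Before turning to what the top-down inference looks like in computer terms, here is a minimal sketch of the bottom-up technique just described. The toy corpus and the bare-bones Naive Bayes word scoring are invented purely for illustration, not a description of any real spam filter:

```python
import math
from collections import Counter

# Toy labeled corpus: a set of features per message (hypothetical data).
train = [
    ({"nigeria", "million", "prize"}, "spam"),
    ({"nigeria", "viagra"}, "spam"),
    ({"million", "addressees"}, "spam"),
    ({"meeting", "draft", "paper"}, "ham"),
    ({"million", "budget", "paper"}, "ham"),  # "million" alone isn't decisive
    ({"lunch", "tomorrow"}, "ham"),
]

# Count how often each feature appears in each class.
counts = {"spam": Counter(), "ham": Counter()}
totals = Counter()
for words, label in train:
    counts[label].update(words)
    totals[label] += 1

def spam_score(words):
    """Log-odds of spam vs. ham, with add-one smoothing."""
    score = math.log(totals["spam"] / totals["ham"])
    for w in words:
        p_spam = (counts["spam"][w] + 1) / (totals["spam"] + 2)
        p_ham = (counts["ham"][w] + 1) / (totals["ham"] + 2)
        score += math.log(p_spam / p_ham)
    return score

print(spam_score({"nigeria", "million"}))  # positive: looks like spam
print(spam_score({"draft", "lunch"}))      # negative: looks genuine
```

The score comes entirely from co-occurrence statistics in labeled examples, which is why a filter like this would find nothing suspicious about the fake-journal email: its words look perfectly innocent.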
In computer terms, I started out with a “generative model” that includes abstract concepts like greed and deception and describes the process that produces email scams. That lets me recognize the classic Nigerian email spam, but it also lets me imagine many different kinds of possible spam. When I get the journal email, I can work backward: “This seems like just the kind of mail that would come out of a spam-generating process.”

The new excitement about AI comes because AI researchers have recently produced powerful and effective versions of both these learning methods. But there is nothing profoundly new about the methods themselves.

Bottom-up Deep Learning

In the 1980s, computer scientists devised an ingenious way to get computers to detect patterns in data: connectionist, or neural-network, architecture (the “neural” part was, and still is, metaphorical). The approach fell into the doldrums in the ’90s but has recently been revived with powerful “deep-learning” methods like Google’s DeepMind. For example, you can give a deep-learning program a bunch of Internet images labeled “cat,” others labeled “house,” and so on. The program can detect the patterns differentiating the two sets of images and use that information to label new images correctly. Some kinds of machine learning, called unsupervised learning, can detect patterns in data with no labels at all; they simply look for clusters of features—what scientists call a factor analysis. In the deep-learning machines, these processes are repeated at different levels. Some programs can even discover relevant features from the raw data of pixels or sounds; the computer might begin by detecting the patterns in the raw image that correspond to edges and lines and then find the patterns in those patterns that correspond to faces, and so on.

Another bottom-up technique with a long history is reinforcement learning. In the 1950s, B. F. Skinner, building on the work of John Watson, famously programmed pigeons to perform elaborate actions—even guiding air-launched missiles to their targets (a disturbing echo of recent AI) by giving them a particular schedule of rewards and punishments. The essential idea was that actions that were rewarded would be repeated and those that were punished would not, until the desired behavior was achieved. Even in Skinner’s day, this simple process, repeated over and over, could lead to complex behavior. Computers are designed to perform simple operations over and over on a scale that dwarfs human imagination, and computational systems can learn remarkably complex skills in this way.

For example, researchers at Google’s DeepMind used a combination of deep learning and reinforcement learning to teach a computer to play Atari video games. The computer knew nothing about how the games worked. It began by acting randomly and got information only about what the screen looked like at each moment and how well it had scored. Deep learning helped interpret the features on the screen, and reinforcement learning rewarded the system for higher scores. The computer got very good at playing several of the games, but it also completely bombed on others that were just as easy for humans to master. A similar combination of deep learning and reinforcement learning has enabled the success of DeepMind’s AlphaZero, a program that managed to beat human players at both chess and Go, equipped only with a basic knowledge of the rules of the game and some planning capacities. AlphaZero has another interesting feature: It works by playing hundreds of millions of games against itself. As it does so, it prunes mistakes that led to losses, and it repeats and elaborates on strategies that led to wins. Such systems, and others involving techniques called generative adversarial networks, generate data as well as observe data.

When you have the computational power to apply those techniques to very large data sets of millions of email messages, Instagram images, or voice recordings, you can solve problems that seemed very difficult before. That’s the source of much of the excitement in computer science. But it’s worth remembering that those problems—like recognizing that an image is a cat or a spoken word is “Siri”—are trivial for a human toddler. One of the most interesting discoveries of computer science is that problems that are easy for us (like identifying cats) are hard for computers—much harder than playing chess or Go. Computers need millions of examples to categorize objects that we can categorize with just a few. These bottom-up systems can generalize to new examples; they can label a new image as a “cat” fairly accurately, overall. But they do so in ways quite different from how humans generalize. Some images almost identical to a cat image won’t be identified as cats at all. Others that look like a random blur will be.

Top-down Bayesian Models

The top-down approach played a big role in early AI, and in the 2000s it, too, experienced a revival, in the form of probabilistic, or Bayesian, generative models. The early attempts to use this approach faced two kinds of problems. First, most patterns of evidence might in principle be explained by many different hypotheses: It’s possible that my journal email message is genuine; it just doesn’t seem likely. Second, where do the concepts that the generative models use come from in the first place? Plato and Chomsky said you were born with them. But how can we explain how we learn the latest concepts of science? Or how even young children understand about dinosaurs and rocket ships? Bayesian models combine generative models and hypothesis testing with probability theory, and they address these two problems. A Bayesian model lets you calculate just how likely it is that a particular hypothesis is true, given the data. And by making small but systematic tweaks to the models we already have, and testing them against the data, we can sometimes make new concepts and models from old ones. But these advantages are offset by other problems. The Bayesian techniques can help you choose which of two hypotheses is more likely, but there are almost always an enormous number of possible hypotheses, and no system can efficiently consider them all. How do you decide which hypotheses are worth testing in the first place?

Brenden Lake at NYU and colleagues have used these kinds of top-down methods to solve another problem that’s easy for people but extremely difficult for computers: recognizing unfamiliar handwritten characters. Look at a character on a Japanese scroll. Even if you’ve never seen it before, you can probably tell if it’s similar to or different from a character on another Japanese scroll. You can probably draw it and even design a fake Japanese character based on the one you see—one that will look quite different from a Korean or Russian character.37

37 Brenden M. Lake, Ruslan Salakhutdinov & Joshua B. Tenenbaum, “Human-level concept learning through probabilistic program induction,” Science, 350:6266, pp. 1332-38 (2015).
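To make the Bayesian reasoning above concrete, here is a minimal numeric sketch of the update for the fake-journal email. Every probability in it is invented, purely for illustration:

```python
# A toy Bayesian update for the fake-journal email. All priors and
# likelihoods are invented numbers, for illustration only.

p_scam = 0.5                  # prior: how much unsolicited email is a scam
p_genuine = 1 - p_scam

# How probable is an email like this (flattering, off-topic journal,
# author fees) under each hypothesis about the process that generated it?
p_email_given_scam = 0.30     # scam generators produce exactly this pattern
p_email_given_genuine = 0.01  # a real editor rarely writes to a non-biologist

# Bayes' rule: P(scam | email) = P(email | scam) * P(scam) / P(email)
evidence = p_email_given_scam * p_scam + p_email_given_genuine * p_genuine
posterior = p_email_given_scam * p_scam / evidence
print(f"P(scam | email) = {posterior:.2f}")  # about 0.97
```

One datum is enough to drive the posterior close to certainty here only because the generative model already supplies strong likelihoods; the hard part, as the essay notes, is deciding which hypotheses to score in the first place.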
The bottom-up method for recognizing handwritten characters is to give the computer thousands of examples of each one and let it pull out the salient features. Instead, Lake et al. gave the program a general model of how you draw a character: A stroke goes either right or left; after you finish one, you start another; and so on. When the program saw a particular character, it could infer the sequence of strokes that were most likely to have led to it—just as I inferred that the spam process led to my dubious email. Then it could judge whether a new character was likely to result from that sequence or from a different one, and it could produce a similar set of strokes itself. The program worked much better than a deep-learning program applied to exactly the same data, and it closely mirrored the performance of human beings.

These two approaches to machine learning have complementary strengths and weaknesses. In the bottom-up approach, the program doesn’t need much knowledge to begin with, but it needs a great deal of data, and it can generalize only in a limited way. In the top-down approach, the program can learn from just a few examples and make much broader and more varied generalizations, but you need to build much more into it to begin with. A number of investigators are currently trying to combine the two approaches, using deep learning to implement Bayesian inference.

The recent success of AI is partly the result of extensions of those old ideas. But it has more to do with the fact that, thanks to the Internet, we have much more data, and thanks to Moore’s Law we have much more computational power to apply to that data. Moreover, an unappreciated fact is that the data we do have has already been sorted and processed by human beings. The cat pictures posted to the Web are canonical cat pictures—pictures that humans have already chosen as “good” pictures. Google Translate works because it takes advantage of millions of human translations and generalizes them to a new piece of text, rather than genuinely understanding the sentences themselves.

But the truly remarkable thing about human children is that they somehow combine the best features of each approach and then go way beyond them. Over the past fifteen years, developmentalists have been exploring the way children learn structure from data. Four-year-olds can learn by taking just one or two examples of data, as a top-down system does, and generalizing to very different concepts. But they can also learn new concepts and models from the data itself, as a bottom-up system does. For example, in our lab we give young children a “blicket detector”—a new machine to figure out, one they’ve never seen before. It’s a box that lights up and plays music when you put certain objects on it but not others. We give children just one or two examples of how the machine works, showing them that, say, two red blocks make it go, while a green-and-yellow combination doesn’t. Even eighteen-month-olds immediately figure out the general principle that the two objects have to be the same to make it go, and they generalize that principle to new examples: For instance, they will choose two objects that have the same shape to make the machine work. In other experiments, we’ve shown that children can even figure out that some hidden invisible property makes the machine go, or that the machine works on some abstract logical principle.38

38 A. Gopnik, T. Griffiths & C. Lucas, “When younger learners can be better (or at least more open-minded) than older ones,” Curr. Dir. Psychol. Sci., 24:2, 87-92 (2015).
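As a minimal sketch of what this one-or-two-example learning involves computationally, consider scoring a few candidate rules against the blicket trials. The objects, the candidate rules, and the extra third trial below are invented stand-ins, not the lab’s actual stimuli:

```python
# Toy hypothesis elimination in the spirit of the blicket experiment above.
# Trials and rules are hypothetical, for illustration only.

trials = [
    (("red block", "red block"), True),       # machine goes
    (("green block", "yellow block"), False), # machine doesn't go
    (("blue ball", "blue ball"), True),       # an extra trial to disambiguate
]

hypotheses = {
    "the two objects are identical": lambda a, b: a == b,
    "at least one object is red":    lambda a, b: "red" in a or "red" in b,
    "the machine always goes":       lambda a, b: True,
}

# Keep only the rules consistent with every observed trial.
consistent = [
    name for name, rule in hypotheses.items()
    if all(rule(*pair) == went for pair, went in trials)
]
print(consistent)  # ['the two objects are identical']
```

With only the first two trials, both the “identical” rule and the “red” rule survive; the interesting question, which the essay raises, is why children so reliably favor the relational rule from so little evidence.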
You can show this in children’s everyday learning, too. Young children rapidly learn abstract intuitive theories of biology, physics, and psychology in much the way adult scientists do, even with relatively little data. The remarkable machine-learning accomplishments of the recent AI systems, both bottom-up and top-down, take place in a narrow and well-defined space of hypotheses and concepts—a precise set of game pieces and moves, a predetermined set of images. In contrast, children and scientists alike sometimes change their concepts in radical ways, performing paradigm shifts rather than simply tweaking the concepts they already have.

Four-year-olds can immediately recognize cats and understand words, but they can also make creative and surprising new inferences that go far beyond their experience. My own grandson recently explained, for example, that if an adult wants to become a child again, he should try not eating any healthy vegetables, since healthy vegetables make a child grow into an adult. This kind of hypothesis, a plausible one that no grown-up would ever entertain, is characteristic of young children. In fact, my colleagues and I have shown systematically that preschoolers are better at coming up with unlikely hypotheses than older children and adults.39 We have almost no idea how this kind of creative learning and innovation is possible.

Looking at what children do, though, may give programmers useful hints about directions for computer learning. Two features of children’s learning are especially striking. First, children are active learners; they don’t just passively soak up data like AIs do. Just as scientists experiment, children are intrinsically motivated to extract information from the world around them through their endless play and exploration. Recent studies show that this exploration is more systematic than it looks and is well adapted to find persuasive evidence to support hypothesis formation and theory choice.40 Building curiosity into machines and allowing them to actively interact with the world might be a route to more realistic and wide-ranging learning.

Second, children, unlike existing AIs, are social and cultural learners. Humans don’t learn in isolation but avail themselves of the accumulated wisdom of past generations. Recent studies show that even preschoolers learn through imitation and by listening to the testimony of others. But they don’t simply passively obey their teachers. Instead they take in information from others in a remarkably subtle and sensitive way, making complex inferences about where the information comes from and how trustworthy it is and systematically integrating their own experiences with what they are hearing.41

“Artificial intelligence” and “machine learning” sound scary. And in some ways they are. These systems are being used to control weapons, for example, and we really should be scared about that. Still, natural stupidity can wreak far more havoc than artificial intelligence; we humans will need to be much smarter than we have been in the past to properly regulate the new technologies. But there is not much basis for either the apocalyptic or the utopian visions of AIs replacing humans. Until we solve the basic paradox of learning, the best artificial intelligences will be unable to compete with the average human four-year-old.

39 A. Gopnik et al., “Changes in cognitive flexibility and hypothesis search across human life history from childhood to adolescence to adulthood,” Proc. Nat. Acad. Sci., 114:30, 7892-99 (2017).
40 L. Schulz, “The origins of inquiry: Inductive inference and exploration in early childhood,” Trends Cog. Sci., 16:7, 382-89 (2012).
41 A. Gopnik, The Gardener and the Carpenter (New York: Farrar, Straus & Giroux, 2016), chaps. 4 and 5.
Peter Galison’s focus as a science historian is—speaking roughly—on the intersection of theory with experiment. “For quite a number of years I have been guided in my work by the odd confrontation of abstract ideas and extremely concrete objects,” he once told me, in explaining how he thinks about what he does. At the Washington, Connecticut, meeting he discussed the Cold War tension between engineers (like Wiener) and the administrators of the Manhattan Project (like Oppenheimer): “When [Wiener] warns about the dangers of cybernetics, in part he’s trying to compete against the kind of portentous language that people like Oppenheimer [used]: ‘When I saw the explosion at Trinity, I thought of the Bhagavad Gita—I am death, destroyer of worlds.’ That sense, that physics could stand and speak to the nature of the universe and air force policy, was repellent and seductive. In a way, you can see that over and over again in the last decades—nanosciences, recombinant DNA, cybernetics: ‘I stand reporting to you on the science that has the promise of salvation and the danger of annihilation—and you should pay attention, because this could kill you.’ It’s a very seductive narrative, and it’s repeated in artificial intelligence and robotics.”

As a twenty-four-year-old, when I first encountered Wiener’s ideas and met his colleagues at the MIT meeting I describe in the book’s Introduction, I was hardly interested in Wiener’s warnings or admonitions. What drove my curiosity was the stark, radical nature of his view of life, based on the mathematical theory of communications in which the message was nonlinear: According to Wiener, “new concepts of communication and control involved a new interpretation of man, of man’s knowledge of the universe, and of society.” And that led to my first book, which took information theory—the mathematical theory of communications—as a model for all human experience.

In a recent conversation, Peter told me he was beginning to write a book—about building, crashing, and thinking—that considers the black-box nature of cybernetics and how it represents what he thinks of as “the fundamental transformation of learning, machine learning, cybernetics, and the self.”
ALGORISTS DREAM OF OBJECTIVITY

Peter Galison

Peter Galison is a science historian, Joseph Pellegrino University Professor and co-founder of the Black Hole Initiative at Harvard University, and the author of Einstein’s Clocks and Poincaré’s Maps: Empires of Time.

In his second-best book, the great medieval mathematician al-Khwarizmi described the new place-based Indian form of arithmetic. His name, soon sonically linked to “algorismus” (in late medieval Latin), came to designate procedures acting upon numbers—eventually wending its way through “algorithm” (on the model of “logarithm”) into French and on into English. But I like the idea of a modern algorist, even if my spellcheck does not. I mean by it someone profoundly suspicious of the intervention of human judgment, someone who takes that judgment to violate the fundamental norms of what it is to be objective (and therefore scientific).

Near the end of the 20th century, a paper by two University of Minnesota psychologists summarized a vast literature that had long roiled the waters of prediction. One side, they judged, had for all too long held resolutely—and ultimately unethically—to the “clinical method” of prediction, which prized all that was subjective: “informal,” “in-the-head,” and “impressionistic.” These clinicians were people (so said the psychologists) who thought they could study their subjects with meticulous care, gather in committees, and make judgment-based predictions about criminal recidivism, college success, medical outcomes, and the like. The other side, the psychologists continued, embodied everything the clinicians did not, embracing the objective: “formal,” “mechanical,” “algorithmic.” This the authors took to stand at the root of the whole triumph of post-Galilean science. Not only did science benefit from the actuarial; to a great extent, science was the mechanical-actuarial.

Breezing through 136 studies of predictions, across domains from sentencing to psychiatry, the authors showed that in 128 of them, predictions using actuarial tables, a multiple-regression equation, or an algorithmic judgment equalled or exceeded in accuracy those using the subjective approach. They went on to catalog seventeen fallacious justifications for clinging to the clinical. There were the self-interested foot-draggers who feared losing their jobs to machines. Others lacked the education to follow statistical arguments. One group mistrusted the formalization of mathematics; another excoriated what they took to be the actuarial “dehumanizing”; yet others said that the aim was to understand, not to predict. But whatever the motivations, the review concluded that it was downright immoral to withhold the power of the objective over the subjective, the algorithmic over expert judgment.42

42 William M. Grove & Paul E. Meehl, “Comparative efficiency of informal (subjective, impressionistic) and formal (mechanical, algorithmic) prediction procedures: The Clinical-Statistical Controversy,” Psychology, Public Policy, and Law, 2:2, 293-323 (1996).
The algorist view has gained strength. Anne Milgram served as Attorney General of the State of New Jersey from 2007 to 2010. When she took office, she wanted to know who the state was arresting, charging, and jailing, and for what crimes. At the time, she reports in a later TED Talk, she could find almost no data or analytics. By imposing statistical prediction, she continues, law enforcement in Camden during her tenure was able to reduce murders by 41 percent, saving thirty-seven lives, while dropping the total crime rate by 26 percent. After joining the Arnold Foundation as its vice president for criminal justice, she established a team of data scientists and statisticians to create a risk-assessment tool; fundamentally, she construed the team’s mission as deciding how to put “dangerous people” in jail while releasing the non-dangerous. “The reason for this,” Milgram contended, “is the way we make decisions. Judges have the best intentions when they make these decisions about risk, but they’re making them subjectively. They’re like the baseball scouts twenty years ago who were using their instinct and their experience to try to decide what risk someone poses. They’re being subjective, and we know what happens with subjective decision making, which is that we are often wrong.” Her team established nine-hundred-plus risk factors, of which nine were most predictive. The questions, the most urgent questions, for the team were: Will a person commit a new crime? Will that person commit a violent act? Will someone come back to court? We need, concluded Milgram, an “objective measure of risk” that should be inflected by judges’ judgment. We know the algorithmic statistical process works. That, she says, is “why Google is Google” and why Moneyball wins games.43

Algorists have triumphed. We have grown accustomed to the idea that protocols and data can and should guide us in everyday action, from reminders about where we probably want to go next, to the likely occurrence of crime. By now, according to the literature, the legal, ethical, formal, and economic dimensions of algorithms are all quasi-infinite. I’d like to focus on one particular siren song of the algorithm: its promise of objectivity.

Scientific objectivity has a history. That might seem surprising. Isn’t the notion—expressed above by the Minnesota psychologists—right? Isn’t objectivity co-extensive with science itself? Here it’s worth stepping back to reflect on all the epistemic virtues we might value in scientific work. Quantification seems like a good thing to have; so, too, do prediction, explanation, unification, precision, accuracy, certainty, and pedagogical utility. In the best of all possible worlds these epistemic virtues would all pull in the same direction. But they do not—not any more than our ethical virtues necessarily coincide. Rewarding people according to their need may very well conflict with rewarding people according to their ability. Equality, fairness, meritocracy—ethics, in a sense, is all about the adjudication of conflicting goods. Too often we forget that this conflict exists in science, too. Design an instrument to be as sensitive as possible and it often fluctuates wildly, making repetition of a measurement impossible. “Scientific objectivity” entered both the practice and the nomenclature of science after the first third of the 19th century.

43 TED Talk, January 2014, https://www.ted.com/speakers/anne_milgram.
One sees this clearly in the scientific atlases that provided scientists with the basic objects of their specialty: There were (and are) atlases of the hand, atlases of the skull, atlases of clouds, crystals, flowers, bubble-chamber pictures, nuclear emulsions, and diseases of the eye. In the 18th century, it was obvious
that you would not depict this particular, sun-scorched, caterpillar-chewed clover found outside your house in an atlas. No, you aimed—if you were a genius natural philosopher like Goethe, Albinus, or Cheselden—to observe nature but then to perfect the object in question, to abstract it visually to the ideal. Take a skeleton, view it through a camera lucida, draw it with care. Then correct the “imperfections.” The advantage of this parting of the curtains of mere experience was clear: It provided a universal guide, one not attached to the vagaries of individual variation.

As the sciences grew in scope, and scientists grew in number, the downside of idealization became clearer. It was one thing to have Goethe depict the “ur-plant” or “ur-insect.” It was quite another to have a myriad of different scientists each fixing their images in different and sometimes contradictory ways. Gradually, from around the 1830s forward, one begins to see something new: a claim that the image making was done with a minimum of human intervention, that protocols were followed. This could mean tracing a leaf with a pencil or pressing it into ink that was transferred to the page. It meant, too, that one suddenly was proud of depicting the view through a microscope of a natural object even with its imperfections. This was a radical idea: snowflakes shown without perfect hexagonal symmetry, color distortion near the edge of a microscope lens, tissue torn around the edges in the process of its preparation. Scientific objectivity came to mean that our representations of things were executed by holding back from intervention—even if it meant reproducing the yellow color near the edge of the image under the microscope, despite the fact that the scientist knew that the discoloration was from the lens, not a feature of the object of inquiry.

The advantage of objectivity was clear: It superseded the desire to see a theory realized or a generally accepted view confirmed. But objectivity came at a cost. You lost that precise, easily teachable, colored, full depth-of-field, artist’s rendition of a dissected corpse. You got a blurry, bad depth-of-field, black-and-white photograph that no medical student (nor even many medical colleagues) could use to learn and compare cases. Still, for a long stretch of the 19th century, the virtue of hands-off, self-restraining objectivity was on the rise.

Starting in the 1930s, the hardline scientific objectivity in scientific representation began running into trouble. In cataloging stellar spectra, for example, no algorithm could compete with highly trained observers who could sort them with far greater accuracy and replicability than any purely rule-following procedure. By the late 1940s, doctors had begun learning how to read electroencephalograms. Expert judgment was needed to sort out different kinds of seizure readings, while none of the early attempts to use frequency analysis could match that judgment. Solar magnetograms—mapping the magnetic fields across the sun—required the trained expert to pry the real signal from artifacts that emerged from the measuring instruments. Even particle physicists recognized that they could not program a computer to sort certain kinds of tracks into the right bins; judgment, trained judgment, was needed. There should be no confusion here: This was not a return to the invoked genius of an 18th-century idealizer.
No one thought you could train to be a Goethe who alone among scientists could pick out the universal, ideal form of a plant, insect, or cloud. Expertise could be learned—you could take a course to learn to make expert judgments about electroencephalograms, stellar spectra, or bubble-chamber tracks; alas, no one has ever thought you could take a course that would lead to the mastery of exceptional
insight. There can be no royal road to becoming Goethe. In scientific atlas after scientific atlas, one sees explicit argument that “subjective” factors had to be part of the scientific work needed to create, classify, and interpret scientific images.

What we see in so many of the algorists’ claims is a tremendous desire to find scientific objectivity precisely by abandoning judgment and relying on mechanical procedures—in the name of scientific objectivity. Many American states have legislated the use of sentencing and parole algorithms. Better a machine, it is argued, than the vagaries of a judge’s judgment. So here is a warning from the sciences. Hands-off algorithmic proceduralism did indeed have its heyday in the 19th century, and of course still plays a role in many of the most successful technical and scientific endeavors. But the idea that mechanical objectivity, construed as binding self-restraint, follows a simple, monotonic curve increasing from the bad impressionistic clinician to the good externalized actuary simply does not answer to the more interesting and nuanced history of the sciences.

There is a more important lesson from the sciences. Mechanical objectivity is a scientific virtue among others, and the hard sciences learned that lesson often. We must do the same in the legal and social scientific domains. What happens, for example, when the secret, proprietary algorithm sends one person to prison for ten years and another for five years, for the same crime? Rebecca Wexler, visiting fellow at the Yale Law School Information Society Project, has explored that question, and the tremendous cost that trade-secret algorithms impose on the possibility of a fair legal defense.44 Indeed, for a variety of reasons, law enforcement may not want to share the algorithms used to make DNA, chemical, or fingerprint identifications, which puts the defense in a much weakened position to make its case. In the courtroom, objectivity, trade secrets, and judicial transparency may pull in opposite directions.

It reminds me of a moment in the history of physics. Just after World War II, the film giants Kodak and Ilford perfected a film that could be used to reveal the interactions and decays of elementary particles. The physicists were thrilled, of course—until the film companies told them that the composition of the film was a trade secret, so the scientists would never gain complete confidence that they understood the processes they were studying. Proving things with unopenable black boxes can be a dangerous game for scientists, and doubly so for criminal justice.

Other critics have underscored how perilous it is to rely on an accused (or convicted) person’s address or other variables that can easily become, inside the black box of algorithmic sentencing, a proxy for race. By dint of everyday experience, we have grown used to the fact that airport security is different for children under the age of twelve and adults over the age of seventy-five. What factors do we want the algorists to have in their often hidden procedures? Education? Income? Employment history? What one has read, watched, visited, or bought? Prior contact with law enforcement? How do we want algorists to weight those factors? Predictive analytics predicated on mechanical objectivity comes at a price. Sometimes it may be a price worth paying; sometimes that price would be devastating for the just society we want to have.

44 Rebecca Wexler, “Life, Liberty, and Trade Secrets: Intellectual Property in the Criminal Justice System,” 70 Stanford Law Review, XXX (2018).
More generally, as the convergence of algorithms and Big Data governs a greater and greater part of our lives, it would be well worth keeping in mind these two lessons
from the history of the sciences: Judgment is not the discarded husk of a now pure objectivity of self-restraint. And mechanical objectivity is a virtue competing among others, not the defining essence of the scientific enterprise. They are lessons to bear in mind, even if algorists dream of objectivity.
In the past decade, genetic engineering has caught up with computer science with regard to how new scientific initiatives are shaping our lives. Genetic engineer George Church, a pioneer of the revolution in reading and writing biology, is central to this new landscape of ideas. He thinks of the body as an operating system, with engineers taking the place of traditional biologists in retooling stripped-down components of organisms (from atoms to organs) in much the same vein as in the late 1970s, when electrical engineers were working their way to the first personal computer by assembling circuit boards, hard drives, monitors, etc.

George created and is director of the Personal Genome Project, which provides the world’s only open-access information on human genomic, environmental, and trait data (GET) and sparked the growing DNA ancestry industry. He was instrumental in laying the groundwork for President Obama’s 2013 BRAIN (Brain Research through Advancing Innovative Neurotechnologies) Initiative—in aid of improving the brains of human beings to the point where, for much of what sustains us, we might not need the help of (potentially dicey) AIs. “It could be that some of the BRAIN Initiative projects allow us to build human brains that are more consistent with our ethics and capable of doing advanced tasks like artificial intelligence,” George has said. “The safest path by far is getting humans to do all the tasks that they would like to delegate to machines, but we’re not yet firmly on that super-safe path.” More recently, his crucially important pioneering use of the enzyme CRISPR (as well as methods better than CRISPR) to edit the genes of human cells is sometimes missed by the media in the telling of the CRISPR origins story.

George’s attitude toward future forms of artificial general intelligence is friendly, as evinced in the essay that follows. At the same time, he never loses sight of the AI-safety issue. On that subject, he recently remarked: “The main risk in AI, to my mind, is not so much whether we can mathematically understand what they’re thinking; it’s whether we’re capable of teaching them ethical behavior. We’re barely capable of teaching each other ethical behavior.”
THE RIGHTS OF MACHINES
George M. Church

George M. Church is Robert Winthrop Professor of Genetics at Harvard Medical School; Professor of Health Sciences and Technology, Harvard-MIT; and co-author (with Ed Regis) of Regenesis: How Synthetic Biology Will Reinvent Nature and Ourselves.

In 1950, Norbert Wiener’s The Human Use of Human Beings was at the cutting edge of vision and speculation in proclaiming that

the machine like the djinnee, which can learn and can make decisions on the basis of its learning, will in no way be obliged to make such decisions as we should have made, or will be acceptable to us. . . . Whether we entrust our decisions to machines of metal, or to those machines of flesh and blood which are bureaus and vast laboratories and armies and corporations, . . . [t]he hour is very late, and the choice of good and evil knocks at our door.

But this was his book’s denouement, and it has left us hanging now for sixty-eight years, lacking not only prescriptions and proscriptions but even a well-articulated “problem statement.” We have since seen similar warnings about the threat of our machines, even in the form of outreach to the masses, via films like Colossus: The Forbin Project (1970), The Terminator (1984), The Matrix (1999), and Ex Machina (2015). But now the time is ripe for a major update, with fresh, new perspectives—notably focused on generalizations of our “human” rights and our existential needs.

Concern has tended to focus on “us versus them [robots]” or “grey goo [nanotech]” or “monocultures of clones [bio].” To extrapolate current trends: What if we could make or grow almost anything and engineer any level of safety and efficacy desired? Any thinking being (made of any arrangement of atoms) could have access to any technology. Probably we should be less concerned about us-versus-them and more concerned about the rights of all sentients in the face of an emerging unprecedented diversity of minds. We should be harnessing this diversity to minimize global existential risks, like supervolcanoes and asteroids. But should we say “should”? (Disclaimer: In this and many other cases, when a technologist describes a societal path that “could,” “would,” or “should” happen, this doesn’t necessarily equate to the preferences of the author. It could reflect warning, uncertainty, and/or detached assessment.) Roboticist Gianmarco Veruggio and others have raised issues of roboethics since 2002; the U.K. Department of Trade and Industry and the RAND spin-off Institute for the Future have raised issues of robot rights since 2006.

“Is” versus “ought”

It is commonplace to say that science concerns “is,” not “ought.” Stephen Jay Gould’s “non-overlapping magisteria” view argues that facts must be completely distinct from values. Similarly, the 1999 document Science and Creationism from the U.S. National Academy of Sciences noted that “science and religion occupy two separate realms.” This
division has been critiqued by evolutionary biologist Richard Dawkins, myself, and others. We can discuss “should” if framed as “we should do X in order to achieve Y.” Which Y should be a high priority is not necessarily settled by democratic vote but might be settled by Darwinian vote. Value systems and religions wax and wane, diversify, diverge, and merge just as living species do: subject to selection. The ultimate “value” (the “should”) is survival of genes and memes. Few religions say that there is no connection between our physical being and the spiritual world. Miracles are documented. Conflicts between Church doctrine and Galileo and Darwin are eventually resolved. Faith and ethics are widespread in our species and can be studied using scientific methods, including but not limited to fMRI, psychoactive drugs, questionnaires, et cetera.

Very practically, we have to address the ethical rules that should be built in, learned, or probabilistically chosen for increasingly intelligent and diverse machines. We have a whole series of trolley problems. At what number of people in line for death should the computer decide to shift a moving trolley to one person? Ultimately this might be a deep-learning problem—one in which huge databases of facts and contingencies can be taken into account, some seemingly far from the ethics at hand. For example, the computer might infer that the person who would escape death if the trolley is left alone is a convicted terrorist recidivist loaded up with doomsday pathogens, or a saintly POTUS—or part of a much more elaborate chain of events in detailed alternative realities. If one of these problem descriptions seems paradoxical or illogical, it may be that the authors of the trolley problem have adjusted the weights on each side of the balance such that hesitant indecision is inevitable. Alternatively, one can use misdirection to rig the system, such that the error modes are not at the level of attention. For example, in the trolley problem, the real ethical decision was made years earlier, when pedestrians were given access to the rails—or even before that, when we voted to spend more on entertainment than on public safety. Questions that at first seem alien and troubling, like “Who owns the new minds, and who pays for their mistakes?” are similar to well-established laws about who owns and pays for the sins of a corporation.

The Slippery Slopes

We can (over)simplify ethics by claiming that certain scenarios won’t happen. The technical challenges or the bright red lines that cannot be crossed are reassuring, but the reality is that once the benefits seem to outweigh the risks (even briefly and barely), the red lines shift. Just before Louise Brown’s birth in 1978, many people were worried that she “would turn out to be a little monster, in some way, shape or form, deformed, something wrong with her.”45 Few would hold this view of in-vitro fertilization today.

What technologies are lubricating the slope toward multiplex sentience? It is not merely deep machine-learning algorithms with Big Iron. We have engineered rodents to be significantly better at a variety of cognitive tasks as well as to exhibit other relevant traits, such as persistence and low anxiety. Will this be applicable to animals that are already at the door of humanlike intelligence? Several show self-recognition in a mirror test—chimpanzees, bonobos, orangutans, some dolphins and whales, and magpies.
45 “Then, Doctors ‘All Anxious’ About Test-tube Baby,” http://edition.cnn.com/2003/HEALTH/parenting/07/25/cnna.copperman/.
Even the bright red line for human manipulation of human beings shows many signs of moving or breaking completely. More than 2,300 approved clinical trials for gene therapy are in progress worldwide. A major medical goal is the treatment or prevention of cognitive decline, especially in light of our rapidly aging global demographic. Some treatments of cognitive decline will include cognitive enhancements (drugs, genes, cells, transplants, implants, and so on). These will be used off-label. The rules of athletic competition (e.g., banning augmentation with steroids or erythropoietin) do not apply to intellectual competition in the real world. Every bit of progress on cognitive decline is in play for off-label use.

Another frontier of the human use of humans is “brain organoids.” We can now accelerate developmental biology. Processes that normally take months can happen in four days in the lab using the right recipes of transcription factors. We can make brains that, with increasing fidelity, recapitulate the differences between people born with aberrant cognitive abilities (e.g., microcephaly). Proper vasculature (veins, arteries, and capillaries), missing from earlier successes, is now being added, enabling brain organoids to surpass the former sub-microliter limit and possibly exceed the 1.2-liter size of modern human brains (or even the 5-liter elephant or 8-liter sperm whale brains).

Conventional Computers versus Bio-electronic Hybrids

As Moore’s Law miniaturization approaches its next speed bump (surely not a solid wall), we see the limits of the stochastics of dopant atoms in silicon slabs and the limits of beam-fabrication methods at around 10-nanometer feature size. Power (energy consumption) issues are also apparent: The great Watson, winner of Jeopardy!, used 85,000 watts real time, while the human brains were using 20 watts each. To be fair, the human body needs 100 watts to operate and twenty years to build, hence about 6 trillion joules of energy to “manufacture” a mature human brain. The cost of manufacturing Watson-scale computing is similar. So why aren’t humans displacing computers? For one, the Jeopardy! contestants’ brains were doing far more than information retrieval—much of which would be considered mere distractions by Watson (e.g., cerebellar control of smiling). Other parts allow leaping out of the box with transcendence unfathomable by Watson, such as what we see in Einstein’s five annus mirabilis papers of 1905. Also, humans consume more energy than the minimum (100 W) required for life and reproduction. People in India use an average of 700 W per person; it’s 10,000 W in the U.S. Both are still less than the 85,000 watts Watson uses.

Computers can become more like us via neuromorphic computing, possibly a thousandfold. But human brains could get more efficient, too. The organoid brain-in-a-bottle could get closer to the 20 W limit. The idiosyncratic advantages of computers for math, storage, and search, faculties of limited use to our ancestors, could be designed and evolved anew in labs. Facebook, the National Security Agency, and others are constructing exabyte-scale storage facilities at more than a megawatt and four hectares, while DNA can store that amount in a milligram. Clearly, DNA is not a mature storage technology, but with Microsoft and Technicolor doubling down on it, we would be wise to pay attention. The main reason for the 6 trillion joules of energy required to get a productive human mind is the twenty years required for training.
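A quick check of the arithmetic, using only the figures given above: twenty years is roughly $6.3\times10^{8}$ seconds, so the basal 100 W accounts for only

$$100\,\mathrm{W}\times 6.3\times 10^{8}\,\mathrm{s}\approx 6.3\times 10^{10}\,\mathrm{J},$$

while the 6-trillion-joule figure corresponds instead to the roughly 10,000 W of total U.S. per-capita consumption cited above:

$$10^{4}\,\mathrm{W}\times 6.3\times 10^{8}\,\mathrm{s}\approx 6.3\times 10^{12}\,\mathrm{J}\approx 6\ \text{trillion joules}.$$

The manufacturing cost in question is thus the full energy footprint of a person over the twenty-year rearing period, not the 100 W metabolic minimum alone.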
Even though a supercomputer can “train” a clone of zemself in seconds, the energy cost of producing a mature silicon clone is comparable. Engineering (Homo) prodigies might make a small impact on this slow process, but speeding up development and implanting extensive memory (as DNA-exabytes or other means) could reduce duplication time of a bio-computer to close to the doubling time of cells (ranging from eleven minutes to twenty-four hours). The point is that while we may not know what ratio of bio/homo/nano/robo hybrids will be dominant at each step of our accelerating evolution, we can aim for high levels of humane, fair, and safe treatment (“use”) of one another.

Bills of Rights date back to 1689 in England. FDR proclaimed the “Four Freedoms”—freedom of speech, freedom of conscience, freedom from fear, and freedom from want. The U.N.’s Universal Declaration of Human Rights in 1948 included the right to life; the prohibition of slavery; defense of rights when violated; freedom of movement; freedom of association, thought, conscience, and religion; social, economic, and cultural rights; duties of the individual to society; and prohibition of use of rights in contravention of the purposes and principles of the United Nations. The “universal” nature of these rights is not universally embraced and is subject to extensive critique and noncompliance. How does the emergence of non-Homo intelligences affect this discussion? At a minimum, it is rapidly becoming difficult to hide behind vague intuition for ethical decisions—“I know it when I see it” (U.S. Supreme Court Justice Potter Stewart, 1964) or the “wisdom of repugnance” (aka “yuck factor,” Leon Kass, 1997), or vague appeals to “common sense.” As we have to deal with minds alien to us, sometimes quite literally so, we need to be explicit—yea, even algorithmic. Self-driving cars, drones, stock-market transactions, NSA searches, et cetera, require rapid, pre-approved decision making.

We may gain insights into many aspects of ethics that we have been trying to pin down and explain for centuries. The challenges have included conflicting priorities, as well as engrained biological, sociological, and semi-logical cognitive biases. Notably far from consensus in universal dogmas about human rights are notions of privacy and dignity, even though these influence many laws and guidelines. Humans might want the right to march in to read (and change) the minds of computers to see why they’re making decisions at odds with our (Homo) instincts. Is it not fair for machines to ask the same of us? We note the growth of movements toward transparency in potential financial conflicts; “open-source” software, hardware, and wetware; the Fair Access to Science and Technology Research Act (FASTR); and the Open Humans Foundation. In his 1976 book Computer Power and Human Reason, Joseph Weizenbaum argued that machines should not replace Homo in situations requiring respect, dignity, or care, while others (author Pamela McCorduck and computer scientists like John McCarthy and Bill Hibbard) replied that machines can be more impartial, calm, and consistent and less abusive or mischievous than people in such positions.

Equality

What did the thirty-three-year-old Thomas Jefferson mean in 1776 when he wrote, “We hold these Truths to be self-evident, that all Men are created equal, that they are endowed
by their Creator with certain unalienable Rights, that among these are Life, Liberty, and the Pursuit of Happiness”? The spectrum of current humans is vast. In 1776, “Men” did not include people of color or women. Even today, humans born with congenital cognitive or behavioral issues are destined for unequal (albeit in most cases compassionate) treatment—Down syndrome, Tay-Sachs disease, Fragile X syndrome, cerebral palsy, and so on. And as we change geographical location and mature, our unequal rights change dramatically. Embryos, infants, children, teens, adults, patients, felons, gender identities and gender preferences, the very rich and very poor—all of these face different rights and socioeconomic realities.

One path to new mind-types obtaining and retaining rights similar to the most elite humans would be to keep a Homo component, like a human shield or figurehead monarch/CEO, blindly signing enormous technical documents, making snap financial, health, diplomatic, military, or security decisions. We will probably have great difficulty pulling the plug, modifying, or erasing (killing) a computer and its memories—especially if it has befriended humans and made spectacularly compelling pleas for survival (as all excellent researchers fighting for their lives would do). Even Scott Adams, creator of Dilbert, has weighed in on this topic, supported by experiments at Eindhoven University in 2005 noting how susceptible humans are to a robot-as-victim equivalent of the Milgram experiments done at Yale beginning in 1961. Given the many rights of corporations, including ownership of property, it seems likely that other machines will obtain similar rights, and it will be a struggle to maintain inequities of selective rights along multi-axis gradients of intellect and ersatz feelings.

Radically Divergent Rules for Humans versus Nonhumans and Hybrids

The divide noted above for intra-Homo sapiens variation in rights explodes into a riot of inequality as soon as we move to entities that overlap (or will soon) the spectrum of humanity. In Google Street View, people’s faces and car license plates are blurred out. Video devices are excluded from many settings, such as courts and committee meetings. Wearable and public cameras with facial-recognition software touch taboos. Should people with hyperthymesia or photographic memories be excluded from those same settings? Shouldn’t people with prosopagnosia (face blindness) or forgetfulness be able to benefit from facial-recognition software and optical character recognition wherever they go, and if them, then why not everyone? If we all have those tools to some extent, shouldn’t we all be able to benefit? These scenarios echo Kurt Vonnegut’s 1961 short story “Harrison Bergeron,” in which exceptional aptitude is suppressed in deference to the mediocre lowest common denominator of society.

Thought experiments like John Searle’s Chinese Room and Isaac Asimov’s Three Laws of Robotics all appeal to the sorts of intuitions plaguing human brains that Daniel Kahneman, Amos Tversky, and others have demonstrated. The Chinese Room experiment posits that a mind composed of mechanical and Homo sapiens parts cannot be conscious, no matter how competent at intelligent human (Chinese) conversation, unless a human can identify the source of the consciousness and “feel” it. Enforced preference for Asimov’s First and Second Laws favors human minds over any other mind meekly present in his Third Law, of self-preservation.
If robots don’t have exactly the same consciousness as humans, then this is used as an excuse to give them different rights, analogous to arguments that other tribes or races are less than human. Do robots already show free will? Are they already self-conscious? The robots Qbo have passed the “mirror test” for self-recognition, and the robots NAO have passed a related test of recognizing their own voice and inferring their internal state of being, mute or not. For free will, we have algorithms that are neither fully deterministic nor random but aimed at nearly optimal probabilistic decision making. One could argue that this is a practical Darwinian consequence of game theory. For many (not all) games/problems, if we’re totally predictable or totally random, then we tend to lose; a minimal worked example of such an optimal probabilistic mix appears at the end of this section.

What is the appeal of free will anyway? Historically it gave us a way to assign blame in the context of reward and punishment on Earth or in the afterlife. The goals of punishment might include nudging the priorities of the individual to assist the survival of the species. In extreme cases, this could include imprisonment or other restrictions, if Skinnerian positive/negative reinforcement is inadequate to protect society. Clearly, such tools can apply to free will, seen broadly—to any machine whose behavior we’d like to manage. We could argue as to whether the robot actually experiences subjective qualia for free will or self-consciousness, but the same applies to evaluating a human. How do we know that a sociopath, a coma patient, a person with Williams syndrome, or a baby has the same free will or self-consciousness as our own? And what does it matter, practically? If humans (of any sort) convincingly claim to experience consciousness, pain, faith, happiness, ambition, and/or utility to society, should we deny them rights because their hypothetical qualia are hypothetically different from ours?

The sharp red lines of prohibition, over which we supposedly will never step, increasingly seem to be short-lived and not sensible. The line between humans and machines blurs, both because machines become more humanlike and humans become more machine-like—not only since we increasingly blindly follow GPS scripts, reflex tweets, and carefully crafted marketing, but also as we digest ever more insights into our brain and genetic programming mechanisms. The NIH BRAIN Initiative is developing innovative technologies and using these to map out the connections and activity of mental circuitry so as to improve electronic and synthetic neurobiological ware. Various red lines depend on genetic exceptionalism, in which genetics is considered permanently heritable (although it is provably reversible), whereas exempt (and lethal) technologies, like cars, are for all intents and purposes irreversible due to social and economic forces. Within genetics, a red line makes us ban or avoid genetically modified foods but embrace genetically modified bacteria making insulin, or genetically modified humans—witness mitochondrial therapies approved in Europe for human adults and embryos. The line for germline manipulation seems less sensible than the usual, practical line drawn at safety and efficacy. Marriages of two healthy carriers of the same genetic disease have a choice between no child of their own, 25-percent loss of embryos via abortion (spontaneous or induced), 80-percent loss via in-vitro fertilization, or potential zero-percent embryo loss via sperm (germline) engineering. It seems premature to declare this last option unlikely.
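Here is the promised sketch of what “nearly optimal probabilistic decision making” looks like in the simplest game-theoretic setting, a 2x2 zero-sum game with no saddle point. The payoff numbers are invented for illustration; the formula is the standard indifference condition from elementary game theory.

# Optimal play in a 2x2 zero-sum game (illustrative payoffs, standard theory):
# the unexploitable strategy is mixed -- probabilistic, but with precisely
# tuned weights, neither a fixed move nor a uniform coin flip.

def optimal_row_mix(a, b, c, d):
    """For row-player payoffs [[a, b], [c, d]] with no pure-strategy saddle
    point, return (p, value): the probability of playing row 0 that makes
    the opponent indifferent, and the resulting long-run game value."""
    denom = a - b - c + d
    p = (d - c) / denom
    value = (a * d - b * c) / denom
    return p, value

p, v = optimal_row_mix(2, -3, -3, 4)
print(f"play row 0 with probability {p:.3f}; guaranteed value {v:.3f}")
# -> p = 0.583..., v = -0.083...: against a sharp opponent, either pure
#    strategy guarantees only -3, and a uniform 50/50 mix only -0.5, so
#    both "totally predictable" and "totally random" play do worse.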
For “human subject research,” we refer to the 1964 Declaration of Helsinki, keeping in mind the 1932-1972 Tuskegee syphilis experiment, possibly the most infamous biomedical research study in U.S. history. In 2015, the Nonhuman Rights Project filed a lawsuit with the New York State Supreme Court on behalf of two chimpanzees kept for research by Stony Brook University. The appellate court decision was that chimps are not to be treated as legal persons since they “do not have duties and responsibilities in society,” despite Jane Goodall’s and others’ claim that they do, and despite arguments that such a decision could be applied to children and the disabled.46 What prevents extension to other animals, organoids, machines, and hybrids? As we (e.g., Hawking, Musk, Tallinn, Wilczek, Tegmark) have promoted bans on “autonomous weapons,” we have demonized one type of “dumb” machine, while other machines—for instance, those composed of many Homo sapiens voting—can be more lethal and more misguided.

Do transhumans roam the Earth already? Consider the “uncontacted peoples,” such as the Sentinelese and Andamanese of India, the Korowai of Indonesia, the Mashco-Piro of Peru, the Pintupi of Australia, the Surma of Ethiopia, the Ruc of Vietnam, the Ayoreo-Totobiegosode of Paraguay, the Himba of Namibia, and dozens of tribes in Papua New Guinea. How would they or our ancestors respond? We could define “transhuman” as people and culture not comprehensible to humans living in a modern, yet un-technological culture. Such modern Stone Age people would have great trouble understanding why we celebrate the recent LIGO gravitational-wave evidence supporting the hundred-year-old general theory of relativity. They would scratch their heads as to why we have atomic clocks, or GPS satellites so we can find our way home, or why and how we have expanded our vision from a narrow optical band to the full spectrum from radio to gamma. We can move faster than any other living species; indeed, we can reach escape velocity from Earth and survive in the very cold vacuum of space. If those characteristics (and hundreds more) don’t constitute transhumanism, then what would? If we feel that the judge of transhumanism should not be fully paleo-culture humans but recent humans, then how would we ever reach transhuman status? We “recent humans” may always be capable of comprehending each new technological increment—never adequately surprised to declare arrival at a (moving) transhuman target.

The science-fiction prophet William Gibson said, “The future is already here—it’s just not very evenly distributed.” While this underestimates the next round of “future,” certainly millions of us are transhuman already—with most of us asking for more. The question “What was a human?” has already transmogrified into “What were the many kinds of transhumans? . . . And what were their rights?”

46 https://www.nbcnews.com/news/us-news/lawyer-denying-chimpanzees-rights-could-backfire-disabled-n734566.
Caroline A. Jones’ interest in modern and contemporary art is enriched by a willingness to delve into the technologies involved in its production, distribution, and reception. “As an art historian, a lot of my questions are about what kind of art we can make, what kind of thought we can make, what kind of ideas we can make that could stretch the human beyond our stubborn, selfish, ‘only concerned with our small group’ parameters. The philosophers and philosophies I’m drawn to are those that question the Western obsession with individualism. Those are coming from so many different places, and they’re reviving so many different kinds of questions and problems that were raised in the 1960s.”

She has recently turned her attention to the history of cybernetics. Her MIT course, “Automata, Automatism, Systems, Cybernetics,” explores the history of the human/machine interface in terms of feedback, exploring the cultural rather than engineering uptake of this idea. She begins with primary readings by Wiener, Shannon, and Turing and then pivots from the scientists and engineers to the work and ideas of artists, feminists, and postmodern theorists. Her goal: to come up with a new central paradigm of evolution that’s culture-based—“communalism and interspecies symbiosis rather than survival of the fittest.”

As a historian, Caroline draws a distinction between what she has termed “left cybernetics” and “right cybernetics”: “What do I mean by left cybernetics? In one sense, it’s a pun or a joke: the cybernetics that was ‘left’ behind. On another level, it’s a vague political grouping connoting our Left Coast: California, Esalen, the group that Dave Kaiser calls the ‘hippie physicists.’ It’s not an adequate term, but it’s a way of recognizing that there was a group beholden to the military-industrial complex, sometimes very unhappily, who gave us the tools to critique it.”
THE ARTISTIC USE OF CYBERNETIC BEINGS
Caroline A. Jones

Caroline A. Jones is a professor of art history in the Department of Architecture at MIT and author of Eyesight Alone: Clement Greenberg’s Modernism and the Bureaucratization of the Senses; Machine in the Studio: Constructing the Postwar American Artist; and The Global Work of Art.

Cybernated art is very important, but art for cybernated life is more important.
—Nam June Paik, 1966

Artificial intelligence was not what artists first wanted out of cybernetics, once Norbert Wiener’s The Human Use of Human Beings: Cybernetics and Society came out in 1950. The range of artists who identified themselves with cybernetics in the fifties and sixties initially had little access to “thinking machines.” Moreover, craft-minded engineers had already been making turtles, jugglers, and light-seeking robot babes, not giant brains. Using breadboards, copper wire, simple switches, and electronic sensors, artists followed cyberneticians in making sculptures and environments that simulated interactive sentience—analog movements and interfaces that had more to do with instinctive drives and postwar sexual politics than the automation of knowledge production. Now obscured by an ideology of a free-floating “intelligence” untethered by either hardware or flesh, AI has forgotten the early days of cybernetics’ uptake by artists. Those efforts are worth revisiting; they modeled relations with what the French philosophers Gilles Deleuze and Félix Guattari have called the “machinic phylum,” having to do with how humans think and feel in bodies engaged with a physical, material, emotionally stimulating, and signaling world.

Cybernetics now seems to have collapsed into an all-pervasive discourse of AI that was far from preordained. “Cybernetics,” as a word, claimed postwar newness for concepts that were easily four centuries old: notions of feedback, machine damping, biological homeostasis, logical calculation, and systems thinking that had been around since the Enlightenment (boosted by the Industrial Revolution). The names in this lineage include Descartes, Leibniz, Sadi Carnot, Clausius, Maxwell, and Watt. Wiener’s coinage nonetheless had profound cultural effects.47 The ubiquity today of the prefix “cyber-” confirms the desire for a crisp signifier of the tangled relations between humans and machines. In Wiener’s usage, things “cyber” simply involved “control and communication in the animal and the machine.” But after the digital revolution, “cyber” moved beyond servomechanisms, feedback loops, and switches to encompass software, algorithms, and cyborgs. The work of cybernetically inclined artists concerns the emergent behaviors of life that elude AI in its current condition.

47 Wiener later had to admit the earlier coinage of the word in 1834 by André-Marie Ampère, who had intended it to mean the “science of government,” a concept that remained dormant until the 20th century.

As to that original coinage, Wiener had reached back to the ancient Greek to borrow the word for “steersman” (κυβερνήτης / kybernetes), a masculine figure channeling power and instinct at the helm of a ship, who read the waves, judged the wind, kept a hand on the tiller, and directed the slaves as they mindlessly (mechanically) churned their oars. The Greek had already migrated into modern English via Latin, going
from kuber- to guber—the root of “gubernatorial” and “governor,” another term for masculine control, deployed by James Watt to describe his 19th-century device for modulating a runaway steam engine. Cybernetics thus took ideas that had long analogized people and devices and generalized them to an applied science by adding that “-ics.” Wiener’s three c’s (command, control, communication) drew on the mathematics of probability to formalize systems (whether biological or mechanical) theorized as a set of inputs of information achieving outputs of actions in an environment—a muscular, fleshy agenda often minimized in genealogies of AI. But the etymology does little to capture the excitement felt by participants, as mathematics joined theoretical biology (Arturo Rosenblueth) and information theory (Claude Shannon, Walter Pitts, Warren McCulloch) to produce a barrage of interdisciplinary research and publications viewed as changing not just the way science was done but the way future humans would engage with the technosphere. As Wiener put it, “We have modified our environment so radically that we must now modify ourselves in order to exist.”48 The pressing question is: How are we modifying ourselves? Are we going in the right direction or have we lost our way, becoming the tools of our tools? Revisiting the early history of humanist/artists’ contribution to cybernetics may help direct us toward a less perilous, more ethical future.

48 The Human Use of Human Beings (1954 edition), p. 46.

The year 1968 was a high-water mark of the cultural diffusion and artistic uptake of the term. In that year, the Howard Wise gallery opened its show of Wen-Ying Tsai’s “Cybernetic Sculpture” in midtown Manhattan, and Polish émigré Jasia Reichardt opened her exhibition “Cybernetic Serendipity” at London’s ICA. (The “Cybernetic” in her title was intended to evoke “made by or with computers,” even though most of the artworks on view had no computers, as such, in their responsive circuits.) The two decades between 1948 and 1968 had seen both the fanning out of cybernetic concepts into a broader culture and the spread of computation machines themselves in a slow migration from proprietary military equipment, through the multinational corporation, to the academic lab, where access began to be granted to artists. The availability of cybernetic components—“sensor organs” (electronic eyes, motion sensors, microphones) and “effector organs” (electronic “breadboards,” switches, hydraulics, pneumatics)—on the home hobbyist front rendered the computer less an “electronic brain” than an adjunct organ in a kit of parts. There was not yet a ruling metaphor of “artificial intelligence.” So artists were bricoleurs of electronic bodies, interested in actions rather than calculation or cognition. There were inklings of “computer” as calculator in the drive toward Homo rationalis, but more in aspiration than achievement.
In light of today’s digital convergence in art/science imaging tools, Reichardt’s show was prophetic in its insistence on confusing the boundaries between art and what we might dub “creative applied science.” According to the catalog, “no visitor to the exhibition, unless he reads all the notes relating to all the works, will know whether he is looking at something made by an artist, engineer, mathematician, or architect.” So the comically dysfunctional robot by Nam June Paik, Robot K-456 (1964), featured on the catalog’s cover and described as “a female robot known for her disturbing and idiosyncratic behavior,” would face off against a balletic Colloquy of Mobiles (1968) from second-order cybernetician Gordon Pask. Pask worked with a London theater
designer to craft a spindly “male” apparatus of hinges and rods, set up to communicate with bulbous “female” fiberglass entities nearby. Whether anyone could actually map the quiddities of the program (or glean its reactionary gender theater) without reading the catalog essay is an open question. What is significant is Pask’s focus on the behaviors of his automata, their interactivity, their responsiveness within an artificially modulated environment, and their “reflection” of human behaviors. The ICA’s “Cybernetic Serendipity” introduced an important paradigm: the machinic ecosystem, in which the viewer was a biological part, tasked with figuring out just what the triggers for interaction might be. The visitors in those London galleries suddenly became “cybernetic organisms”—cyborgs—since to experience the art adequately, one needed to enter a kind of symbiotic colloquy with the servomechanisms.

This turn toward human-machine interactive environments as an aesthetic becomes clearer when we examine a few other artworks from the period, beginning with one constituting an early instance of emergent behavior—Senster, the interactive sculpture by artist/engineer Edward Ihnatowicz (1970), celebrated by medical robotics engineer Alex Zivanovic, editor of a Web site devoted to Ihnatowicz’s little-known career, as “one of the first computer controlled interactive robotic works of art.” Here, “the computer” makes its entry (albeit a twelve-bit, limited device). But rather than “intelligence,” Ihnatowicz sought to make an avatar of affective behavior. Key to Senster’s uncanny success was the programming with which Ihnatowicz constrained the fifteen-foot-long hydraulic apparatus (its hinge design and looming appearance inspired by a lobster claw) to convey shyness in responding to humans in its proximity. Senster’s sound channels and motion sensors were set to recoil at loud noises and sudden aggressive movements. Only those humans willing to speak softly and modulate their gestures would be rewarded by Senster’s quiet, inquisitive approach—an experience that became real for Ihnatowicz himself when he first assembled the program and the machine turned to him solicitously after he’d cleared his throat.

In these artistic uses of cybernetic beings, we sense a growing necessity to train the public to experience itself as embedded in a technologized environment, modifying itself to communicate intuitively with machines. This necessity had already become explicit in Tsai’s “Cybernetic Sculpture” show. Those experiencing his immersive installation were expected to experiment with machinic life: What behaviors would trigger the servomechanisms? Likely, the human gallery attendant would have had to explain the protocol: “Clap your hands—that gets the sculptures to respond.” As an early critic described it:

A grove of slender stainless-steel rods rises from a plate. This base vibrates at 30 cycles per second; the rods flex rapidly, in harmonic curves. Set in a dark room, they are lit by strobes. The pulse of the flashing lights varies—they are connected to sound and proximity sensors. The result is that when one approaches a Tsai or makes a noise in its vicinity, the thing responds. The rods appear to move; there is a shimmering, a flashing, an eerie ballet of metal, whose apparent movements range from stillness to jittering and back to a slow, indescribably sensuous undulation.49

49 Robert Hughes, Time magazine (October 2, 1972), review of Tsai exhibition at Denise René gallery.
Like Senster, the apparatus stimulated (and simulated) an affective rather than rational interaction. Humans felt they were encountering behaviors indicative of responsive life; Tsai’s entities were often classed as “vegetal” or “aquatic.” Such environmental and kinetic ambitions were widespread in the international art world of the time. Beyond the stable at Howard Wise, there were the émigrés forming the collective GRAV in Paris, the “cybernetic architectures” of Nicolas Schöffer, the light and plastic gyrations of the German Zero Gruppe, and so on—all defining and informing the genre of installation art to come.

The artistic use of cybernetic beings in the late sixties made no investment in “intelligence.” Knowing machines were dumb and incapable of emotion, these creators were confident in staging frank simulations. What interested them were machinic motions evoking drives, instincts, and affects; they mimicked sexual and animal behaviors, as if below the threshold of consciousness. Such artists were uninterested in the manipulation of data or information (although Hans Haacke would move in that direction by 1972 with his “Real-Time Systems” works). The cybernetic culture that artists and scientists were putting in place on two continents embedded the human in the technosphere and seduced perception with the graceful and responsive behaviors of the machinic phylum. “Artificial” and “natural” intertwined in this early cybernetic aesthetic.

But it wouldn’t end here. Crucial to the expansion of this uncritical, largely masculine set of cybernetic environments would be a radical, critical cohort of astonishing women artists emerging in the 1990s, fully aware of their predecessors in art and technology but perhaps more inspired by the feminist founders of the 1970 journal Radical Software and the cultural blast of Donna Haraway’s inspiring 1984 polemic, “A Cyborg Manifesto.” The creaky gender theater of Paik and Pask, the innocent creatures of Ihnatowicz and Tsai, were mobilized as savvy, performative, and postmodern, as in Lynn Hershman Leeson’s Dollie Clone Series (1995-98), consisting of the interactive assemblages CyberRoberta and Tillie, the Telerobotic Doll, who worked the technosphere with the professionalism of burlesque, winking and folding us viewers into an explicit consciousness of our voyeuristic position as both seeing subjects and objects-to-be-looked-at. The “innocent” technosphere established by male cybernetic sculptors of the 1960s was, by the 1990s, identified by feminist artists as an entirely suffusive condition demanding our critical attention.

At the same time, feminists tackled the question of whose “intelligence” AI was attempting to simulate. For an artist such as Hershman Leeson, responding to the technical “triumph” of cloning Dolly the sheep, it was crucial to draw the connection between meat production and “meat machines.” Hershman Leeson produced “dolls” as clones, offering a critical framing of the way contemporary individuation had become part of an ideological, replicative, plastic realm. While the technofeminists of the 1990s and into the 2000s weren’t all cyber all the time, their works nonetheless complicated the dominant machinic and kinetic qualities of male artists’ previous techno-environments. The androgynous tele-cyborg in Judith Barry’s Imagination, Dead Imagine (1991), for example, had no moving parts: He/she was composed of pure signals, flickering projections on flat surfaces.
In her setup, Barry commented on the alienating effects of late-20th-century technology. The image of an androgynous head fills an enormous cube made of ten-foot-square screens on
five sides, mounted on a ten-foot-wide mirrored base. A variety of viscous and unpleasant-looking fluids (yellow, reddish-orange, brown), dry materials (sawdust? flour?), and even insects drizzle or dust their way down the head, whose stoic sublimity is made gorgeously virtual on the work’s enormous screens. Dead Imagine, through its large scale and cubic “Platonic” form, remains both artificial and locked into the body—refusing a detached “intelligence” as being no intelligence at all.

Artists in the new millennium inherit this critical tradition and inhabit the current paradigms of AI, which has slid from partial simulations to claims of intelligence. In the 1955 proposal thought to be the first printed usage of the phrase “artificial intelligence,” computer scientist John McCarthy and his colleagues Marvin Minsky, Nathaniel Rochester, and Claude Shannon conjectured that “every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it.” This modest theoretical goal has inflated over the past sixty-four years and is now expressed by Google DeepMind as an ambition to “Solve intelligence.” Crack the code! But unfortunately, what we hear cracking is not code but small-scale capitalism, the social contract, and the scaffolding of civility. Taking away the jobs of taxi and truck drivers, roboticizing direct marketing, hegemonizing entertainment, privatizing utilities, and depersonalizing health care—are these the “whips” that Wiener feared we would learn to love?

Artists can’t solve any of this. But they can remind us of the creative potential of the paths not taken—the forks in the road that were emerging around 1970, before “information” became capital and “intelligence” equaled data harvesting. Richly evocative of what can be done with contemporary tools when revisiting earlier possibilities is French artist Philippe Parreno’s “firefly piece,” so nicknamed to avoid having to iterate its actual title: With a Rhythmic Instinction to Be Able to Travel Beyond Existing Forces of Life (2014). Described by the artist as “an automaton,” the sculptural installation juxtaposes a flickering projection of black-and-white drawings of fireflies with a band of oscillating green-on-black binary figures. The drawings and binary figures are animated using algorithms from mathematician John Horton Conway’s 1970 Game of Life, a “cellular automaton.” Conway set up parameters for any square (“cell”) to be lit (“alive”) or dark (“dead”) in an infinite, two-dimensional grid. The rules are summarized as follows: A single cell, with fewer than two live neighbors, will quickly die of loneliness. A cell touching four or more other “live” cells will also die, “due to crowding.” A cell survives and thrives if it has two or three live neighbors, and an empty cell touching exactly three live cells springs to life, and so on. As one cell dies, it may create the conditions for other cells to survive, yielding patterns that appear to move and grow, shifting across the grid like evanescent neural impulses or bioluminescent clusters of diatoms.

In Stephen Hawking’s 2012 film The Meaning of Life, the narrator describes Conway’s mathematical model as simulating “how a complex thing like the mind might come about from a basic set of rules,” revealing the overweening ambitions that characterize contemporary AI: “[T]hese complex properties emerge from simple laws that contain no concepts like movement or reproduction,” yet they produce “species,” and cells “can even reproduce, just as life does in the real world.”50 Just as life does?

50 Narration in Stephen Hawking’s The Meaning of Life (Smithson Productions, Discovery Channel, 2012).
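For concreteness, here is a minimal sketch of Conway’s rules in Python (an illustration of the cellular automaton itself, not of Parreno’s installation code, which this text does not give). It encodes the birth-on-three, survive-on-two-or-three rules summarized above, tracking live-cell coordinates rather than an explicit infinite grid.

from collections import Counter

def life_step(alive):
    """Advance Conway's Game of Life one generation.
    `alive` is a set of (x, y) coordinates of live cells."""
    # Count live neighbors for every cell adjacent to at least one live cell.
    neighbor_counts = Counter(
        (x + dx, y + dy)
        for (x, y) in alive
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # Birth on exactly 3 neighbors; survival on 2 or 3.
    return {cell for cell, n in neighbor_counts.items()
            if n == 3 or (n == 2 and cell in alive)}

# The "glider," a five-cell pattern that crawls diagonally across the grid.
pattern = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}
for _ in range(4):
    pattern = life_step(pattern)
print(sorted(pattern))  # the same glider shape, shifted one step down-right

A dozen lines suffice to produce the “species” and apparent locomotion the Hawking narration marvels at, which only sharpens the question posed here: what, if anything, separates such simulated behavior from what life does?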
Artists know the blandishments of simulation and representation, the difference between the genius of artifice and the realities of what “life does.” Parreno’s piece is an intuitive assembly of our experience of “life” through embodied, perspectival engagement. Our consciousness is electrically (cybernetically) enmeshed, yet we don’t respond as if this human-generated set of elegant simulations had its own intelligence. The artistic use of cybernetic beings also reminds us that consciousness itself is not just “in here.” It is streaming in and out, harmonizing those sensory, scintillating signals. Mind happens well outside the limits of the cranium (and its simulacrum, the “motherboard”). In Mary Catherine Bateson’s paraphrase of her father Gregory’s second-order cybernetics, mind is material “not necessarily defined by a boundary such as an envelope of skin.”[51] Parreno pairs the simulations of art with the simulations of mathematics to force the Wiener-like point that any such model is not, by itself, just like life. Models are just that—parts of signaling systems constituting “intelligence” only when their creaturely counterparts engage them in lively meaning making.

Contemporary AI has talked itself into a corner by instrumentalizing and particularizing tasks and subroutines, confusing these drills with actual wisdom. The brief cultural history offered here reminds us that views of data as intelligence, digital nets as “neural,” or isolated individuals as units of life were alien even to Conway’s brute simulation. We can stigmatize the stubborn arrogance of current AI as “right cybernetics,” the path that led to current automated weapons systems, Uber’s ill-disguised hostility to human workers, and the capitalist dreams of Google. Now we must turn back to left cybernetics—theoretical biologists and anthropologists engaged with a trans-species understanding of intelligent systems. Gregory Bateson’s observation that corporations merely simulate “aggregates of parts of persons,” with profit-maximizing decisions cut off from “wider and wiser parts of the mind,” has never been more timely.[52]

The cybernetic epistemology offered here suggests a new approach. The individual mind is immanent, not only in the body but also in pathways outside the body, and there is a larger Mind, of which the individual mind is only a subsystem. This larger Mind, Bateson holds, is comparable to God, and is perhaps what some people mean by “God,” but it is still immanent in the total interconnected social system and planetary ecology. This is not the collective delusion of an exterior “God” who speaks from outside human consciousness (this long-seated monotheistic conceit, Bateson suggests, leads to views of nature and environment as also outside the “individual” human, rendering them as “gifts to exploit”). Rather, Bateson’s “God” is a placeholder for our evanescent experience of interacting consciousness-in-the-world: larger Mind as a result of inputs and actions that then become inputs for other actions in concert with other entities—webs of symbiotic relationships that form patterns we need urgently to sense and harmonize with.[53]

From Tsai in the 1970s to Hershman Leeson in the 1990s to Parreno in 2014, artists have been critiquing right cybernetics and plying alternative, embodied, environmental experiences of “artificial” intelligence. Their artistic use of cybernetic beings offers the wisdom of symbionts experienced in the kinds of poiesis that can be achieved in this world: rhythms of signals and intuitive actions that produce the movements of life partnered with an electro-mechanical and -magnetic technosphere. Life, in its mysterious negentropic entanglements with matter and Mind.

[50] Narration in Stephen Hawking’s The Meaning of Life (Smithson Productions, Discovery Channel, 2012).
[51] Mary Catherine Bateson, 1999 foreword to Gregory Bateson, Steps to an Ecology of Mind (Chicago: University of Chicago Press, 1972), xi.
[52] Bateson, Steps to an Ecology of Mind, 452.
[53] Ibid., 467–68.
Over nearly four decades, Stephen Wolfram has been a pioneer in the development and application of computational thinking and responsible for many innovations in science, technology, and business. His 1982 paper “Cellular Automata as Simple Self-Organizing Systems,” written at the age of twenty-three, was the first of numerous significant scientific contributions aimed at understanding the origins of complexity in nature.

It was around this time that Stephen briefly came into my life. I had established The Reality Club, an informal gathering of intellectuals who met in New York City to present their work before peers in other disciplines. (In 1996, The Reality Club went online as Edge.org.) Our first speaker? Stephen Wolfram, a “wunderkind” who had arrived in Princeton at the Institute for Advanced Study. I distinctly recall his focused manner as he sat down on a couch in my living room and spoke uninterrupted for about an hour before the assembled group.

Since that time, Stephen has become intent on making the world’s knowledge easily computable and accessible. His program Mathematica is the definitive system for modern technical computing. Wolfram|Alpha computes expert-level answers using AI technology. He considers his Wolfram Language to be the first true computational communication language for humans and AIs.

I caught up with him again four years ago, when we arranged to meet in Cambridge, Massachusetts, for a freewheeling conversation about AI. Stephen walked in, said hello, sat down, and, looking at the video camera set up to record the conversation for Edge, began to talk and didn’t stop for two and a half hours. The essay that follows is an edited version of that session, which was a Wolfram master class of sorts and is an appropriate way to end this volume—just as Stephen’s Reality Club talk in the ’80s was a great way to initiate the ongoing intellectual enterprise whose result is the rich community of thinkers presenting their work to one another and to the public in this book.
ARTIFICIAL INTELLIGENCE AND THE FUTURE OF CIVILIZATION

Stephen Wolfram

Stephen Wolfram is a scientist, inventor, and the founder and CEO of Wolfram Research. He is the creator of the symbolic computation program Mathematica and its programming language, Wolfram Language, as well as the knowledge engine Wolfram|Alpha. He is also the author of A New Kind of Science. The following is an edited transcript from a live interview with him conducted in December 2015.

I see technology as taking human goals and making them automatically executable by machines. Human goals of the past have entailed moving objects from here to there, using a forklift rather than our own hands. Now the work we can do automatically, with machines, is mental rather than physical. It’s obvious that we can automate many of the tasks we humans have long been proud of doing ourselves. What’s the future of the human condition in that situation?

People talk about the future of intelligent machines and whether they’ll take over and decide what to do for themselves. But the inventing of goals is not something that has a path to automation. Someone or something has to define what a machine’s purpose should be—what it’s trying to execute. How are goals defined? For a given human, they tend to be defined by personal history, cultural environment, the history of our civilization. Goals are uniquely human. Where the machine is concerned, we can give it a goal when we build it.

What kinds of things have intelligence, or goals, or purpose? Right now, we know one great example, and that’s us—our brains, our human intelligence. Human intelligence, I once assumed, is far beyond anything else that exists naturally in the world; it’s the result of an elaborate process of evolution and thus stands apart from the rest of existence. But what I’ve realized, as a result of the science I’ve done, is that this is not the case.

People might say, for instance, “The weather has a mind of its own.” That’s an animist statement and seems to have no place in modern scientific thinking. But it’s not as silly as it sounds. What does the human brain do? A brain receives certain input, it computes things, it causes certain actions to happen, it generates a certain output. Like the weather. All sorts of systems are, effectively, doing computations—whether it’s a brain or, say, a cloud responding to its thermal environment. We can argue that our brains are doing vastly more sophisticated computations than those in the atmosphere. But it turns out that there’s a broad equivalence between the kinds of computations that different kinds of systems do.

This renders the question of the human condition somewhat poignant, because it seems we’re not as special as we thought. There are all those different systems of nature that are pretty much equivalent in terms of their computational capabilities. What makes us different from all those other systems is the particulars of our history, which give us our notions of purpose and goals. That’s a long way of saying that when the box on our desk thinks as well as the human brain does, what it still won’t have, intrinsically, are goals and purposes. Those are defined by our particulars—our particular biology, our particular psychology, our particular cultural history.
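That equivalence claim is easiest to see in the elementary cellular automata Wolfram began studying in the 1980s. The sketch below is mine, offered purely as illustration (the 63-cell width, the 16 steps, and the wrap-around boundary are arbitrary choices): rule 30 updates each cell from a fixed eight-entry lookup table over the cell and its two neighbors, yet the pattern it grows from a single live cell looks effectively random.

```python
# A sketch of rule 30, an elementary cellular automaton: the next state of
# each cell is a fixed function of (left neighbor, cell, right neighbor).
# The rule number 30 encodes the eight-entry lookup table in binary.
RULE = 30

def next_row(row):
    n = len(row)
    return [
        (RULE >> (4 * row[(i - 1) % n] + 2 * row[i] + row[(i + 1) % n])) & 1
        for i in range(n)  # wrap-around boundary, an arbitrary choice
    ]

row = [0] * 31 + [1] + [0] * 31  # a single live cell in the middle
for _ in range(16):
    print("".join(".#"[c] for c in row))
    row = next_row(row)
```

A tiny lookup table, with no concept of randomness anywhere in the program, yet the center column of the output is irregular enough that Wolfram long used it as a random-number generator in Mathematica: one concrete sense in which a very simple system does a computation indistinguishable from a sophisticated one.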
When we consider the future of AI, we need to think about the goals. That’s what humans contribute; that’s what our civilization contributes. The execution of those goals is what we can increasingly automate. What will the future of humans be in such a world? What will there be for them to do?

One of my projects has been to understand the evolution of human purposes over time. Today we’ve got all kinds of purposes. If you look back a thousand years, people’s goals were quite different: How do I get my food? How do I keep myself safe? In the modern Western world, for the most part you don’t spend a large fraction of your life thinking about those purposes. From the point of view of a thousand years ago, some of the goals people have today would seem utterly bizarre—exercising on a treadmill, for example. A thousand years ago that would sound like a crazy thing to do.

What will people be doing in the future? A lot of purposes we have today are generated by scarcity of one kind or another. There are scarce resources in the world. People want to get more of something. Time itself is scarce in our lives. Eventually, those forms of scarcity will disappear. The most dramatic discontinuity will surely be when we achieve effective human immortality. Whether this will be achieved biologically or digitally isn’t clear, but inevitably it will be achieved. Many of our current goals are driven in part by our mortality: “I’m only going to live a certain time, so I’d better get this or that done.” And what happens when most of our goals are executed automatically? We won’t have the kinds of motivations we have today. One question I’d like an answer for is, What do the derivatives of humans in the future end up choosing to do with themselves? One of the potential bad outcomes is that they just play video games all the time.

The term “artificial intelligence” is evolving in its use in technical language. These days, AI is very popular, and people have some idea of what it means. Back when computers were being developed, in the 1940s and 1950s, the typical title of a book or a magazine article about computers was “Giant Electronic Brains.” The idea was that just as bulldozers and steam engines and so on automated mechanical work, computers would automate intellectual work. That promise turned out to be harder to fulfill than many people expected. There was, at first, a great deal of optimism; a lot of government money got spent on such efforts in the early 1960s. They basically just didn’t work.

There are a lot of amusing science-fiction-ish portrayals of computers in the movies of that time. There’s a cute one called Desk Set, which is about an IBM-type computer being installed in a broadcasting company and putting everybody out of a job. It’s cute because the computer gets asked a bunch of reference-library questions. When my colleagues and I were building Wolfram|Alpha, one of the ideas we had was to get it to answer all of those reference-library questions from Desk Set. By 2009, it could answer them all.

In 1943, Warren McCulloch and Walter Pitts came up with a model for how brains might work, conceptually and formally—an artificial neural network. They saw that their brainlike model would do computations in the same way as Turing Machines. From their work, it emerged that we could make brainlike neural networks that would act as general computers. And in fact, the practical work done by the ENIAC folks and John
von Neumann and others on computers came directly not from Turing Machines but through this bypath of neural networks. But simple neural networks didn’t do much. Frank Rosenblatt invented a learning device he called the perceptron, which was a one-layer neural network. In the late sixties, Marvin Minsky and Seymour Papert wrote a book titled Perceptrons, in which they basically proved that perceptrons couldn’t do anything interesting, which is correct: perceptrons could only make linear distinctions between things. So the idea was more or less dropped. People said, “These guys have written a proof that neural networks can’t do anything interesting, therefore no neural networks can do anything interesting, so let’s forget about neural networks.” That attitude persisted for some time.

Meanwhile, there were a couple of other approaches to AI. One was based on understanding, at a formal level, symbolically, how the world works; the other was based on doing statistics and probabilistic kinds of things. With regard to symbolic AI, one of the test cases was, Can we teach a computer to do something like integrals? Can we teach a computer to do calculus? There were tasks like machine translation, which people thought would be a good example of what computers could do. The bottom line is that by the early seventies, that approach had crashed. Then there was a trend toward devices called expert systems, which arose in the late seventies and early eighties. The idea was to have a machine learn the rules that an expert uses and thereby figure out what to do. That petered out. After that, AI became little more than a crazy pursuit.

I had been interested in how you make an AI-like machine since I was a kid. I was interested particularly in how you take the knowledge we humans have accumulated in our civilization and automate answering questions on the basis of that knowledge. I thought about how you could do that symbolically, by building a system that could break down questions into symbolic units and answer them. I worked on neural networks at that time and didn’t make much progress, so I put it aside for a while.

Around 2002 and 2003, I thought about that question again: What does it take to make a computational knowledge system? The work I’d done by then pretty much showed that my original belief about how to do this was completely wrong. My original belief had been that in order to make a serious computational knowledge system, you first had to build a brainlike device and then feed it knowledge—just as humans learn in standard education. Now I realized that there wasn’t a bright line between what is intelligent and what is simply computational. I had assumed that there was some magic mechanism that made us vastly more capable than anything that was just computational. But that assumption was wrong.

This insight is what led to Wolfram|Alpha. What I discovered is that you can take a large collection of the world’s knowledge and automatically answer questions on the basis of it, using what are essentially merely computational techniques. It was an alternative way to do engineering—a way that’s much more analogous to what biology does in evolution. In effect, what you normally do when you build a program is build it step-by-step. But you can also explore the computational universe and mine technology from that universe.
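A toy version of that mining can be written down directly. This sketch is my illustration, not Wolfram’s actual search procedure (the 64-by-64 grid and the compressed-size score are arbitrary stand-ins for “interesting”): enumerate all 256 elementary cellular-automaton rules and rank them by how poorly their output compresses.

```python
# Mining the computational universe, toy version: run all 256 elementary
# cellular-automaton rules and rank them by the incompressibility of their
# output, a crude proxy for interesting behavior. (Illustrative sketch only.)
import zlib

def run(rule, width=64, steps=64):
    row = [0] * (width // 2) + [1] + [0] * (width // 2 - 1)
    history = []
    for _ in range(steps):
        history.extend(row)
        row = [(rule >> (4 * row[(i - 1) % width] + 2 * row[i]
                         + row[(i + 1) % width])) & 1
               for i in range(width)]
    return bytes(history)

def score(rule):
    # Regular patterns compress well; chaotic ones do not.
    return len(zlib.compress(run(rule)))

best = sorted(range(256), key=score, reverse=True)[:5]
print("least compressible rules:", best)  # chaotic rules such as 30 rank high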
Nothing in that search is engineered; the interesting rules are simply found, which is the sense in which this is mining rather than construction. Typically, the challenge is the same as in physical mining: you find a supply of, let’s say, iron, or cobalt, or gadolinium, with some special magnetic properties,
and you turn that special capability to a human purpose, to something you want technology to do. In the case of magnetic materials, there are plenty of ways to do that. In terms of programs, it’s the same story. There are all kinds of programs out there, even tiny programs that do complicated things. Could we entrain them for some useful human purpose?

And how do you get AIs to execute your goals? One answer is to just talk to them, in the natural language of human utterances. It works pretty well when you’re talking to Siri. But when you want to say something longer and more complicated, it doesn’t work well. You need a computer language that can represent sophisticated concepts in a way that can be progressively built up and isn’t possible in natural language. What my company spent a lot of time doing was building a knowledge-based language that incorporates the knowledge of the world directly into the language. The traditional approach to creating a computer language is to make a language that represents operations that computers intrinsically know how to do: allocating memory, setting values of variables, iterating things, changing program counters, and so on. Fundamentally, you’re telling computers to do things in their own terms. My approach was to make a language that panders not to the computers but to the humans, to take whatever a human thinks of and convert it into some form that the computer can understand. Could we encapsulate the knowledge we’d accumulated, both in science and in data collection, into a language we could use to communicate with computers? That’s the big achievement of my last thirty years or so—being able to do that.

Back in the 1960s, people would say things like, “When we can do such-and-such, we’ll know we have AI. When we can do an integral from a calculus course, we’ll know we have AI. When we can have a conversation with a computer and make it seem human...,” et cetera. The difficulty was, “Well, gosh, the computer just doesn’t know enough about the world.” You’d ask the computer what day of the week it was, and it might be able to answer that. You’d ask it who the President was, and it probably couldn’t tell you. At that point, you’d know you were talking to a computer and not a person. But now when it comes to these Turing Tests, people who’ve tried connecting, for example, Wolfram|Alpha to their Turing Test bots find that the bots lose every time. Because all you have to do is start asking the machine sophisticated questions and it will answer them! No human can do that. By the time you’ve asked it a few disparate questions, there will be no human who knows all those things, yet the system will know them. In that sense, we’ve already achieved good AI, at that level.

Then there are certain kinds of tasks easy for humans but traditionally very hard for machines. The standard one is visual object identification: What is this object? Humans can recognize it and give some simple description of it, but a computer was just hopeless at that. A couple of years ago, though, we brought out a little image-identification system, and many other companies have done something similar—ours happens to be somewhat better than the rest. You show it an image, and for about ten thousand kinds of things, it will tell you what it is. It’s fun to show it an abstract painting and see what it says. But it does a pretty good job. It works using the same neural-network technology that McCulloch and Pitts imagined in 1943 and lots of us worked on in the early eighties.
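The Minsky–Papert limitation mentioned earlier, and the way layered networks escape it, both fit in a few lines. In this sketch (mine, for illustration; the small integer weight range is just enough to make the point), a single threshold unit is shown by exhaustive search to be unable to compute XOR, because no one line separates XOR’s true cases from its false ones, while two layers of McCulloch–Pitts-style units compute it exactly:

```python
# Why single-layer perceptrons fail (Minsky & Papert) and one hidden
# layer succeeds: XOR is not linearly separable. An illustrative sketch.
from itertools import product

def step(z):                      # a McCulloch-Pitts threshold unit
    return 1 if z > 0 else 0

# One layer: out = step(w1*a + w2*b + bias). Try every small integer
# weight; none reproduces XOR, because no single line separates
# {(0,1),(1,0)} from {(0,0),(1,1)}.
def one_layer_can_do_xor():
    for w1, w2, bias in product(range(-3, 4), repeat=3):
        if all(step(w1 * a + w2 * b + bias) == (a ^ b)
               for a, b in product((0, 1), repeat=2)):
            return True
    return False

print("single layer solves XOR:", one_layer_can_do_xor())  # False

# Two layers: compute OR and NAND in a hidden layer, then AND them.
def xor(a, b):
    h_or = step(a + b - 0.5)      # fires unless both inputs are 0
    h_nand = step(1.5 - a - b)    # fires unless both inputs are 1
    return step(h_or + h_nand - 1.5)

print([xor(a, b) for a, b in product((0, 1), repeat=2)])  # [0, 1, 1, 0]
```

Modern image-identification networks are the same idea scaled up: many layers of simple units composed together, with the hard thresholds softened so that the whole composition can be trained incrementally by calculus.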
Back in the 1980s, people successfully did OCR—optical character recognition. They took the twenty-six letters of the alphabet and said, “OK, is that an A? Is that a B? Is that a C?” and so on.
That could be done for twenty-six different possibilities, but it couldn’t be done for ten thousand. It’s the scaling up of the whole system that makes this possible today. There are maybe five thousand picturable common nouns in English, ten thousand if you include things like special kinds of plants and beetles, which people would recognize with some frequency. What we did was train our system on 30 million images of these kinds of things. It’s a big, complicated, messy neural network. The details of the network probably don’t matter, but it takes about a quadrillion GPU operations to do the training.

Our system is impressive because it pretty much matches what humans can do. It has about the same training data humans have—about the same number of images a human infant would see in the first couple of years of its life. Roughly the same number of operations have to be done in the learning process, using about the same number of neurons as in at least the first levels of our visual cortex. The details are different; the way these artificial neurons work has little to do with how the brain’s neurons work. But the concept is similar, and there’s a certain universality to what’s going on. At the mathematical level, it’s a composition of a very large number of functions, with certain continuity properties that let you use calculus methods to incrementally train the system. Given those attributes, you can end up with something that does the same job human brains do in physiological recognition.

But does this constitute AI? There are a few basic components. There’s physiological recognition, there’s voice-to-text, there’s language translation—things humans manage to do with varying degrees of difficulty. These are essentially some of the links to how we make machines that are humanlike in what they do. For me, one of the interesting things has been incorporating those capabilities into a precise symbolic language to represent the everyday world. We now have a system that can say, “This is a glass of water.” We can go from a picture of a glass of water to the concept of a glass of water. Now we have to invent some actual symbolic language to represent those concepts.

I began by trying to represent mathematical, technical kinds of knowledge and went on to other kinds of knowledge. We’ve done a pretty good job of representing objective knowledge in the world. Now the problem is to represent everyday human discourse in a precise symbolic way—a knowledge-based language intended for communication between humans and machines, so that humans can read it and machines can understand it, too. For instance, you might say “X is greater than 5.” That’s a predicate. You might also say, “I want a piece of chocolate.” That’s also a predicate. It has an “I want” in it. We have to find a precise symbolic representation of the desires we express in human natural language.

In the late 1600s, Gottfried Leibniz, John Wilkins, and others were concerned with what they called philosophical languages—that is, complete, universal, symbolic representations of things in the world. You can look at the philosophical language of John Wilkins and see how he divided up what was important in the world at the time. Some aspects of the human condition have been the same since the 1600s. Some are very different. His section on death and various forms of human suffering was huge; in today’s ontology, it’s a lot smaller.
It’s interesting to see how a philosophical language of today would differ from a philosophical language of the mid-1600s. It’s a measure of our progress. Many such attempts at formalization have happened over the years. In
mathematics, for example: Whitehead and Russell’s Principia Mathematica in 1910 was the biggest showoff effort. There were previous attempts by Gottlob Frege and Giuseppe Peano that were a little more modest in their presentation. Ultimately, they were wrong in what they thought they should formalize: They thought they should formalize some process of mathematical proof, which turns out not to be what most people care about.

With regard to a modern analog of the Turing Test, it’s an interesting question. There’s still the conversational bot, which is Turing’s idea. That one hasn’t been solved yet. It will be solved—the only question is, What is the application for which it is solved? For a long time I would ask, “Why should we care?”—because I thought the principal application would be customer service, which wasn’t particularly high on my list. But customer service, where you’re trying to interface, is just where you need this conversational language.

One big difference between Turing’s time and ours is the method of communicating with computers. In his time, you typed something into the machine and it typed back a response. In today’s world, it responds with a screen—as, for instance, when you want to buy a movie ticket. How is a transaction with a machine different from a transaction with a human? The main answer is that there’s a visual display. It asks you something, and you press a button, and you can see the result immediately. For example, when Wolfram|Alpha is used inside Siri and there’s a short answer, Siri will tell you the short answer. But what most people want is the visual display, showing the infographic of this or that. This is a nonhuman form of communication that turns out to be richer than the traditional spoken, or typed, human communication. In most human-to-human communication, we’re stuck with pure language, whereas in computer-to-human communication we have this much higher-bandwidth channel of visual communication.

Many of the most powerful applications of the Turing Test fall away now that we have this additional communication channel. For example, here’s one we’re pursuing right now. It’s a bot that communicates about writing programs: You say, “I want to write a program. I want it to do this.” The bot will say, “I’ve written this piece of program. This is what it does. Is this what you want?” Blah-blah-blah. It’s a back-and-forth bot. Devising such systems is an interesting problem, because they have to have a model of a human if they’re trying to explain something to you. They have to know what the human is confused about.

What has long been difficult for me to understand is, What’s the point of a conventional Turing Test? What’s the motivation? As a toy, one could make a little chat bot that people could chat with. That will be the next thing. The current round of deep learning—particularly, recurrent neural networks—is making pretty good models of human speech and human writing. We can type in, say, “How are you feeling today?” and it knows most of the time what sort of response to give. But I want to figure out whether I can automate responding to my email. I know the answer is “No.” A good Turing Test, for me, will be when a bot can answer most of my email. That’s a tough test. It would have to learn those answers from the humans the email is connected to. I might be a little bit ahead of the game, because I’ve been collecting data on myself for about twenty-five years. I have every piece of email for twenty-five years, every keystroke for twenty.
I should be able to train an avatar, an AI, that will do what I can do—perhaps better than I could.
People worry about the scenario in which AIs take over. I think something much more amusing, in a sense, will happen first. The AI will know what you intend, and it will be good at figuring out how to get there. I tell my car’s GPS I want to go to a particular destination. I don’t know where the heck I am; I just follow my GPS. My children like to remind me that once when I had a very early GPS—the kind that told you, “Turn this way, turn that way”—we ended up on one of the piers going out into Boston Harbor. More to the point is that there will be an AI that knows your history, and knows that when you’re ordering dinner online you’ll probably want such-and-such, or that when you email this person, you should talk to them about such-and-such. More and more, the AIs will suggest to us what we should do, and I suspect most of the time people will just go along with that. It’s good advice—better than what you would have figured out for yourself.

As far as the takeover scenario is concerned, you can do terrible things with technology and you can do good things with technology, and some people will try each. One of the things I like about today’s technology is the equalization it has produced. I used to be proud that I had a better computer than anybody I knew; now we all have the same kind of computers. We have the same smartphones, and pretty much the same technology can be used by a decent fraction of the planet’s 7 billion people. It’s not the case that the king’s technology is different from everybody else’s. That’s an important advance.

The great frontier five hundred years ago was literacy. Today, it’s doing programming of some kind. Much of today’s programming will be obsolete before long. For example, people no longer learn assembly language, because computers are better at writing assembly language than humans are, and only a small set of people need to know the details of how a language gets compiled into assembly language. A lot of what’s being done by armies of programmers today is similarly mundane. There’s no good reason for humans to be writing Java code or JavaScript code. We want to automate the programming process so that what’s important goes from what the human wants done to getting the machine, as automatically as possible, to do it.

This will increase that equalization, which is something I’m interested in. In the past, if you wanted to write a serious piece of code, or program for something important and real, it was a lot of work. You had to know quite a bit about software engineering, you had to invest months of time in it, you had to hire programmers who knew this or you had to learn it yourself. It was a big investment. That’s not true anymore. A one-line piece of code already does something interesting and useful (one example appears after this passage). It allows a vast range of people who couldn’t make computers do things for them to make computers do things for them.

Something I’d like to see is a lot of kids around the world learn the new capabilities of knowledge-based programming and then produce code that’s effectively as sophisticated as what anybody in the top ranks can produce. This is within reach. We’re at the point where anybody can learn to do knowledge-based programming, and, more important, learn to think computationally. The actual mechanics of programming are easy now. What’s difficult is imagining things in a computational way.
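Here is the promised one-line example, a toy of my own written in Python rather than Wolfram Language: a single line that reads its own source file and prints its five most common words.

```python
# A one-line program that already does something interesting: report the
# five most common words in this very file. (Illustrative toy only.)
print(__import__("collections").Counter(open(__file__).read().split()).most_common(5))
```

The point is not the particular line but the threshold: no project setup, no software engineering, just an idea expressed and immediately executed.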
How do you teach computational thinking? In terms of how to do programming, it’s an interesting question. Take nanotechnology. How did we achieve nanotechnology? Answer: We took technology as we understand it on a large scale and made it very small. How do we make a CPU chip on the atomic scale? Fundamentally, we use the same architecture as the CPU chip we know and love.

That isn’t the only approach one can take. Looking at what simple programs can do suggests that you can take even simple, impoverished components and, with the right compiler, make them do interesting things. We don’t do molecular-scale computing yet, because the ambient technology is such that you’d have to spend a decade building it. But we’ve got the components that are enough to make a universal computer. You might not know how to program with those components, but by doing searches in the space of possible programs, you’d start to amass building blocks, and you could then create a compiler for them. The surprising thing is that impoverished stuff is capable of doing sophisticated things, and the compilation step is not as gruesome as you might expect. Just searching the computational universe and trying to find programs—building blocks—that are interesting is a good approach. A more traditional engineering approach—trying by pure thought to figure out how to build a universal computer—is a harder row to hoe. That doesn’t mean it can’t be done, but my guess is that we’ll be able to do some amazing things just by finding the components and searching the possible programs we can make with them. Then it’s back to the question about connecting human purposes to what is available from the system.

One question I’m interested in is, What will the world look like when most people can write code? We had a transition, maybe five hundred years ago or so, when only scribes and a small set of the population could read and write natural language. Today, a small fraction of the population can write code. Most of the code they write is for computers only; you don’t understand things by reading code. But there will come a time when, as a result of things I’ve tried to do, the code is at a high enough level that it’s a minimal description of what you’re trying to do. It will be a piece of code that’s understandable to humans but also executable by the machines.

Coding is a form of expression, just as writing in a natural language is a form of expression. To me, some simple pieces of code are poetic—they express ideas in a very clean way. There’s an aesthetic aspect, much as there is to expression in a natural language. One feature of code is that it’s immediately executable; it’s not like writing. When you write something, somebody has to read it, and the brain that’s reading it has to absorb the thoughts that came from the person who did the writing.

Look at how knowledge has been transmitted in the history of the world. At level zero, one form of knowledge transmission is essentially genetic—that is, there’s an organism, and its progeny has the same features that it had. Then there’s the kind of knowledge transmission that happens with things like physiological recognition: A newborn creature has some neural network with some random connections in it, and as the creature moves around in the world, it starts recognizing kinds of objects and it learns that knowledge. Then there’s the level that was the big achievement of our species, which is natural language.
The ability to represent knowledge abstractly enough that we can communicate it brain to brain, so to speak. Arguably, natural language is our species’ most important invention. It’s what led, in many respects, to our civilization.
There’s yet another level, and probably one day it will have a more interesting name. With knowledge-based programming, we have a way of creating an actual representation of real things in the world, in a precise and symbolic way. Not only is it understandable by brains and communicable to other brains and to computers, it’s also immediately executable. Just as natural language gave us civilization, knowledge-based programming will give us—what? One bad answer is that it will give us the civilization of the AIs. That’s what we don’t want to happen, because the AIs will do a great job communicating with one another and we’ll be left out of it, because there’s no intermediate language, no interface with our brains. What will this fourth level of knowledge communication lead to? If you were Caveman Ogg and you were just realizing that language was starting, could you imagine the coming of civilization? What should we be imagining right now?

This relates to the question of what the world would look like if most people could code. Clearly, many trivial things would change: Contracts would be written in code, restaurant recipes might be written in code, and so on. But much more profound things would also change. The rise of literacy gave bureaucracy, which had already existed, a dramatic acceleration, producing a greater depth of governmental systems, for better or worse.

How does the coding world relate to the cultural world? Take high school education. If we have computational thinking, how does that affect how we study history? How does it affect how we study languages, social studies, and so on? The answer is, it has a great effect. Imagine you’re writing an essay. Today, the raw material for a typical high school student’s essay is something that’s already been written; students usually can’t generate new knowledge easily. But in the computational world, that will no longer be true. If students know something about writing code, they’ll access all that digitized historical data and figure out something new. Then they’ll write an essay about something they’ve discovered. The achievement of knowledge-based programming is that it’s no longer sterile, because it’s got the knowledge of the world knitted into the language you’re using to write code.

There’s computation all over the universe: in a turbulent fluid producing some complicated pattern of flow, in the celestial mechanics of planetary interactions, in brains. But does computation have a purpose? You can ask that about any system. Does the weather have a goal? Does climate have a goal? Can someone looking at Earth from space tell that there’s anything with a purpose there? Is there a civilization there? In the Great Salt Lake, in Utah, there’s a straight line. It turns out to be a causeway dividing two areas of the lake with different colors of algae, so it’s a very dramatic straight line. There’s a road in Australia that’s long and straight. There’s a railroad in Siberia that’s long, and lights go on when a train stops at the stations. So from space you can see straight lines and patterns. But are these clear enough examples of obvious purpose on Earth as viewed from space?

For that matter, how do we recognize extraterrestrials out there? How do we tell if a signal we’re getting indicates purpose? Pulsars were discovered in 1967, when we picked up a periodic flutter every second or so. The first question was, Is this a beacon?
Because what else would make a periodic signal? It turned out to be a rotating neutron star. One criterion to apply to a potentially purposeful phenomenon is whether it’s minimal in achieving a purpose. But does that mean it was built for the purpose? The ball rolls down the hill because of gravitational pull; or the ball rolls down the hill because it’s satisfying the principle of least action. There are typically these two explanations for some action that seems purposeful: the mechanistic explanation and the teleological one.

Essentially all of our existing technology fails the test of being minimal in achieving its purpose. Most of what we build is steeped in technological history, and it’s incredibly non-minimal for achieving its purpose. Look at a CPU chip; there’s no way that’s the minimal way to achieve what a CPU chip achieves. This question of how to identify purposefulness is a hard one. It’s an important question, because radio noise from the galaxy is very similar to CDMA transmissions from cell phones. Those transmissions use pseudo-noise sequences, which happen to have certain repeatability properties. But they come across as noise, and they’re set up as noise, so as not to interfere with other channels.

The issue gets messier. If we were to observe a sequence of primes being generated from a pulsar, we’d ask what generated them. Would it mean that a whole civilization grew up and discovered primes and invented computers and radio transmitters and did this? Or is there just some physical process making primes? There’s a little cellular automaton that makes primes. You can see how it works if you take it apart. It has a little thing bouncing inside it, and out comes a sequence of primes. It didn’t need the whole history of civilization and biology and so on to get to that point.

I don’t think there is abstract “purpose,” per se. I don’t think there’s abstract meaning. Does the universe have a purpose? Ask that and you’re doing theology in some way. There is no meaningful sense in which there is an abstract notion of purpose. Purpose is something that comes from history. One of the things that might be true about our world is that maybe we go through all this history and biology and civilization, and at the end of the day the answer is “42,” or something. We went through all those 4 billion years of various kinds of evolution and then we got to “42.”

Nothing like that will happen, because of computational irreducibility. There are computational processes you can go through in which there is no way to shortcut the process. Much of science has been about shortcutting computation done by nature. For example, if we’re doing celestial mechanics and want to predict where the planets will be a million years from now, we could follow the equations step-by-step. The big achievement in science is that we’re often able to shortcut that and reduce the computation: We can be smarter than the universe and predict the endpoint without going through all the steps. But for irreducible processes, even with a smart enough machine and smart enough mathematics, we can’t get to the endpoint without going through the steps. Some details are irreducible; we have to follow those steps irreducibly. That’s why history means something. If we could always get to the endpoint without going through the steps, history would be, in some sense, pointless.

So it’s not the case that we’re intelligent and everything else in the world is not. There’s no enormous abstract difference between us and the clouds, or us and the cellular automata.
We cannot say that this brainlike neural network is qualitatively
different from this cellular-automaton system. The difference is a detailed difference: This brainlike neural network was produced by the long history of civilization, whereas the cellular automaton was created by my computer in the last microsecond.

The problem of abstract AI is similar to the problem of recognizing extraterrestrial intelligence: How do you determine whether or not it has a purpose? This is a question I don’t consider answered. We’ll say things like, “Well, AI will be intelligent when it can do blah-blah-blah.” When it can find primes. When it can produce this and that and the other. But there are many other ways to get to those results. Again, there is no bright line between intelligence and mere computation. It’s another part of the Copernican story: We used to think Earth was the center of the universe. Now we think we’re special because we have intelligence and nothing else does. I’m afraid the bad news is that that isn’t a real distinction.

Here’s one of my scenarios. Let’s say there comes a time when human consciousness is readily uploadable into digital form, virtualized and so on, and pretty soon we have a box of a trillion souls, all virtualized. In the box, there will be molecular computing going on—maybe derived from biology, maybe not. But the box will be doing all kinds of elaborate stuff. And there’s a rock sitting next to the box. Inside the rock, all kinds of elaborate stuff is also going on—all kinds of subatomic particles doing all kinds of things. What’s the difference between the rock and the box of a trillion souls? The answer is that the details of what’s happening in the box were derived from the long history of human civilization, including whatever people watched on YouTube the day before, whereas the rock has its long geological history but not the particular history of our civilization.

Realizing that there isn’t a genuine distinction between intelligence and mere computation leads you to imagine that future—the endpoint of our civilization as a box of a trillion souls, each of them essentially playing a video game, forever. What is the “purpose” of that?