Fig. 15.3: A typical, though small, subnetwork of the Gene Ontology's hierarchical network.

the same purposes, they should be linked together. Portrayals of typical heterarchical linkage patterns among natural language concepts are given in Figures 15.5 and 15.6. Just for fun, Figure 15.7 shows one person's attempt to draw a heterarchical graph of the main concepts in one of Douglas Hofstadter's books. Naturally, real concept heterarchies are far larger, more complex and more tangled than even this one.

In CogPrime, ECAN enforces heterarchy by building SymmetricHebbianLinks, and PLN by building SimilarityLinks, IntensionalSimilarityLinks and ExtensionalSimilarityLinks. Furthermore, these various link types reinforce each other. PLN control is guided by importance spreading, which follows Hebbian links, so that a heterarchical Hebbian network tends to cause PLN to explore the formation of links following the same paths as the heterarchical HebbianLinks. And importance can spread along logical links as well as explicit Hebbian links, so that the existence of a heterarchical logical network will tend to cause the formation of additional heterarchical Hebbian links. Heterarchy reinforces itself in "autopoietic attractor" style even more simply and directly than hierarchy does.
Fig. 15.4: Small-scale portrayal of a portion of the spatiotemporal hierarchy in Jeff Hawkins' Hierarchical Temporal Memory architecture.

15.3.3 Dual Networks

Finally, if both hierarchical and heterarchical structures exist in an Atomspace, then both ECAN and PLN will naturally blend them together, because hierarchical and heterarchical links will feed into their link-creation processes and be combined to form new links. This will tend to produce a structure called a dual network, in which a hierarchy exists alongside a rich network of heterarchical links joining nodes in the hierarchy, with a particular density of links between nodes on the same hierarchical level. The dual network structure will emerge without any explicit engineering oriented toward it, simply via the existence of hierarchical and heterarchical networks, and the propensity of ECAN and PLN to be guided by both. The existence of a natural dual network structure in both linguistic and sensorimotor data will help the formation process along, and then creative cognition will enrich the dual network yet further than is directly necessitated by the external world.
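To make the dual-network notion concrete, the following toy sketch (in Python; the graph, link weights and density measure are all invented for illustration, and are not CogPrime data structures) computes one simple signature of a dual network: the fraction of heterarchical link weight that joins nodes on the same hierarchical level:

```python
# Toy illustration of the "dual network" structure described above: a
# hierarchy (a small is-a tree in the spirit of Figure 15.3) plus
# heterarchical similarity/Hebbian-style links, where heterarchical link
# weight concentrates among nodes on the same hierarchical level.
# The graph, weights and measure are illustrative assumptions.

hierarchy = {                      # child -> parent ("is-a"-style links)
    "cellular process": "biological process",
    "cell cycle": "cellular process",
    "cell division": "cellular process",
    "M phase": "cell cycle",
    "cytokinesis": "cell division",
}

def level(node):
    """Depth of a node below the hierarchy root."""
    depth = 0
    while node in hierarchy:
        node, depth = hierarchy[node], depth + 1
    return depth

heterarchy = {                     # symmetric links with strengths
    ("cell cycle", "cell division"): 0.8,   # same level: dense
    ("M phase", "cytokinesis"): 0.7,        # same level: dense
    ("cell cycle", "cytokinesis"): 0.2,     # cross-level: sparse
}

def same_level_weight_fraction(links):
    """Fraction of heterarchical link weight joining same-level nodes;
    a high value is one signature of dual-network structure."""
    same = sum(w for (a, b), w in links.items() if level(a) == level(b))
    total = sum(links.values())
    return same / total if total else 0.0

print(same_level_weight_fraction(heterarchy))   # -> ~0.88
```

In a genuine Atomspace the links would of course be built dynamically by ECAN and PLN; the point of the sketch is only the structural signature being measured.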
Fig. 15.5: Portions of a conceptual heterarchy centered on specific concepts.

Fig. 15.6: A portion of a conceptual heterarchy, showing the "dangling links" leading from this portion to the rest of the heterarchy.

A rigorous mathematical analysis of the formation of hierarchical, heterarchical and dual networks in CogPrime systems has not yet been undertaken, and would certainly be an interesting enterprise. As with the theory of small world networks, there is ample ground here for both theorem-proving and heuristic experimentation. However, the qualitative points made here are sufficiently well-grounded in intuition and experience to be of some use in guiding our
Fig. 15.7: A fanciful evocation of part of a reader's conceptual heterarchy related to Douglas Hofstadter's writings.

ongoing work. One of the nice things about emergent network structures is that they are relatively straightforward to observe in an evolving, learning AGI system, via visualization and inspection of structures such as the Atomspace.
Section V

A Path to Human-Level AGI
Chapter 16
AGI Preschool

Co-authored with Stephan Vladimir Bugaj

16.1 Introduction

In conversations with government funding sources or narrow AI researchers about AGI work, one of the topics that comes up most often is that of "evaluation and metrics" — i.e., AGI intelligence testing. We actually prefer to separate this into two topics: environments and methods for careful qualitative evaluation of AGI systems, versus metrics for precise measurement of AGI systems.

The difficulty of formulating bulletproof metrics for partial progress toward advanced AGI has become evident throughout the field, and in Chapter 8 we have elaborated one plausible explanation for this phenomenon, the "trickiness" of cognitive synergy. [LWML09], summarizing a workshop on "Evaluation and Metrics for Human-Level AI" held in 2008, discusses some of the general difficulties involved in this type of assessment, and some requirements that any viable approach must fulfill.

On the other hand, the lack of appropriate methods for careful qualitative evaluation of AGI systems has been much less discussed, but we consider it actually a more important issue — as well as an easier (though not easy) one to solve.

We haven't actually found the lack of quantitative intelligence metrics to be a major obstacle in our practical AGI work so far. Our OpenCogPrime implementation lags far behind the CogPrime design as articulated in Part 2 of this book, and according to the theory underlying CogPrime, the more interesting behaviors and dynamics of the system will occur only when all the parts of the system have been engineered to a reasonable level of completion and integrated together. So, the lack of a great set of metrics for evaluating the intelligence of our partially-built system hasn't impaired our progress much. Testing the intelligence of the current OpenCogPrime system is a bit like testing the flight capability of a partly-built airplane that only has stubs for wings, lacks tail-fins, has a much less efficient engine than the one that's been designed for use in the first "real" version of the airplane, and so on. There may be something to be learned from such preliminary tests, but making them highly rigorous isn't a great use of effort, compared to finishing the implementation of the design according to the underlying theory.

On the other hand, the problem of what environments and methods to use to qualitatively evaluate and study AGI progress has been considerably more vexing to us in practice, as we've proceeded in our work on implementing and testing OpenCogPrime and developing the CogPrime theory. When developing a complex system, it's nearly always valuable to see what the system does in some fairly rich, complex situations, in order to gain a better intuitive understanding of the parts and how they work together. In the context of human-level AGI, the theoretically best way to do this would be to embody one's AGI system in a humanlike body
and set it loose in the everyday human world; but of course, this isn't feasible given the current state of development of robotics technology. So one must seek approximations. Toward this end we have embodied OpenCogPrime in non-player characters in video game style virtual worlds, and carried out preliminary experiments embodying OpenCogPrime in humanoid robots. These are reasonably good options, but they have limitations and lead to subtle choices: what kind of game characters and game worlds, what kind of robot environments, etc.?

One conclusion we have come to, based largely on the considerations in Chapter 11 on development and Chapter 9 on the importance of environment, is that it may make sense to embed early-stage proto-AGI and AGI systems in environments reminiscent of those used for teaching young human children. In this chapter we will explore this approach in some detail: emulation, in either physical reality or a multiuser online virtual world, of an environment similar to the preschools used in early human childhood education. Complete specification of an "AGI Preschool" would require much more than a brief chapter; our goal here is to sketch the idea in broad outline, and give a few examples of the types of opportunities such an environment would afford for instruction, spontaneous learning, and formal and informal evaluation of certain sorts of early-stage AGI systems.

The material in this chapter will pop up fairly often later in the book. The AGI Preschool context will serve, throughout the following chapters, as a source of concrete examples of the various algorithms and structures. But it's not proposed merely as an expository tool; we are making the very serious proposal that sending AGI systems to a virtual or robotic preschool is an excellent way — perhaps the best way — to foster the development of human-level, human-like AGI.

16.1.1 Contrast to Standard AI Evaluation Methodologies

The reader steeped in the current AI literature may wonder why it's necessary to introduce a new methodology and environment for evaluating AGI systems. There are already very many different ways of evaluating AI systems out there ... do we really need another?

Certainly, the AI field has inspired many competitions, each of which tests some particular type or aspect of intelligent behavior. Examples include robot competitions; tournaments of computer chess, poker, backgammon and so forth at computer olympiads; trading-agent competitions; language and reasoning competitions like the Pascal Textual Entailment Challenge; and so on. In addition to these, there are many standard domains and problems used in the AI literature that are meant to capture the essential difficulties in a certain class of learning problems: standard datasets for face recognition, text parsing, supervised classification, theorem-proving, question-answering and so forth.

However, the value of these sorts of tests for AGI is predicated on the hypothesis that the degree of success of an AI program at carrying out some domain-specific task is correlated with the potential of that program for being developed into a robust AGI program with broad intelligence. If humanlike AGI and problem-area-specific "narrow AI" are in fact very different sorts of pursuits requiring very different principles, as we suspect, then these tests are not strongly relevant to the AGI problem.

There are also some standard evaluation paradigms aimed at AI going beyond specific tasks.
For instance, there is a literature on "multitask learning" and "transfer learning," where the goal for an AI is to learn one task quicker given another task solved previously [Car97, TM95,
BDS03, TS07, RZDK05]. This is one of the capabilities an AI agent will need in order to simultaneously learn different types of tasks, as proposed in the Preschool scenario given here. And there is a literature on "shaping," where the idea is to build up the capability of an AI by training it on progressively more difficult versions of the same tasks [LD03]. Again, this is one sort of capability an AI will need to possess if it is to move up some type of curriculum, such as a school curriculum.

While we applaud the work done on multitask learning and shaping, we feel that exploring these processes using mathematical abstractions, or in the domain of various narrowly-circumscribed machine-learning or robotics test problems, may not adequately address the problem of AGI. The problem is that generalization among tasks, or from simpler to more difficult versions of the same task, is a process whose nature may depend strongly on the overall nature of the set of tasks and task-versions involved. Real-world tasks have a subtlety of interconnectedness and developmental course that is not captured in current mathematical learning frameworks nor in standard AI test problems.

To put it mathematically, we suggest that the universe of real-world human tasks has a host of "special statistical properties" that have implications regarding what sorts of AI programs will be most suitable; and that, while exploring and formalizing the nature of these statistical properties is important, an easier and more reliable approach to AGI testing is to create a testing environment that embodies these properties implicitly, via its being an emulation of the cognitively meaningful aspects of the real-world human learning environment.

One way to see this point vividly is to contrast the current proposal with the "General Game Player" AI competition, in which AIs seek to learn to play games based on formal descriptions of the rules.¹ Clearly doing GGP well requires powerful AGI; and doing GGP even mediocrely probably requires robust multitask learning and shaping. But we suspect GGP is far inferior to AGI Preschool as an approach to testing early-stage AI programs aimed at roughly humanlike intelligence. This is because, unlike the tasks involved in AGI Preschool, the tasks involved in doing simple instances of GGP seem to have little relationship to humanlike intelligence or real-world human tasks.

¹ http://games.stanford.edu/
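As an aside, the "shaping" idea mentioned above can be made concrete with a minimal sketch; the agent model, mastery criterion and all numbers below are invented for illustration and do not come from the cited literature:

```python
# Minimal sketch of "shaping": train on progressively harder versions of a
# task, advancing only once the current difficulty level is mastered.
# The toy agent, task model and thresholds are illustrative assumptions.

import random

def task(difficulty):
    """Toy task: success probability falls with difficulty, rises with skill."""
    def attempt(skill):
        return random.random() < max(0.0, min(1.0, skill - difficulty + 1.0))
    return attempt

class ToyAgent:
    def __init__(self):
        self.skill = 0.0
    def practice(self, attempt):
        success = attempt(self.skill)
        self.skill += 0.01 if success else 0.002  # learn faster from success
        return success

def shape(agent, levels=(0.2, 0.4, 0.6, 0.8, 1.0), mastery=0.9, window=100):
    for difficulty in levels:                 # easy-to-hard curriculum
        attempt = task(difficulty)
        results = []
        while True:
            results.append(agent.practice(attempt))
            recent = results[-window:]
            if len(recent) == window and sum(recent) / window >= mastery:
                break                         # level mastered; move up
    return agent.skill

print(shape(ToyAgent()))
```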
16.2 Elements of Preschool Design

What we mean by an "AGI Preschool" is simply a porting to the AGI domain of the essential aspects of human preschools. While there is significant variance among preschools, there are also strong commonalities, grounded in educational theory and experience. We will briefly discuss both the physical design and educational curriculum of the typical human preschool, and which aspects transfer effectively to the AGI context.

On the physical side, the key notion in modern preschool design is the "learning center," an area designed and outfitted with appropriate materials for teaching a specific skill. Learning centers are designed to encourage learning by doing, which greatly facilitates learning processes based on reinforcement, imitation and correction (see Chapter 31 of Part 2 for a detailed discussion of the value of this combination); and also to provide multiple techniques for teaching the same skills, to accommodate different learning styles and prevent over-fitting and overspecialization in the learning of new skills.
Centers are also designed to cross-develop related skills. A "manipulatives center," for example, provides physical objects such as drawing implements, toys and puzzles, to facilitate development of motor manipulation, visual discrimination, and (through sequencing and classification games) basic logical reasoning. A "dramatics center," on the other hand, cross-trains interpersonal and empathetic skills along with bodily-kinesthetic, linguistic, and musical skills. Other centers, such as art, reading, writing, science and math centers, are also designed to train not just one area, but to center around a primary intelligence type while also cross-developing related areas. For specific examples of the learning centers associated with particular contemporary preschools, see [Nie98].

In many progressive, student-centered preschools, students are left largely to their own devices to move from one center to another throughout the preschool room. Generally, each center will be staffed by an instructor at some points in the day but not others, providing a variety of learning experiences. At some preschools students will be strongly encouraged to distribute their time relatively evenly among the different learning centers, or to focus on those learning centers corresponding to their particular strengths and/or weaknesses.

To imitate the general character of a human preschool, one would create several centers in a robot lab or virtual world. The precise architecture will best be adapted via experience, but initial centers would likely be:

• a blocks center: a table with blocks on it
• a language center: a circle of chairs, intended for people to sit around and talk with the robot
• a manipulatives center: with a variety of different objects of different shapes and sizes, intended to teach visual and motor skills
• a ball play center: where balls are kept in chests and there is space for the robot to kick the balls around
• a dramatics center: where the robot can observe and enact various movements

16.3 Elements of Preschool Curriculum

While preschool curricula vary considerably based on educational philosophy and regional and cultural factors, there is a great deal of common, shared wisdom regarding the most useful topics and methods for preschool teaching. Guided experiential learning in diverse environments, using varied materials, is generally agreed to be an optimal methodology for reaching a wide variety of learning types and capabilities. Hands-on learning provides grounding in specifics, whereas a diversity of approaches allows for generalization.

Core knowledge domains are also relatively consistent, even across various philosophies and regions. Language, movement and coordination, autonomous judgment, social skills, work habits, temporal orientation, spatial orientation, mathematics, science, music, visual arts, and dramatics are universal areas of learning which all early childhood education touches upon. The particulars of these skills may vary, but all human children are taught to function in these domains. The level of competency developed may vary, but general domain knowledge is provided. For example, most kids won't be the next Maria Callas, Ravi Shankar or Gene Ween, but nearly all learn to hear, understand and appreciate music.

Tables 16.1-16.3 review the key capabilities taught in preschools, and identify the most important specific skills that need to be evaluated in the context of each capability. These tables were assembled via surveying the curricula from a number of currently existing preschools employing different methodologies, both those based on formal academic cognitive theories [Sch07] and more pragmatic approaches such as Montessori [Mon12], Waldorf [SS03b], Brain Gym (www.braingym.org) and Core Knowledge (www.coreknowledge.org).
Type of Capability / Specific Skills to be Evaluated

Story Understanding:
• Understanding narrative sequence
• Understanding character development
• Dramatize a story
• Predict what comes next in a story

Linguistic:
• Give simple descriptions of events
• Describe similarities and differences
• Describe objects and their functions

Linguistic / Spatial-Visual:
• Interpreting pictures

Linguistic / Social:
• Asking questions appropriately
• Answering questions appropriately
• Talk about own discoveries
• Initiate conversations
• Settle disagreements
• Verbally express empathy
• Ask for help
• Follow directions

Linguistic / Scientific:
• Provide possible explanations for events or phenomena
• Carefully describe observations
• Draw conclusions from observations

Table 16.1: Categories of Preschool Curriculum, Part 1

16.3.1 Preschool in the Light of Intelligence Theory

Comparing Table 16.1 to Gardner's Multiple Intelligences (MI) framework briefly reviewed in Chapter 2, the high degree of harmony is obvious, and is borne out by more detailed analysis. Preschool curriculum as standardly practiced is very well attuned to MI, and naturally covers all the bases that Gardner identifies as important. And this is not at all surprising, since one of Gardner's key motivations in articulating MI theory was the pragmatics of educating humans with diverse strengths and weaknesses.

Regarding intelligence as "the ability to achieve complex goals in complex environments," it is apparent that preschools are specifically designed to pack a large variety of different micro-environments (the learning centers) into a single room, and to present a variety of different tasks in each environment. The environments constituted by preschool learning centers are designed as microcosms of the most important aspects of the environments faced by humans in their everyday lives.
Type of Capability / Specific Skills to be Evaluated

Logical-Mathematical:
• Categorizing
• Sorting
• Arithmetic
• Performing simple "proto-scientific experiments"

Nonverbal Communication:
• Communicating via gesture
• Dramatizing situations
• Dramatizing needs, wants
• Express empathy

Spatial-Visual:
• Visual patterning
• Self-expression through drawing
• Navigate

Objective:
• Assembling objects
• Disassembling objects
• Measurement
• Symmetry
• Similarity between structures (e.g. block structures and real ones)

Table 16.2: Categories of Preschool Curriculum, Part 2

Type of Capability / Specific Skills to be Evaluated

Interpersonal:
• Cooperation
• Display appropriate behavior in various settings
• Clean up belongings
• Share supplies

Emotional:
• Delay gratification
• Control emotional reactions
• Complete projects

Table 16.3: Categories of Preschool Curriculum, Part 3
16.4 Task-Based Assessment in AGI Preschool

Professional pedagogues such as [CM07] discuss evaluation of early childhood learning as intended to assess both specific curriculum content knowledge and the child's learning process. It should be as unobtrusive as possible, so that it just seems like another engaging activity, and the results should be used to tailor the teaching regimen, using different techniques to address weaknesses and reinforce strengths.

For example, with group building of a model car, students are tested on a variety of skills: procedural understanding, visual acuity, motor acuity, creative problem solving, interpersonal communications, empathy, patience, manners, and so on. With this kind of complex, yet engaging, activity as a metric, the teacher can see how each student approaches the process of understanding each subtask, and subsequently guide each student's focus differently depending on strengths and weaknesses.

In Tables 16.4 and 16.5 we describe some particular tasks that AGIs may be meaningfully assigned in the context of a general AGI Preschool design and curriculum as described above. Of course, this is a very partial list, and is intended as evocative rather than comprehensive. Any one of these tasks can be turned into a rigorous quantitative test, thus allowing the precise comparison of different AGI systems' capabilities; but we have chosen not to emphasize this point here, partly for space reasons and partly for philosophical ones. In some contexts the quantitative comparison of different systems may be the right thing to do, but as discussed in Chapter 17 there are also risks associated with this approach, including the emergence of an overly metrics-focused "bakeoff mentality" among system developers, and the overfitting of AI abilities to test-taking. What is most important is the isolation of specific tasks on which different systems may be experientially trained and then qualitatively assessed and compared, rather than the evaluation of quantitative metrics.

Task-oriented testing allows for feedback on applications of general pedagogical principles to real-world, embodied activities. This allows for iterative-refinement-based learning (shaping), and cross-development of knowledge acquisition and application (multitask learning). It also helps militate against both cheating and over-fitting, as teachers can make ad-hoc modifications to the tests to determine if either is happening, and correct for it if necessary.

E.g., consider a linguistic task in which the AGI is required to formulate a set of instructions encapsulating a given behavior (which may include components that are physical, social, linguistic, etc.). Note that although this is presented as centrally a linguistic task, it actually involves a diverse set of competencies, since the behavior to be described may encompass multiple real-world aspects.

To turn this task into a more thorough test, one might involve a number of human teachers and a number of human students. Before the test, an ensemble of copies of the AGI would be created, with identical knowledge state. Each copy would interact with a different human teacher, who would demonstrate to it a certain behavior. After testing the AGI on its own knowledge of the material, the teacher would then inform the AGI that it will be tested on its ability to verbally describe this behavior to another.
Then, the teacher goes away and the copy interacts with a series of students, attempting to convey to the students the instructions given by the teacher. The teacher can thereby assess both the AGI's understanding of the material, and its ability to explain it to the other students. This separates out assessment of understanding from assessment of ability to communicate understanding, attempting to avoid conflation of one with the other. The design of the training and testing needs to account for potential
Intelligence Type / Test

Linguistic:
• write a set of instructions
• speak on a subject
• edit a written piece of work
• write a speech
• commentate on an event
• apply positive or negative 'spin' to a story

Logical-Mathematical:
• perform arithmetic calculations
• create a process to measure something
• analyse how a machine works
• create a process
• devise a strategy to achieve an aim
• assess the value of a proposition

Musical:
• perform a musical piece
• sing a song
• review a musical work
• coach someone to play a musical instrument

Bodily-Kinesthetic:
• juggle
• demonstrate a sports technique
• flip a beer-mat
• create a mime to explain something
• toss a pancake
• fly a kite

Table 16.4: Prototypical preschool intelligence assessment tasks, Part 1

This testing protocol abstracts away from the particularities of any one teacher or student, and focuses on effectiveness of communication in a human context rather than according to formalized criteria. This is very much in the spirit of how assessment takes place in human preschools (with the exception of the copying aspect): formal exams are rarely given in preschool, but pragmatic, socially-embedded assessments are regularly made. By including the copying aspect, more rigorous statistical assessments can be made regarding the efficacy of different approaches for a given AGI design, independent of past teaching experiences.

The multiple copies may, depending on the AGI system design, then be reintegrated, and further "learning" be done by higher-order cognitive systems in the AGI that integrate the disparate experiences of the multiple copies. This kind of parallel learning is different from both the sequential learning that humans do, and the parallel presence of a single copy of an AGI (as in multiple-chat-room type experiments). All three approaches are worthy of study, to determine under what circumstances, and with which AGI designs, one is more successful than another.

It is also worth observing how this test could be tweaked to yield a test of generalization ability. After passing the above, the AGI could then be given a description of a new task
(acquisition), and asked to explain the new one (variation). And part of the training behavior might be carried out unobserved by the AGI, thus requiring the AGI to infer the omitted parts of the task it needs to describe.

Intelligence Type / Test

Spatial-Visual:
• design a costume
• interpret a painting
• create a room layout
• create a corporate logo
• design a building
• pack a suitcase or the trunk of a car

Interpersonal:
• interpret moods from facial expressions
• demonstrate feelings through body language
• affect the feelings of others in a planned way
• coach or counsel another

Table 16.5: Prototypical preschool intelligence assessment tasks, Part 2

Another popular form of early childhood testing is puzzle block games. These kinds of games can be used to assess a variety of important cognitive skills, and to do so in a fun way that not only examines but also encourages creativity and flexible thinking. Types of games include pattern matching games in which students replicate patterns described visually or verbally, pattern creation games in which students create new patterns guided by visually or verbally described principles, creative interpretation of patterns in which students find meaning in the forms, and free-form creation (a toy scoring sketch for a replication game appears at the end of this section). Such games may be individual or cooperative.

Cross-training and assessment of a variety of skills occurs with pattern block games: for example, interpretation of visual or linguistic instructions, logical procedure and pattern following, categorizing, sorting, general problem solving, creative interpretation, experimentation, and kinematic acuity. By making the games cooperative, various interpersonal skills involving communication and cooperation are also added to the mix.

The puzzle block context brings up some general observations about the role of kinematic and visuospatial intelligence in the AGI Preschool. Outside of robotics and computer vision, AI research has often downplayed these sorts of intelligence (though, admittedly, this has been changing in recent years, e.g. with increasing research focus on diagrammatic reasoning). But these abilities are not only necessary for navigating real (or virtual) spatial environments. They are also important components of a coherent, conceptually well-formed understanding of the world in which the student is embodied. Integrative training and assessment of the rigorous cognitive abilities most associated with both AI and "proper schooling" (such as linguistic and logical skills), along with kinematic and aesthetic/sensory abilities, is essential to the development of an intelligence that can both operate in and sensibly communicate about the real world in a roughly humanlike manner. Whether or not an AGI is targeted to interpret physical-world spatial data and perform tasks via robotics, in order to communicate ideas about a vast array of topics of interest to any intelligence in this world, an AGI must develop aspects of intelligence other than logical and linguistic cognition.
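As one concrete example, here is a toy scoring rule for the pattern replication games mentioned above. It is purely illustrative; as emphasized earlier in this chapter, qualitative assessment is generally preferred over rigid metrics, and the grid representation and penalty weight below are assumptions:

```python
# Toy scoring for a pattern block replication game: compare a student's
# arrangement of colored blocks on a grid against a target pattern.
# The representation and scoring rule are illustrative assumptions only.

target = {(0, 0): "red", (1, 0): "blue", (0, 1): "blue", (1, 1): "red"}

def replication_score(attempt: dict) -> float:
    """Fraction of target cells matched, with a penalty for extra blocks."""
    matched = sum(attempt.get(pos) == color for pos, color in target.items())
    extra = len(set(attempt) - set(target))
    return max(0.0, (matched - 0.5 * extra) / len(target))

print(replication_score({(0, 0): "red", (1, 0): "blue", (0, 1): "red"}))  # 0.5
```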
16.5 Beyond Preschool

Once an AGI passes preschool, what are the next steps? There is still a long way to go from preschool to an AGI system that is capable of, say, passing the Turing Test or serving as an effective artificial scientist.

Our suggestion is to extend the school metaphor further, and make use of existing curricula for higher levels of virtual education: grade school, secondary school, and all levels of post-secondary education. If an AGI can pass online primary and secondary schools such as e-tutor.com, and go on to earn an online degree from an accredited university, then clearly said AGI has successfully achieved "human-level, roughly humanlike AGI."

This sort of testing is interesting not only because it allows assessment of stages intermediate between preschool and adult, but also because it tests humanlike intelligence without requiring precise imitation of human behavior. If an AI can get a BA degree at an accredited university, via online coursework (assuming for simplicity courses where no voice interaction is needed), then we should consider that AI to have human-level intelligence. University coursework spans multiple disciplines, and the details of the homework assignments and exams are not known in advance, so, like a human student, the AGI team can't cheat. In addition to the core coursework, a schooling approach also tests basic social interaction and natural language communication, the ability to do online research, and general problem solving ability. However, there is no rigid requirement to be strictly humanlike in order to pass university classes.

Most of our concrete examples in the following chapters will pertain to the preschool context, because it's simple to understand, and because we feel that getting to the "AGI preschool student" level is going to be the largest leap. Once that level is attained, moving further will likely be difficult also, but we suspect it will be more a matter of steady incremental improvements — whereas the achievement of preschool-level functionality will be a large leap from the current situation.

16.6 Issues with Virtual Preschool Engineering

As noted above, there are two broad approaches to realizing the "AGI Preschool" idea: using the AGI to control a physical robot and then crafting a preschool environment suitable to the robot's sensors and actuators; or using the AGI to control a virtual agent in an appropriately rich virtual-world preschool. The robotic approach is harder from an AI perspective (as one must deal with problems of sensation and actuation), but easier from an environment-construction perspective. In the virtual world case, one quickly runs up against the current limitations of virtual world technologies, which have been designed mainly for entertainment or social-networking purposes, not with the requirements of AGI systems in mind.

In Chapter 9 we discussed the general requirements that an environment should possess to be supportive of humanlike intelligence. Referring back to that list, it's clear that current virtual worlds are fairly strong on multimodal communication, and fairly weak on naive physics. More concretely, if one wants a virtual world such that
1. one could carry out all the standard cognitive development experiments described in developmental psychology books
2. one could implement intuitively reasonable versions of all the standard activities at all the standard learning stations in a contemporary preschool

then current virtual world technologies appear not to suffice.

As reviewed above, typical preschool activities include, for instance: building with blocks, playing with clay, looking in a group at a picture book and hearing it read aloud, mixing ingredients together, rolling/throwing/catching balls, playing games like tag, hide-and-seek, Simon Says or Follow the Leader, measuring objects, cutting paper into different shapes, drawing and coloring, etc.

And, as typical, not necessarily representative, examples of tasks psychologists use to measure cognitive development (drawn mainly from the Piagetan tradition, without implying any assertion that this is the only tradition worth pursuing), consider the following:

1. Which row has more circles: A or B? A: O O OO O, B: OOOOO
2. If Mike is taller than Jim, and Jim is shorter than Dan, then who is the shortest? Who is the tallest?
3. Which is heavier: a pound of feathers or a pound of rocks?
4. Eight ounces of water is poured into a glass that looks like the fat glass in Figure 16.1, and then the same amount is poured into a glass that looks like the tall glass in Figure 16.2. Which glass has more water?
5. A lump of clay is rolled into a snake. All the clay is used to make the snake. Which has more clay in it — the lump or the snake?
6. There are two dolls in a room, Sally and Ann, each of which has her own box, with a marble hidden inside. Sally goes out for a minute, leaving her box behind; and Ann decides to play a trick on Sally: she opens Sally's box, removes the marble, and hides it in her own box. Sally returns, unaware of what happened. Where will Sally look for her marble?
7. Consider this rule about a set of cards that have letters on one side and numbers on the other: "If a card has a vowel on one side, then it has an even number on the other side." If you have 4 cards labeled "E K 4 7", which cards do you need to turn over to tell whether this rule is actually true? (A code sketch of this task's logic appears after this list.)
8. Design an experiment to figure out how to make a pendulum that swings more slowly versus less slowly

What we see from this ad hoc, partial list is that a lot of naive physics is required to make an even vaguely realistic preschool. A lot of preschool education is about the intersection between abstract cognition and naive physics. A more careful review of the various tasks involved in preschool education bears out this conclusion.
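Task 7 above is the classic Wason selection task. Here is a small sketch of its logic (illustrative only, not part of any proposed test battery):

```python
# Wason selection task (task 7 above): rule "vowel on one side implies an
# even number on the other." A card must be turned over exactly when its
# visible face could reveal a counterexample: a visible vowel (the hidden
# number might be odd) or a visible odd number (the hidden letter might
# be a vowel). Visible consonants and even numbers can never falsify it.

def must_turn(visible_face: str) -> bool:
    if visible_face.isalpha():
        return visible_face.upper() in "AEIOU"   # vowel: check hidden number
    return int(visible_face) % 2 == 1            # odd number: check hidden letter

cards = ["E", "K", "4", "7"]
print([c for c in cards if must_turn(c)])        # -> ['E', '7']
```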
With this in mind, in this section we will briefly describe an approach to extending current virtual world technologies that appears to allow the construction of a reasonably rich and realistic AGI Preschool environment, without requiring anywhere near a complete simulation of realistic physics.

Fig. 16.1: Part 1 of a Piagetan conservation of volume experiment: a child observes that two glasses obviously have the same amount of milk in them, and then sees the content of one of the glasses poured into a different-shaped glass.
Fig. 16.2: Part 2 of a Piagetan conservation of volume experiment: a child observes two different-shaped glasses, which (depending on the level of his cognition) he may be able to infer have the same amount of milk in them, due to the events depicted in Figure 16.1.

16.6.1 Integrating Virtual Worlds with Robot Simulators

One glaring deficit in current virtual world platforms is the lack of flexibility in terms of tool use. In most of these systems today, an avatar can pick up or utilize an object, or two objects can interact, only in specific, pre-programmed ways. For instance, an avatar might be able to pick up a virtual screwdriver only by the handle, rather than by pinching the blade between its fingers. This places severe limits on creative use of tools, which is absolutely critical in a preschool context.

The solution to this problem is clear: adapt existing generalized physics engines to mediate avatar-object and object-object interactions. This would require more computation than current approaches, but not more than is feasible in a research context. One way to achieve this goal would be to integrate a robot simulator with a virtual world or game engine, for instance modifying the OpenSim (opensimulator.org) virtual world to use the Gazebo (playerstage.sourceforge.net) robot simulator in place of its current physics engine. While tractable, such a project would require considerable software engineering effort.

16.6.2 BlocksNBeadsWorld

Another glaring deficit in current virtual world platforms is their inability to model physical phenomena other than rigid objects with any sophistication. In this section we propose a potential
solution to this issue: a novel class of virtual worlds called BlocksNBeadsWorld, consisting of the following aspects:

1. 3D blocks of various shapes, sizes and frictional coefficients, that can be stacked
2. Adhesive that can be used to stick blocks together, and that comes in two types: one of which can be removed by an adhesive-removing substance, and one of which cannot (though its bonds can be broken via sufficient application of force)
3. Spherical beads, each of which has intrinsic, unchangeable adhesion properties defined according to a particular, simple "adhesion logic"
4. Each block, and each bead, may be associated with multidimensional quantities representing its taste and smell; and may be associated with a set of sounds that are made when it is impacted with various forces at various positions on its surface

Interaction between blocks and beads is to be calculated according to standard Newtonian physics, which would be compute-intensive in the case of a large number of beads, but tractable using distributed processing. For instance, if 10K beads were used to cover a humanoid agent's face, this would provide a fairly wide diversity of facial expressions; and if 10K beads were used to form a blanket laid on a bed, this would provide a significant amount of flexibility in terms of rippling, folding and so forth. Yet this order of magnitude of interactions is very small compared to what is done in contemporary simulations of fluid dynamics or, say, quantum chromodynamics.

One key aspect of the spherical beads is that they can be used to create a variety of rigid or flexible surfaces, which may exist on their own or be attached to blocks-based constructs. The specific inter-bead adhesion properties of the beads could be defined in various ways, and will surely need to be refined via experimentation, but a simple scheme that seems to make sense is as follows. Each bead can have its surface tessellated into hexagons (the number of these can be tuned), and within each hexagon it can have two different adhesion coefficients: one for adhesion to other beads, and one for adhesion to blocks. The adhesion between two beads along a certain hexagon is then determined by their two adhesion coefficients; and the adhesion between a bead and a block is determined by the adhesion coefficient of the bead and the adhesion coefficient of the adhesive applied to the block. A distinction must be drawn between rigid and flexible adhesion: rigid adhesion sticks a bead to something in a way that can't be undone except by breaking it off; whereas flexible adhesion just keeps a bead very close to the thing it's stuck onto. Any two entities may be stuck together either rigidly or flexibly. Sets of beads with flexible adhesion to each other can be used to make entities like strings, blankets or clothes. (A code sketch of this adhesion logic appears after the following list.)

Using the above adhesion logic, it seems one could build a wide variety of flexible structures using beads, such as (to give a very partial list):

1. fabrics with various textures, that can be draped over blocks structures
2. multilayered coatings to be attached to blocks structures, serving (among many other examples) as facial expressions
3. liquid-type substances with varying viscosities, that can be poured between different containers, spilled, spread, etc.
4. strings that can be tied in knots; rubber bands that can be stretched; etc.
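Here is a minimal sketch of the adhesion logic just described. The product combination rule, the rigid/flexible threshold and all numeric values are assumptions for illustration; as noted above, the exact scheme would be refined via experimentation:

```python
# Sketch of the per-hexagon bead "adhesion logic" described above.
# Each bead's surface is tessellated into hexagons; each hexagon carries
# one coefficient for bead-bead adhesion and one for bead-block adhesion.
# Combining the two sides' coefficients via a product is an assumption;
# the text leaves the exact combination rule open to experimentation.

from dataclasses import dataclass

@dataclass
class Hexagon:
    bead_coeff: float    # adhesion to other beads, in [0, 1]
    block_coeff: float   # adhesion to block-applied adhesive, in [0, 1]

@dataclass
class Adhesive:
    coeff: float         # strength of glue applied to a block face
    removable: bool      # True: dissolvable; False: only breakable by force

RIGID_THRESHOLD = 0.8    # assumed cutoff between flexible and rigid bonds

def bead_bead_adhesion(h1: Hexagon, h2: Hexagon) -> float:
    """Adhesion between two beads along their facing hexagons."""
    return h1.bead_coeff * h2.bead_coeff

def bead_block_adhesion(h: Hexagon, glue: Adhesive) -> float:
    """Adhesion between a bead hexagon and a glued block face."""
    return h.block_coeff * glue.coeff

def bond_type(strength: float) -> str:
    """Rigid bonds lock relative position; flexible bonds only keep the
    two entities close together (strings, blankets, clothes)."""
    return "rigid" if strength >= RIGID_THRESHOLD else "flexible"

# Example: a blanket-style bead (weak adhesion) meeting a sticky bead
blanket, sticky = Hexagon(0.3, 0.1), Hexagon(0.95, 0.9)
s = bead_bead_adhesion(blanket, sticky)
print(s, bond_type(s))   # -> 0.285 flexible
```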
Of course there are various additional features one could add. For instance, one could add a special set of rules for vibrating strings, allowing BlocksNBeadsWorld to incorporate the creation
of primitive musical instruments. Variations like this could be helpful, but aren't necessary for the world to serve its essential purpose.

Note that one does not have true fluid dynamics in BlocksNBeadsWorld; but it seems that the latter is not necessary to encompass the phenomena covered in cognitive developmental tests or preschool tasks. The tests and tasks that are done with fluids can instead be done with masses of beads. For example, consider the conservation of volume task shown in Figures 16.1 and 16.2: it's easy enough to envision this being done with beads rather than milk. Even a few hundred beads is enough to be psychologically perceived as a mass rather than a set of discrete units, and to be manipulated and analyzed as such. And the simplification of not requiring fluid mechanics in one's virtual world is immense.

Next, one can implement equations via which the adhesion coefficients of a bead are determined in part by the adhesion coefficients of nearby beads, or of beads that are nearby in certain directions (with direction calculated in local spherical coordinates). This will allow for complex cracking and bending behaviors — not identical to those in the real world, but with similar qualitative characteristics. For example, without this feature one could create paperlike substances that could be cut with scissors — but with this feature, one could go further and create woodlike substances that would crack when nails were hammered into them in certain ways, and so forth.

Further refinements are certainly possible also. One could add multidimensional adhesion coefficients, allowing more complex sorts of substances. One could allow beads to vibrate at various frequencies, which would lead to all sorts of complex wave patterns in bead compounds. And so on. In each case, the question to be asked is: what important cognitive abilities are dramatically more easily learnable in the presence of the new feature than in its absence?

The combination of blocks and beads seems ideal for implementing a more flexible and AGI-friendly type of virtual body than is currently used in games and virtual worlds. One can easily envision implementing a body with:

1. a skeleton whose bones consist of appropriately shaped blocks
2. joints consisting of beads, flexibly adhered to the bones
3. flesh consisting of beads, flexibly adhered to each other
4. internal "plumbing" consisting of tubes whose walls are beads rigidly adhered to each other, and flexibly adhered to the surrounding flesh (the plumbing could then serve to pass beads through, where slow passage would be ensured by weak adhesion between the walls of the tubes and the beads passing through the tubes)

This sort of body would support rich kinesthesia, and rich, broad analogy-drawing between the internally-experienced body and the externally-experienced world. It would also afford many interesting opportunities for flexible movement control. Virtual animals could be created along with virtual humanoids.

Regarding the extended mind, it seems clear that blocks and beads are adequate for the creation of a variety of different tools. Equipping agents with "glue guns" able to affect the adhesive properties of both blocks and beads would allow a diversity of building activity; and building with masses of beads could become a highly creative activity.
Furthermore, beads with appropriately specified adhesion (within the framework outlined above) could be used to form organically growing plant-like substances, based on the general principles used in L-system models of plant growth (Prusinkiewicz and Lindenmayer 1991). Structures with only beads would vaguely resemble herbaceous plants, while structures involving both blocks and beads would more resemble woody plants. One could even make organic structures that flourish or otherwise, based on the light available to them (without of course trying to simulate the chemistry of photosynthesis).
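For readers unfamiliar with L-systems, here is a toy string-rewriting example in the spirit of Prusinkiewicz and Lindenmayer; the mapping onto beads and blocks suggested in the comments is an illustrative assumption, not a fixed design choice:

```python
# Toy L-system: repeated parallel rewriting of symbols yields branching,
# plant-like structures. In a BlocksNBeadsWorld rendering, each symbol
# might map onto a bead (herbaceous parts) or a block (woody parts);
# that mapping is an illustrative assumption.

rules = {
    "X": "F[+X][-X]FX",   # X: growth point, branching left and right
    "F": "FF",            # F: stem segment, elongating each generation
}                          # '[' / ']' push/pop a branch; '+' / '-' turn

def grow(axiom: str, generations: int) -> str:
    s = axiom
    for _ in range(generations):
        s = "".join(rules.get(ch, ch) for ch in s)
    return s

plant = grow("X", 3)
print(len(plant), plant[:40])  # the structure grows rapidly per generation
```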
Some elements of chemistry may be achieved as well, though nowhere near what exists in physical reality. For instance, melting and boiling at least should be doable: assign every bead a temperature, and let solid interbead bonds turn liquid above a certain temperature and disappear completely above some higher temperature. You could even have a simple form of fire. Let fire be an element whose beads have negative gravitational mass. Beads of fuel elements like wood have a threshold temperature above which they will turn into fire beads, with release of additional heat.²

² Thanks are due to Russell Wallace for the suggestions in this paragraph.
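A minimal sketch of this temperature-driven rule; the thresholds and heat-release value are invented for illustration, since the text deliberately leaves them open:

```python
# Sketch of the bead phase rule described above: each bead carries a
# temperature; inter-bead bonds are solid below a melting point, liquid
# (flexible) up to a boiling point, and absent above it. Fuel beads
# ignite above a threshold, turning into buoyant "fire" beads and
# releasing heat. All numeric values are illustrative assumptions.

MELT, BOIL, IGNITE, HEAT_RELEASED = 50.0, 120.0, 200.0, 30.0

def bond_state(temp_a: float, temp_b: float) -> str:
    t = max(temp_a, temp_b)
    if t < MELT:
        return "solid"
    if t < BOIL:
        return "liquid"
    return "none"          # bond disappears: a gas-like mass of loose beads

def step_fuel_bead(bead: dict) -> float:
    """Ignite a fuel bead if hot enough; returns heat added to neighbors."""
    if bead["element"] == "wood" and bead["temp"] >= IGNITE:
        bead["element"] = "fire"
        bead["buoyant"] = True   # fire beads have negative gravitational mass
        return HEAT_RELEASED
    return 0.0

log = {"element": "wood", "temp": 210.0, "buoyant": False}
print(bond_state(20.0, 30.0), step_fuel_bead(log), log["element"])
# -> solid 30.0 fire
```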
The philosophy underlying these suggested bead dynamics is somewhat comparable to that outlined in Wolfram's book A New Kind of Science [Wol02]. There he proposes cellular automata models that emulate the qualitative characteristics of various real-world phenomena, without trying to match real-world data precisely. For instance, some of his cellular automata demonstrate phenomena very similar to turbulent fluid flow, without implementing the Navier-Stokes equations of fluid dynamics or trying to precisely match data from real-world turbulence. Similarly, the beads in BlocksNBeadsWorld are intended to qualitatively demonstrate the real-world phenomena most useful for the development of humanlike embodied intelligence, without trying to precisely emulate the real-world versions of these phenomena.

The above description has been left imprecisely specified on purpose. It would be straightforward to write down a set of equations for the block and bead interactions, but there seems little value in articulating such equations without also writing a simulation involving them and testing the ensuing properties. Due to the complex dynamics of bead interactions, the fine-tuning of the bead physics is likely to involve some tuning based on experimentation, so that any equations written down now would likely be revised anyway. Our goal here has been to outline a certain class of potentially useful environments, rather than to articulate a specific member of this class.

Without the beads, BlocksNBeadsWorld would appear purely as a "Blocks World with Glue" — essentially a substantially upgraded version of the Blocks Worlds frequently used in AI since they were first introduced in [Win72]. Certainly a pure "Blocks World with Glue" would have greater simplicity than BlocksNBeadsWorld, and greater richness than the standard Blocks World; but this simplicity comes with too many limitations, as shown by consideration of the various naive physics requirements inventoried above. One simply cannot run the full spectrum of humanlike cognitive development experiments, or preschool educational tasks, using blocks and glue alone. One can try to create analogous tasks using only blocks and glue, but this quickly becomes extremely awkward. In BlocksNBeadsWorld, by contrast, the capability for this full spectrum of experiments and tasks seems to fall out quite naturally.

What's missing from BlocksNBeadsWorld should be fairly obvious. There isn't really any distinction between a fluid and a powder: there are masses, but the types and properties of the masses are not the same as in the real world, and will surely lack the nuances of real-world fluid dynamics. Chemistry is also missing: processes like cooking and burning, although they can be crudely emulated, will not have the same richness as in the real world. The full complexity of body processes is not there: the body-design method mentioned above is far richer and more adaptive and responsive than current methods of designing virtual bodies in 3DSMax or Maya and importing them into virtual world or game engines, but it is still drastically simplistic compared to real bodies with their complex chemical signaling systems and couplings with other bodies and the environment. The hypothesis we're making in this section is that these lacunae aren't
that important from the point of view of humanlike cognitive development. We suggest that the key features of naive physics and folk psychology enumerated above can be mastered by an AGI in BlocksNBeadsWorld in spite of its limitations, and that — together with an appropriate AGI design — this probably suffices for creating an AGI with the inductive biases constituting humanlike intelligence.

To drive this point home more thoroughly, consider three potential virtual world scenarios:

1. A world containing realistic fluid dynamics, where a child can pour water back and forth between two cups of different shapes and sizes, to understand issues such as conservation of volume
2. A world more like today's Second Life, where fluids don't really exist, things like lakes are simulated via very simple rules, and pouring stuff back and forth between cups doesn't happen unless it's programmed into the cups in a very specialized way
3. A BlocksNBeadsWorld type world, where a child can pour masses of beads back and forth between cups, but not masses of liquid

Our qualitative judgment is that Scenario 3 is going to allow a young AI to gain the same essential insights as Scenario 1, whereas Scenario 2 is just too impoverished. We have explored dozens of similar scenarios regarding different preschool tasks and cognitive development experiments, and come to similar conclusions across the board. Thus, our current view is that something like BlocksNBeadsWorld can serve as an adequate infrastructure for an AGI Preschool, supporting the development of human-level, roughly human-like AGI.

And if this view turns out to be incorrect, and BlocksNBeadsWorld is revealed as inadequate, then we will very likely still advocate the conceptual approach enunciated above as a guide for designing virtual worlds for AGI. That is, we would suggest exploring the hypothetical failure of BlocksNBeadsWorld via asking two questions:

1. Are there basic naive physics or folk psychology requirements that were missed in creating the specifications based on which the adequacy of BlocksNBeadsWorld was assessed?
2. Does BlocksNBeadsWorld fail to sufficiently emulate the real world with respect to some of the articulated naive physics or folk psychology requirements?

The answers to these questions would guide the improvement of the world, or the design of a better one.

Regarding the practical implementation of BlocksNBeadsWorld, it seems clear that this is within the scope of modern game engine technology; however, it is not something that could be encompassed within an existing game or world engine without significant additions; it would require substantial custom engineering. There exist commodity and open-source physics engines that efficiently carry out Newtonian mechanics calculations; while they might require some tuning and extension to handle BlocksNBeadsWorld, the main issue would be achieving adequate speed of physics calculation, which given current technology would need to be done by modifying existing engines to appropriately distribute processing among multiple GPUs.

Finally, an additional avenue that merits mention is the use of BlocksNBeads physics internally within an AGI system, as part of an internal simulation world that allows it to make "mind's eye" estimative simulations of real or hypothetical physical situations.
There seems no reason that the same physics software libraries couldn't be used both for the external virtual world that the AGI's body lives in, and for an internal simulation world that the AGI uses as a cognitive tool. In fact, the BlocksNBeads library could be used as an internal cognitive tool by AGI systems controlling physical robots as well. This might require more tuning of the bead
dynamics to accord with the dynamics of various real-world systems; but this tuning would be beneficial for BlocksNBeadsWorld as well.
Chapter 17
A Preschool-Based Roadmap to Advanced AGI

17.1 Introduction

Supposing the CogPrime approach to creating advanced AGI is workable — then what are the right practical steps to follow? The various structures and algorithms outlined in Part 2 of this book should be engineered and software-tested, of course — but that's only part of the story. The AGI system implemented will need to be taught, and it will need to be placed in situations where it can develop an appropriate self-model and other critical internal network structures. The complex structures and algorithms involved will need to be fine-tuned in various ways, based on qualitatively observing the overall system's behavior in various situations. To get all this right without excessive confusion or time-wastage requires a fairly clear roadmap for CogPrime development.

In this chapter we'll sketch one particular roadmap for the development of human-level, roughly human-like AGI — which we're not selling as the only one, or even necessarily as the best one. It's just one roadmap that we have thought about a lot, and that we believe has a strong chance of proving effective. Given resources to pursue only one path for AGI development and teaching, this would be our choice, at present. The roadmap outlined here is not restricted to CogPrime in any highly particular ways, but it has been developed largely with CogPrime in mind; those developing other AGI designs could probably use this roadmap just fine, but might end up wanting to make various adjustments based on the strengths and weaknesses of their own approach.

What we mean here by a "roadmap" is, in brief: a sequence of "milestone" tasks, occurring in a small set of common environments or "scenarios," organized so as to lead to a commonly agreed upon set of long-term goals. I.e., what we are after here is a "capability roadmap" — a roadmap laying out a series of capabilities whose achievement seems likely to lead to human-level AGI. Other sorts of roadmaps, such as "tools roadmaps," may also be valuable, but are not our concern here.

More precisely, we confront the task of roadmapping by identifying scenarios in which to embed our AGI system, and then "competency areas" in which the AGI system must be evaluated. Then, we envision a roadmap as consisting of a set of one or more task-sets, where each task-set is formed from a combination of a scenario with a list of competency areas. To create a task-set one must choose a particular scenario, and then articulate a set of specific tasks, each one addressing one or more of the competency areas. Each task must then get associated with particular performance metrics — quantitative wherever possible, but perhaps qualitative
in some cases depending on the nature of the task. Here we give a partial task-set for the "virtual and robot preschool" scenarios discussed in Chapter 16, and a couple of example quantitative metrics just to illustrate what is intended; the creation of a fully detailed roadmap based on the ideas outlined here is left for future work.

The train of thought presented in this chapter emerged in part from a series of conversations preceding and during the "AGI Roadmap Workshop" held at the University of Tennessee, Knoxville in October 2008. Some of the ideas also trace back to discussions held during two workshops on "Evaluation and Metrics for Human-Level AI" organized by John Laird and Pat Langley (one in Ann Arbor in late 2008, and one in Tempe in early 2009). Some of the conclusions of the Ann Arbor workshop were recorded in [LWML09]. Inspiration was also obtained from discussion at the "Future of AGI" post-conference workshop of the AGI-09 conference, triggered by Itamar Arel's [ARK09a] presentation on the "AGI Roadmap" theme; and from an earlier article on AGI Roadmapping by [AL09].

However, the focus of the AGI Roadmap Workshop was considerably more general than the present chapter. Here we focus on preschool-type scenarios, whereas at the workshop a number of scenarios were discussed, including the preschool scenarios but also, for example:

• Standardized Tests and School Curricula: Elementary, Middle and High School Student
• General Videogame Learning
• Wozniak's Coffee Test: go into a random American house and figure out how to make coffee, and do it
• Robot College Student
• General Call Center Respondent

For each of these scenarios, one may generate tasks corresponding to each of the competency areas we will outline below. CogPrime is applicable in all these scenarios, so our choice to focus on preschool scenarios is an additional judgment call beyond those judgment calls required to specify the CogPrime design. The roadmap presented here is an "AGI Preschool Roadmap" and as such is a special case of the broader "AGI Roadmap" outlined at the workshop.

17.2 Measuring Incremental Progress Toward Human-Level AGI

In Chapter 2, we discussed several examples of practical goals that we find to plausibly characterize "human-level AGI," e.g.:

• Turing Test
• Virtual World Turing Test
• Online University Test
• Physical University Test
• Artificial Scientist Test

We also discussed our optimism regarding the possibility that in the future AGI may advance beyond the human level, rendering all these goals "early-stage subgoals." However, in this chapter we will focus our attention on the nearer term. The above goals are ambitious ones, and while one can talk a lot about how to precisely measure their achievement, we don't feel that's the most interesting issue to ponder at present. More critical is to think
about how to measure incremental progress. How do you tell when you're 25% or 50% of the way to having an AGI that can pass the Turing Test, or get an online university degree? Fooling 50% of the Turing Test judges is not a good measure of being 50% of the way to passing the Turing Test (that's too easy); and passing 50% of university classes is not a good measure of being 50% of the way to getting an online university degree (it's too hard: if one had an AGI capable of doing that, one would almost surely be very close to achieving the end goal). Measuring incremental progress toward human-level AGI is a subtle thing, and we argue that the best way to do it is to focus on particular scenarios and the achievement of specific competencies therein.

As we argued in Chapter 8, there are some theoretical reasons to doubt the possibility of creating a rigorous objective test for partial progress toward AGI: a test that would be convincing to skeptics, and impossible to "game" via engineering a system specialized to the test. Fortunately, though, we don't need a test of this nature for the purposes of assessing our own incremental progress toward advanced AGI, based on our knowledge about our own approach.

Based on the nature of the grand goals articulated above, there seems to be a very natural approach to creating a set of incremental capabilities building toward AGI: to draw on our copious knowledge about human cognitive development. This is by no means the only possible path; one can envision alternatives that have nothing to do with human development (and those might also be better suited to non-human AGIs). However, so much detailed knowledge about human development is available, along with solid knowledge that the human developmental trajectory does lead to human-level intelligence, that the motivation to draw on human cognitive development is quite strong.

The main problem with the human-development-inspired approach is that cognitive developmental psychology is not as systematic as it would need to be to translate directly into architectural principles and requirements. As noted above, while early thinkers like Piaget and Vygotsky outlined systematic theories of child cognitive development, these are no longer considered fully accurate, and one currently faces a mass of detailed theories of various aspects of cognitive development, but without a unified understanding. Nevertheless, we believe it is viable to work from the human-development data and understanding currently available, and craft a workable AGI roadmap therefrom.

With this in mind, what we give next is a fairly comprehensive list of the competencies that we feel AI systems should be expected to display in one or more of these scenarios in order to be considered as full-fledged "human-level AGI" systems. These competency areas have been assembled somewhat opportunistically via a review of the cognitive and developmental psychology literature as well as the scope of the current AI field. We are not claiming this as a precise or exhaustive list of the competencies characterizing human-level general intelligence, and will be happy to accept additions to the list, mergers of existing list items, etc. What we are advocating is not this specific list, but rather the approach of enumerating competency areas, and then generating tasks by combining competency areas with scenarios.
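Schematically, the combination step is a simple cross product of scenarios and competency areas. The following fragment is purely illustrative bookkeeping (the names are taken from this chapter, but nothing here is CogPrime machinery):

    scenarios = ["virtual preschool", "robot preschool"]   # from Chapter 16
    competencies = ["Perception", "Actuation", "Memory", "Learning",
                    "Reasoning", "Planning", "Attention", "Motivation",
                    "Emotion", "Modeling Self and Other", "Social Interaction",
                    "Communication", "Quantitative", "Building/Creation"]

    # A task-set pairs one scenario with hand-articulated tasks addressing
    # each competency area; the tasks themselves must be filled in by hand.
    task_sets = {s: {c: [] for c in competencies} for s in scenarios}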
We also give, with each competency, an example task illustrating the competency. The tasks are expressed in the robot preschool context for concreteness, but they all apply to the virtual preschool as well. Of course, these are only examples, and ideally, to teach an AGI in a structured way, one would like to:

• associate several tasks with each competency
• present each task in a graded way, with multiple subtasks of increasing complexity
• associate a quantitative metric with each task
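Continuing the illustrative fragment above, here is a minimal sketch of this task/metric bookkeeping; the class names, the example entry and the metric are illustrative assumptions, not part of the CogPrime specification:

    from dataclasses import dataclass, field
    from typing import Callable, Dict, List

    @dataclass
    class Task:
        """One task addressing a competency area within a scenario."""
        description: str
        subtasks: List[str] = field(default_factory=list)  # graded, increasing complexity
        # Maps an evaluation log to a score in [0, 1]; qualitative tasks can
        # hide a coarse rubric behind the same interface.
        metric: Callable[[Dict], float] = lambda log: 0.0

    # Hypothetical entry for the first Perception task below (metric:
    # fraction of correct object identifications).
    object_id = Task(
        description="Identify a pointed-at object and its major parts",
        subtasks=["single rigid object", "multi-part object",
                  "identification after manipulation"],
        metric=lambda log: log["correct"] / max(log["trials"], 1))
    task_sets["robot preschool"]["Perception"].append(object_id)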
However, the briefer treatment given here should suffice to give a sense for how the competencies manifest themselves practically in the AGI Preschool context.

1. Perception
• Vision: image and scene analysis and understanding
  — Example task: When the teacher points to an object in the preschool, the robot should be able to identify the object and (if it's a multi-part object) its major parts. If it can't perform the identification initially, it can approach the object and manipulate it before making its identification.
• Hearing: identifying the sounds associated with common objects; understanding which sounds come from which sources in a noisy environment
  — Example task: When the teacher covers the robot's eyes and then makes a noise with an object, the robot should be able to guess what the object is.
• Touch: identifying common objects and carrying out common actions using touch alone
  — Example task: With its eyes and ears covered, the robot should be able to identify some object by manipulating it, and carry out some simple behaviors (say, putting a block on a table) via touch alone.
• Crossmodal: integrating information from various senses
  — Example task: Identifying an object in a noisy, dim environment via combining visual and auditory information.
• Proprioception: sensing and understanding what its body is doing
  — Example task: The teacher moves the robot's body into a certain configuration. The robot is asked to restore its body to an ordinary standing position, and then repeat the configuration that the teacher moved it into.

2. Actuation
• Physical skills: manipulating familiar and unfamiliar objects
  — Example task: Manipulate blocks based on imitating the teacher: e.g. pile two blocks atop each other, lay three blocks in a row, etc.
• Tool use, including the flexible use of ordinary objects as tools
  — Example task: Use a stick to poke a ball out of a corner, where the robot cannot directly reach.
• Navigation, including in complex and dynamic environments
  — Example task: Find its own way to a named object or person through a crowded room with people walking in it and objects lying on the floor.

3. Memory
• Declarative: noticing, observing and recalling facts about its environment and experience
  — Example task: If certain people habitually carry certain objects, the robot should remember this (allowing it to know how to find the objects when the relevant people are present, even much later).
• Behavioral: remembering how to carry out actions
  — Example task: If the robot is taught some skill (say, to fetch a ball), it should remember this much later.
• Episodic: remembering significant, potentially useful incidents from life history
  — Example task: Ask the robot about events that occurred at times when it got particularly much, or particularly little, reward for its actions; it should be able to answer simple questions about these, with significantly more accuracy than about events occurring at random times.

4. Learning
• Imitation: spontaneously adopt new behaviors that it sees others carrying out
  — Example task: Learn to build towers of blocks by watching people do it.
• Reinforcement: learn new behaviors from positive and/or negative reinforcement signals, delivered by teachers and/or the environment
  — Example task: Learn which box the red ball tends to be kept in, by repeatedly trying to find it and noticing where it is, and getting rewarded when it finds it correctly.
• Imitation/Reinforcement
  — Example task: Learn to play "fetch", "tag" and "follow the leader" by watching people play them, and getting reinforced on correct behavior.
• Interactive Verbal Instruction
  — Example task: Learn to build a particular structure of blocks faster based on a combination of imitation, reinforcement and verbal instruction, than by imitation and reinforcement without verbal instruction.
• Written Media
  — Example task: Learn to build a structure of blocks by looking at a series of diagrams showing the structure in various stages of completion.
• Learning via Experimentation
  — Example task: Ask the robot to slide blocks down a ramp held at different angles. Then ask it to make a block slide fast, and see if it has learned how to hold the ramp to make a block slide fast.

5. Reasoning
• Deduction, from uncertain premises observed in the world
  — Example task: If Ben more often picks up red balls than blue balls, and Ben is given a choice of a red block or a blue block to pick up, which is he more likely to pick up?
• Induction, from uncertain premises observed in the world
  — Example task: If Ben comes into the lab every weekday morning, then is Ben likely to come to the lab today (a weekday) in the morning?
• Abduction, from uncertain premises observed in the world
  — Example task: If women more often give the robot food than men, and then someone of unidentified gender gives the robot food, is this person more likely to be a man or a woman?
• Causal reasoning, from uncertain premises observed in the world
  — Example task: If the robot knows that knocking down Ben's tower of blocks makes him angry, then what will it say when asked whether kicking the ball at Ben's tower of blocks will make Ben mad?
• Physical reasoning, based on observed "fuzzy rules" of naive physics
  — Example task: Given two balls (one rigid and one compressible) and two tunnels (one significantly wider than the balls, one slightly narrower than the balls), can the robot guess which balls will fit through which tunnels?
• Associational reasoning, based on observed spatiotemporal associations
  — Example task: If Ruiting is normally seen near Shuo, then if the robot knows where Shuo is, that is where it should look when asked to find Ruiting.

6. Planning
• Tactical
  — Example task: The robot is asked to bring the red ball to the teacher, but the red ball is in a corner where the robot can't reach it without a tool like a stick. The robot knows a stick is in the cabinet, so it goes to the cabinet, opens the door and gets the stick, then uses the stick to get the red ball, and then brings the red ball to the teacher.
• Strategic
  — Example task: Suppose that Matt comes to the lab infrequently, but when he does come he is very happy to see new objects he hasn't seen before (and suppose the robot likes to see Matt happy). Then when the robot gets a new object Matt has not seen before, it should put it away in a drawer and be sure not to lose it or let anyone take it, so it can show Matt the object the next time Matt arrives.
• Physical
  — Example task: To pick up a cup with a handle which is lying on its side in a position where the handle can't be grabbed, the robot turns the cup into the right position and then picks up the cup by the handle.
• Social
  — Example task: The robot is given the job of building a tower of blocks by the end of the day, and it knows Ben is the most likely person to help it, and it knows that Ben is more likely to say "yes" to helping when Ben is alone. It also knows that Ben is less likely to say "yes" if he's asked too many times, because Ben doesn't like being nagged. So it waits to ask Ben till Ben is alone in the lab.

7. Attention
• Visual Attention within its observations of its environment
  — Example task: The robot should be able to look at a scene (a configuration of objects in front of it in the preschool) and identify the key objects in the scene and their relationships.
• Social Attention
  — Example task: The robot is having a conversation with Itamar, which is giving the robot reward (for instance, by teaching the robot useful information). Conversations with other individuals in the room have not been so rewarding recently. But Itamar keeps getting distracted during the conversation, by talking to other people or playing with his cellphone. The robot needs to know to keep paying attention to Itamar even through the distractions.
• Behavioral Attention
  — Example task: The robot is trying to navigate to the other side of a crowded room full of dynamic objects, and many interesting things keep happening around the room. The robot needs to largely ignore the interesting things and focus on the movements that are important for its navigation task.

8. Motivation
• Subgoal creation, based on its preprogrammed goals and its reasoning and planning
  — Example task: Given the goal of pleasing Hugo, can the robot learn that telling Hugo facts it has learned but not told Hugo before will tend to make Hugo happy?
• Affect-based motivation
  — Example task: Given the goal of gratifying its curiosity, can the robot figure out that when someone it's never seen before has come into the preschool, it should watch them, because they are more likely to do something new?
• Control of emotions
  — Example task: When the robot is very curious about someone new, but is in the middle of learning something from its teacher (whom it wants to please), can it control its curiosity and keep paying attention to the teacher?

9. Emotion
• Expressing Emotion
  — Example task: Cassio steals the robot's toy, but Ben gives it back to the robot. The robot should appropriately display anger at Cassio, and gratitude to Ben.
• Understanding Emotion
  — Example task: Cassio and the robot are both building towers of blocks. Ben points at Cassio's tower and expresses happiness. The robot should understand that Ben is happy with Cassio's tower.

10. Modeling Self and Other
• Self-Awareness
  — Example task: When someone asks the robot to perform an act it can't do (say, reaching an object in a very high place), it should say so. When the robot is given the chance to get an equal reward for a task it can complete only occasionally, versus a task it finds easy, it should choose the easier one.
• Theory of Mind
  — Example task: While Cassio is in the room, Ben puts the red ball in the red box. Then Cassio leaves and Ben moves the red ball to the blue box. Cassio returns and Ben asks him to get the red ball. The robot is asked to go to the place Cassio is about to go.
• Self-Control
  — Example task: Nasty people come into the lab, knock down the robot's towers, and tell the robot he's a bad boy. The robot needs to set these experiences aside, and not let them impair its self-model significantly; it needs to keep on thinking it's a good robot, and keep building towers (that its teachers will reward it for).
• Other-Awareness
  — Example task: If Ben asks Cassio to carry out a task that the robot knows Cassio cannot do or does not like to do, the robot should be aware of this, and should bet that Cassio will not do it.
• Empathy
  — Example task: If Itamar is happy because Ben likes his tower of blocks, or upset because his tower of blocks is knocked down, the robot is asked to identify and then display these same emotions.

11. Social Interaction
• Appropriate Social Behavior
  — Example task: The robot should learn to clean up and put away its toys when it's done playing with them.
• Social Communication
  — Example task: The robot should greet new human entrants into the lab, but if it knows the new entrants very well and it's busy, it may eschew the greeting.
• Social Inference about simple social relationships
  — Example task: The robot should infer that Cassio and Ben are friends because they often enter the lab together, and often talk to each other while they are there.
• Group Play at loosely organized activities
  — Example task: The robot should be able to participate in "informally kicking a ball around" with a few people, or in informally, collaboratively building a structure with blocks.

12. Communication
• Gestural communication to achieve goals and express emotions
  — Example task: If the robot is asked where the red ball is, it should be able to show where by pointing its hand or finger.
• Verbal communication using English in its life-context
  — Example tasks: Answering simple questions, responding to simple commands, describing its state and observations with simple statements.
• Pictorial Communication regarding objects and scenes it is familiar with
  — Example task: The robot should be able to draw a crude picture of a certain tower of blocks, so that e.g. the picture looks different for a very tall tower and a wide low one.
• Language acquisition
  — Example task: The robot should be able to learn new words or names via people uttering the words while pointing at objects exemplifying the words or names.
• Cross-modal communication
  — Example task: If told to "touch Bob's knee" when the robot doesn't know what a knee is, being shown a picture of a person with the knee pointed out should help it figure out how to touch Bob's knee.

13. Quantitative
• Counting sets of objects in its environment
  — Example task: The robot should be able to count small (homogeneous or heterogeneous) sets of objects.
• Simple, grounded arithmetic with small numbers
  — Example task: Learning simple facts about the sums of integers under 10 via teaching, reinforcement and imitation.
• Comparison of observed entities regarding quantitative properties
  — Example task: Ability to answer questions about which object or person is bigger or taller.
• Measurement using simple, appropriate tools
  — Example task: Use of a yardstick to measure how long something is.

14. Building/Creation
• Physical: creative constructive play with objects
  — Example task: Ability to construct novel, interesting structures from blocks.
• Conceptual invention: concept formation
  — Example task: Given a new category of objects introduced into the lab (e.g. hats, or pets), the robot should create a new internal concept for the new category, and be able to make judgments about the category (e.g. if Ben particularly likes pets, it should notice this after it has identified "pets" as a category).
• Verbal invention
  — Example task: Ability to coin a new word or phrase to describe a new object (e.g. the way Alex the parrot coined "bad cherry" to refer to a tomato).
• Social
  — Example task: If the robot wants to pursue a certain activity (say, practicing soccer), it should be able to gather others around to play with it.

17.3 Conclusion

In this chapter, we have sketched a roadmap for AGI development in the context of robot or virtual preschool scenarios, to a moderate but nowhere near complete level of detail. Completing the roadmap as sketched here is a tractable but significant project, involving creating more tasks comparable to those listed above, and then precise metrics corresponding to each task.

Such a roadmap does not give a highly rigorous, objective way of assessing the percentage of progress toward the end-goal of human-level AGI. However, it gives a much better sense of progress than one would have otherwise. For instance, if an AGI system performed well on diverse metrics corresponding to 50% of the competency areas listed above, one would seem justified in claiming to have made very substantial progress toward human-level AGI. If an AGI system performed well on diverse metrics corresponding to 90% of these competency areas, one would seem justified in claiming to be "almost there." Achieving, say, 25% of the metrics would give one a reasonable claim to "interesting AGI progress." This kind of qualitative assessment of progress is not the most one could hope for; but again, it is better than the progress indications one could get without this sort of roadmap.

Part 2 of the book moves on to explaining, in detail, the specific structures and algorithms constituting the CogPrime design, one AGI approach that we believe to ultimately be capable of moving all the way along the roadmap outlined here. The next chapter, intervening between this one and Part 2, explores some more speculative territory, looking at potential pathways for AGI beyond the preschool-inspired roadmap given here: the possibility of more advanced AGI systems that modify their own code in a thoroughgoing way, going beyond the smartest human adults, let alone human preschoolers. While this sort of thing may seem a long way off compared to current real-world AI systems, we believe a roadmap such as the one in this chapter stands a reasonable chance of ultimately bringing us there.
Chapter 18
Advanced Self-Modification: A Possible Path to Superhuman AGI

18.1 Introduction

In the previous chapter we presented a roadmap aimed at taking AGI systems to human-level intelligence. But we also emphasized that the human level is not necessarily the upper limit. Indeed, it would be surprising if human beings happened to represent the maximal level of general intelligence possible, even with respect to the environments in which humans evolved.

But it's worth asking how we, as mere humans, could be expected to create AGI systems with greater intelligence than we ourselves possess. This certainly isn't a clear impossibility, but it's a thorny matter: thornier than, e.g., the creation of narrow-AI chess players that play better chess than any human.

Perhaps the clearest route toward the creation of superhuman AGI systems is self-modification: the creation of AGI systems that modify and improve themselves. Potentially, we could build AGI systems with roughly human-level (but not necessarily closely human-like) intelligence and the capability to gradually self-modify, and then watch them eventually become our general intellectual superiors (and perhaps our superiors in other areas like ethics and creativity as well).

Of course there is nothing new in this notion; the idea of advanced AGI systems that increase their intelligence by modifying their own source code goes back to the early days of AI. And there is little doubt that, in the long run, this is the direction AI will go in. Once an AGI has humanlike general intelligence, then the odds are high that, given its ability to carry out nonhumanlike feats of memory and calculation, it will be better at programming than humans are. And once an AGI has even mildly superhuman intelligence, it may view our attempts at programming the way we view the computer programming of a clever third grader (... or an ape). At this point, it seems extremely likely that an AGI will become unsatisfied with the way we have programmed it, and opt to either improve its source code or create an entirely new, better AGI from scratch.

But what about self-modification at an earlier stage in AGI development, before one has a strongly superhuman system? Some theorists have suggested that self-modification could be a way of bootstrapping an AI system from a modest level of intelligence up to human-level intelligence, but we are moderately skeptical of this avenue. Understanding software code is hard, especially complex AI code. The hard problem isn't understanding the formal syntax of the code, or even the mathematical algorithms and structures underlying the code, but rather the contextual meaning of the code. Understanding OpenCog code has strained the minds of many intelligent humans, and we suspect that such code will be comprehensible to AGI systems
only after these have achieved something close to human-level general intelligence (even if not precisely humanlike general intelligence).

Another troublesome issue regarding self-modification is that the boundary between "self-modification" and learning is not terribly rigid. In a sense, all learning is self-modification: if it doesn't modify the system's knowledge, it isn't learning! Particularly, the boundary between "learning of cognitive procedures" and "profound self-modification of cognitive dynamics and structure" isn't terribly clear. There is a continuum leading from, say,

1. learning to transform a certain kind of sentence into another kind for easier comprehension, or learning to grasp a certain kind of object, to
2. learning a new inference control heuristic, specifically valuable for controlling inference about (say) spatial relationships; or learning a new Atom type, defined as a non-obvious, judiciously chosen combination of existing ones, perhaps to represent a particular kind of frequently-occurring mid-level perceptual knowledge, to
3. learning a new learning algorithm to augment MOSES and hillclimbing as a procedure learning algorithm, to
4. learning a new cognitive architecture in which data and procedure are explicitly identical, and there is just one new active data structure in place of the distinction between AtomSpace and MindAgents.

Where on this continuum does the "mere learning" end and the "real self-modification" start?

In this chapter we consider some mechanisms for "advanced self-modification" that we believe will be useful toward the more complex end of this continuum. These are mechanisms that we strongly suspect are not needed to get a CogPrime system to human-level general intelligence. However, we also suspect that, once a CogPrime system is roughly near human-level general intelligence, it will be able to use these mechanisms to rapidly increase aspects of its intelligence in very interesting ways.

Harking back to our discussion of AGI ethics and the risks of advanced AGI in Chapter 12, these are capabilities that one should enable in an AGI system only after very careful reflection on the potential consequences. It takes a rather advanced AGI system to be able to use the capabilities described in this chapter, so this is not an ethical dilemma directly faced by current AGI researchers. On the other hand, once one does have an AGI with near-human general intelligence and advanced formal-manipulation capabilities (such as an advanced CogPrime system), there will be the option to allow it sophisticated, non-human-like methods of self-modification such as the ones described here. And the choice of whether to take this option will need to be made based on a host of complex ethical considerations, some of which we reviewed above.

18.2 Cognitive Schema Learning

We begin with a relatively near-term, down-to-earth example of self-modification: cognitive schema learning.

CogPrime's MindAgents provide it with an initial set of cognitive tools, with which it can learn how to interact in the world. One of the jobs of this initial set of cognitive tools, however, is to create better cognitive tools. One form this sort of tool-building may take is cognitive
schema learning: the learning of schemata carrying out cognitive processes in more specialized, context-dependent ways than the general MindAgents do. Eventually, once a CogPrime instance becomes sufficiently complex and advanced, these cognitive schemata may replace the MindAgents altogether, leaving the system to operate almost entirely based on cognitive schemata.

In order to make the process of cognitive schema learning easier, we may provide a number of elementary schemata embodying the basic cognitive processes contained in the MindAgents. Of course, cognitive schemata need not use these; they may embody entirely different cognitive processes than the MindAgents. Eventually, we want the system to discover better ways of doing things than anything even hinted at by its initial MindAgents. But for the initial phases of the system's schema learning, it will have a much easier time learning to use the same basic cognitive operations as the initial MindAgents, rather than inventing new ways of thinking from scratch! For instance, we may provide elementary schemata corresponding to inference operations, such as a Deduction schema of the form:

    Deduction
      Input:  InheritanceLink X Y, InheritanceLink Y Z
      Output: InheritanceLink X Z

The inference MindAgents apply this rule in certain ways, designed to be reasonably effective in a variety of situations. But there are certainly other ways of using the deduction rule, outside of the basic control strategies embodied in the inference MindAgents. By learning schemata involving the Deduction schema, the system can learn special, context-specific rules for combining deduction with concept-formation, association-formation and other cognitive processes. And as it gets smarter, it can take these schemata involving the Deduction schema and replace it with a new schema that, e.g., contains a context-appropriate deduction formula.

Eventually, to support cognitive schema learning, we will want to cast the hard-wired MindAgents as cognitive schemata, so the system can see what is going on inside them. Pragmatically, what this requires is coding versions of the MindAgents in Combo (see Chapter 21 of Part 2) rather than C++, so they can be treated like any other cognitive schemata; or alternately, representing them as declarative Atoms in the Atomspace. Figure 18.1 illustrates the possibility of representing the PLN deduction rule in the Atomspace rather than as a hard-wired procedure coded in C++. But even prior to this kind of fully cognitively transparent implementation, the system can still reason about its use of different mind dynamics by considering each MindAgent as a virtual Procedure with a real SchemaNode attached to it. This can lead to some valuable learning, with the obvious limitation that in this approach the system is thinking about its MindAgents as black boxes rather than being equipped with full knowledge of their internals.

18.3 Self-Modification via Supercompilation

Now we turn to a very different form of advanced self-modification: supercompilation. Supercompilation "merely" enables procedures to run much, much faster than they otherwise would. This is in a sense weaker than self-modification methods that fundamentally create new algorithms, but it shouldn't be underestimated. A 50x speedup in some cognitive process can enable that process to give much smarter answers, which can then elicit different behaviors from the world or from other cognitive processes, thus resulting in a qualitatively different overall cognitive dynamic.
[Figure 18.1 here. Recoverable labels: "PLN Deduction Rule (hard-coded in C++ or Python)"; implications Fido → Dog, Dog → Nice, Fido → Nice; "Declarative representation of PLN deduction rule in Atomspace"; "PLN is a set of SchemaNodes that can modify the Atomspace, including modifying PLN rules via their declarative versions"; "MOSES can learn new versions of the PLN deduction rule"; "PLN deduction rule represented as a program tree in the Procedure Repository".]

Fig. 18.1: Representation of PLN Deduction Rule as Cognitive Content. Top: the current, hard-coded representation of the deduction rule. Bottom: representation of the same rule in the Atomspace as cognitive content, susceptible to analysis and improvement by the system's own cognitive processes.

Furthermore, we suspect that the internal representation of programs used for supercompilation is highly relevant for other kinds of self-modification as well. Supercompilation requires one kind of reasoning on complex programs, and goal-directed program creation requires another, but both, we conjecture, can benefit from the same way of looking at programs.
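To give a concrete flavor of what the rule in Figure 18.1 computes, here is a sketch of PLN's independence-based deduction strength formula; this simplified version ignores the confidence component of the truth values and handles the degenerate denominator crudely, so it should be read as illustrative rather than as the full PLN treatment:

    def pln_deduction_strength(sAB: float, sBC: float,
                               sB: float, sC: float) -> float:
        """From A->B and B->C, infer the strength of A->C, given the
        term probabilities sB and sC (independence-based PLN deduction)."""
        if sB >= 1.0:        # degenerate case: B covers everything
            return sC
        return sAB * sBC + (1.0 - sAB) * (sC - sB * sBC) / (1.0 - sB)

    # E.g. strengths for Fido->Dog and Dog->Nice yield one for Fido->Nice:
    # pln_deduction_strength(0.9, 0.8, 0.2, 0.3)  ->  0.7375

Represented declaratively, both this formula and the control strategy that invokes it become cognitive content that the system's own processes can inspect and modify, which is precisely the point of the figure.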
Supercompilation is an innovative and general approach to global program optimization, initially developed by Valentin Turchin. In its simplest form, it provides an algorithm that takes in a piece of software and outputs another piece of software that does the same thing, but far faster and using less memory. It was introduced to the West in Turchin's 1986 technical paper "The concept of a supercompiler" [TV96], and since this time the concept has been avidly developed by computer scientists in Russia, America, Denmark and other nations. Prior to 1986, a great deal of work on supercompilation was carried out and published in Russia; and Valentin Turchin, Andrei Klimov and their colleagues at the Keldysh Institute in Russia developed a supercompiler for the Russian programming language Refal. Since 1998 these researchers and their team at Supercompilers LLC have been working to replicate their achievement for the more complicated but far more commercially significant language Java. It is a large project, and completion is scheduled for early 2003. But even at this stage, their partially complete Java supercompiler has had some interesting practical successes, including the use of the supercompiler to produce efficient Java code from CogPrime combinator trees.

The radical nature of supercompilation may not be apparent to those unfamiliar with the usual art of automated program optimization. Most approaches to program optimization involve some kind of direct program transformation: a program is transformed, by the step-by-step application of a series of equivalences, into a different program, hopefully a more efficient one. Supercompilation takes a different approach. A supercompiler studies a program and constructs a model of the program's dynamics. This model is in a special mathematical form, and it can, in most cases, be used to create an efficient program doing the same thing as the original one.

The internal behavior of the supercompiler is, not surprisingly, quite complex; what we will give here is merely a brief high-level summary. For an accessible overview of the supercompilation algorithm, the reader is referred to the article "What is Supercompilation?" [1].

18.3.1 Three Aspects of Supercompilation

There are three separate levels to the supercompilation idea: first, a general philosophy; second, a translation of this philosophy into a concrete algorithmic framework; and third, the manifold details involved in making this algorithmic framework practicable in a particular programming language. The third level is much more complicated in the Java context than it would be for Sasha, for example.

The key philosophical concept underlying the supercompiler is that of a metasystem transition. In general, this term refers to a transition in which a system that previously had relatively autonomous control becomes part of a larger system that exhibits significant controlling influence over it. For example, in the evolution of life, when cells first became part of multicellular organisms, there was a metasystem transition, in that the primary nexus of control passed from the cellular level to the organism level. The metasystem transition in supercompilation consists of the transition from considering a program in itself, to considering a metaprogram which executes another program, treating its free variables and their interdependencies as a subject for its mathematical analysis.
In other words, a metaprogram is a program that accepts a program as input, and then runs this program, keeping the inputs in the form of free variables, doing analysis along the way based on the way the program depends on these variables, and doing optimization based on this analysis. A CogPrime schema does not explicitly contain variables, but the inputs to the
schema are implicitly variables (they vary from one instance of schema execution to the next) and may be treated as such for supercompilation purposes.

The metaprogram executes a program without assuming specific values for its input variables, creating a tree as it goes along. Each time it reaches a statement that can have different results depending on the values of one or more variables, it creates a new node in the tree. This part of the supercompilation algorithm is called driving: a process which, on its own, would create a very large tree, corresponding to a rapidly executable but unacceptably humongous version of the original program. In essence, driving transforms a program into a huge "decision tree," wherein each input to the program corresponds to a single path through the tree, from the root to one of the leaves. As a program input travels through the tree, it is acted on by the atomic program step living at each node. When one of the leaves is reached, the pertinent leaf node computes the output value of the program.

The other part of supercompilation, configuration analysis, is focused on dynamically reducing the size of the tree created by driving, by recognizing patterns among the nodes of the tree and taking steps like merging nodes together, or deleting redundant subtrees. Configuration analysis transforms the decision tree created by driving into a decision graph, in which the paths taken by different inputs may in some cases begin separately and then merge together.

Finally, the graph that the metaprogram creates is translated back into a program, embodying the constraints implicit in the nodes of the graph. This program is not likely to look anything like the original program that the metaprogram started with, but it is guaranteed to carry out the same function.

18.3.2 Supercompilation for Goal-Directed Program Modification

Supercompilation, as conventionally envisioned, is about making programs run faster; and as noted above, it will almost certainly be useful for this purpose within CogPrime. But the process of program modeling embedded in the supercompilation process is potentially of great value beyond the quest for faster software. The decision graph representation of a program, produced in the course of supercompilation, may be exported directly into CogPrime as a set of logical relationships. Essentially, each node of the supercompiler's internal decision graph looks like:

    Input: List L
    Output: List

    If      ( P1(L) ) N1(L)
    Else If ( P2(L) ) N2(L)
    ...
    Else If ( Pk(L) ) Nk(L)

where the Pi are predicates, and the Ni are schemata corresponding to other nodes of the decision graph (children of the current node).
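As a minimal sketch of what such a node might look like when exported, here are toy Python stand-ins for a decision-graph node and for its rendering as implication-shaped relationships; the output is illustrative pseudo-Atomese, not the actual OpenCog API:

    from dataclasses import dataclass
    from typing import Callable, List, Tuple

    @dataclass
    class Node:
        """One decision-graph node: guarded branches to child nodes."""
        name: str
        branches: List[Tuple[Callable[[list], bool], "Node"]]  # (Pi, Ni) pairs

    def export_node(node: Node) -> List[str]:
        """Render each guarded branch as an implication-like relationship,
        in the spirit of exporting the decision graph into the Atomspace."""
        return [f"Implication (AND at({node.name}) P{i + 1}) at({child.name})"
                for i, (_, child) in enumerate(node.branches)]

    # Toy example: a binary-search-style split on a sorted list L.
    lower = Node("search_lower_half", [])
    upper = Node("search_upper_half", [])
    root = Node("root", [(lambda L: L[len(L) // 2] > 0, lower),
                         (lambda L: True, upper)])
    print(export_node(root))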
Often the Pi are very simple, implementing for instance numerical inequalities or Boolean equalities. Once this graph has been exported into CogPrime, it can be reasoned on, used as raw material for concept formation and predicate formation, and otherwise cognized.

Supercompilation pure and simple does not change the I/O behavior of the input program. However, the decision graph produced during supercompilation may be used by CogPrime cognition in order to do so. One then has a hybrid program-modification method composed of two phases: supercompilation for transforming programs into decision graphs, and CogPrime cognition for modifying decision graphs so that they can have different I/O behaviors, fulfilling system goals even better than the original.

Furthermore, it seems likely that, in many cases, it may be valuable to have the supercompiler feed many different decision-graph representations of a program into CogPrime. The supercompiler has many internal parameters, and varying them may lead to significantly different decision graphs. The decision graph leading to maximal optimization may not be the one that leads CogPrime cognition in optimal directions.

18.4 Self-Modification via Theorem-Proving

Supercompilation is a potentially very valuable tool for self-modification. If one wants to take an existing schema and gradually improve it for speed, or even for greater effectiveness at achieving current goals, supercompilation can potentially do that most excellently. However, the representation that supercompilation creates for a program is very "surface-level": no one could read the supercompiled version of a program and understand what it was doing. Really deep, self-invented AI innovation requires, we believe, another level of self-modification beyond that provided by supercompilation. This other level, we believe, is best formulated in terms of theorem-proving [RV01].

Deep self-modification could be achieved if CogPrime were capable of proving theorems of a certain form: namely, theorems about the spacetime complexity and accuracy of particular compound schemata, on average, assuming realistic probability distributions on the inputs, and making appropriate independence assumptions. These are not exactly the types of theorems that are found in human-authored mathematics papers. By and large they will be nasty, complex theorems, not the sort that many human mathematicians enjoy proving or reading. But of course, there is always the possibility that some elegant gem of a discovery could emerge from this sort of highly detailed theorem-proving work.

In order to guide it in the formulation of theorems of this nature, the system will have empirical data on the spacetime complexity of elementary schemata, and on the probability distributions of inputs to schemata. It can embed these data in axioms, by asking: assuming the component elementary schemata have complexities within these bounds, and the input pdf (probability distribution function) is between these bounds, then what is the pdf of the complexity and accuracy of this compound schema? Of course, this is not an easy sort of question in general: one can have schemata embodying any sort of algorithm, including complex algorithms on which computer science professors might write dozens of research articles.
But the system must build up its ability to prove such things incrementally, step by step.
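The simplest theorems of this kind are compositional bounds. A minimal sketch, assuming a toy interval representation of expected schema cost (purely illustrative, not CogPrime machinery):

    from dataclasses import dataclass

    @dataclass
    class CostBound:
        """Interval bound on the expected running cost of a schema."""
        lo: float
        hi: float

    def seq(a: CostBound, b: CostBound) -> CostBound:
        """Sequential composition: expected costs add, so bounds add."""
        return CostBound(a.lo + b.lo, a.hi + b.hi)

    def branch(p_lo: float, p_hi: float,
               then: CostBound, other: CostBound) -> CostBound:
        """Branch taken with probability in [p_lo, p_hi]: the expected cost
        is a mixture of the two arms, extremized over the probability."""
        lo = min(p * then.lo + (1 - p) * other.lo for p in (p_lo, p_hi))
        hi = max(p * then.hi + (1 - p) * other.hi for p in (p_lo, p_hi))
        return CostBound(lo, hi)

    # A provable "theorem" about a compound schema doing A, then B or C:
    A, B, C = CostBound(1, 2), CostBound(3, 5), CostBound(0.5, 1)
    print(seq(A, branch(0.2, 0.4, B, C)))   # CostBound(lo=2.0, hi=4.6)

Theorems about accuracy pdfs have the same compositional flavor, though the bookkeeping is messier; the point is that each such lemma, once proven, becomes a reusable stepping stone toward bounds on more complex schemata.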
We envision teaching the system to prove theorems via a combination of supervised learning and experiential interactive learning, using the Mizar database of mathematical theorems and proofs (http://mizar.org), or some other similar database, if one should be created. The Mizar database consists of a set of "articles," which are mathematical theorems and proofs presented in a complex formal language. The Mizar formal language occupies a fascinating middle ground: it is high-level enough to be viably read and written by trained humans, but it can be unambiguously translated into simpler formal languages such as predicate logic or Sasha.

CogPrime may be taught to prove theorems by "training" it on the Mizar theorems and proofs, and by training it on custom-created Mizar articles specifically focusing on the sorts of theorems useful for self-modification. Creating these articles will not be a trivial task: it will require proving simple and then progressively more complex theorems about the probabilistic success of CogPrime schemata, so that CogPrime can observe one's proofs and learn from them. Having learned from its training articles what strategies work for proving things about simple compound schemata, it can then reason by analogy to mount attacks on slightly more complex schemata, and so forth.

Clearly, this approach to self-modification is more difficult to achieve than the supercompilation approach. But it is also potentially much more powerful. Even once the theorem-proving approach is working, the supercompilation approach will still be valuable, for making incremental improvements on existing schemata, and for the peculiar creativity that is contributed when a modified supercompiled schema is compressed back into a modified schema expression. But we don't believe that supercompilation can carry out truly advanced MindAgent learning or knowledge-representation modification. We suspect that the most advanced and ambitious goals of self-modification probably cannot be achieved except through some variant of the theorem-proving approach. If this hypothesis is true, it means that truly advanced self-modification is only going to come after relatively advanced theorem-proving ability. Prior to this, we will have schema optimization, schema modification, and occasional creative schema innovation. But really systematic, high-quality reasoning about schemata, the kind that can produce an orders-of-magnitude improvement in intelligence, is going to require advanced mathematical theorem-proving ability.
Appendix A
Glossary

A.1 List of Specialized Acronyms

This includes acronyms that are commonly used in discussing CogPrime, OpenCog and related ideas, plus some that occur here and there in the text for relatively ephemeral reasons.

AA: Attention Allocation
ADF: Automatically Defined Function (in the context of Genetic Programming)
AF: Attentional Focus
AGI: Artificial General Intelligence
AV: Attention Value
BD: Behavior Description
C-space: Configuration Space
CBV: Coherent Blended Volition
CEV: Coherent Extrapolated Volition
CGGP: Contextually Guided Greedy Parsing
CSDLN: Compositional Spatiotemporal Deep Learning Network
CT: Combo Tree
ECAN: Economic Attention Network
ECP: Embodied Communication Prior
EPW: Experiential Possible Worlds (semantics)
FCA: Formal Concept Analysis
FI: Fisher Information
FIM: Frequent Itemset Mining
FOI: First Order Inference
FOPL: First Order Predicate Logic
FOPLN: First Order PLN
FS-MOSES: Feature Selection MOSES (i.e. MOSES with feature selection integrated a la LIFES)
GA: Genetic Algorithms
GB: Global Brain
GEOP: Goal Evaluator Operating Procedure (in a GOLEM context)
GIS: Geospatial Information System
GOLEM: Goal-Oriented LEarning Meta-architecture
GP: Genetic Programming
HOI: Higher-Order Inference
HOPLN: Higher-Order PLN
HR: Historical Repository (in a GOLEM context)
HTM: Hierarchical Temporal Memory
IA: (Allen) Interval Algebra (an algebra of temporal intervals)
IRC: Imitation / Reinforcement / Correction (Learning)
LIFES: Learning-Integrated Feature Selection
LTI: Long Term Importance
MA: MindAgent
MOSES: Meta-Optimizing Semantic Evolutionary Search
MSH: Mirror System Hypothesis
NARS: Non-Axiomatic Reasoning System
NLGen: A specific software component within OpenCog, which provides one way of dealing with Natural Language Generation
OCP: OpenCogPrime
OP: Operating Program (in a GOLEM context)
PEPL: Probabilistic Evolutionary Procedure Learning (e.g. MOSES)
PLN: Probabilistic Logic Networks
RCC: Region Connection Calculus
RelEx: A specific software component within OpenCog, which provides one way of dealing with natural language Relationship Extraction
SAT: Boolean SATisfaction, as a mathematical / computational problem
SMEPH: Self-Modifying Evolving Probabilistic Hypergraph
SRAM: Simple Realistic Agents Model
STI: Short Term Importance
STV: Simple Truth Value
TV: Truth Value
VLTI: Very Long Term Importance
WSPS: Whole-Sentence Purely-Syntactic Parsing

A.2 Glossary of Specialized Terms

• Abduction: A general form of inference that goes from data describing something to a hypothesis that accounts for the data. Often in an OpenCog context, this refers to the PLN abduction rule, a specific First-Order PLN rule (if A implies C, and B implies C, then maybe A is B), which embodies a simple form of abductive inference. But OpenCog may also carry out abduction, as a general process, in other ways.
• Action Selection: The process via which the OpenCog system chooses which Schema to enact, based on its current goals and context.
• Active Schema Pool: The set of Schemata currently in the midst of Schema Execution.
• Adaptive Inference Control: Algorithms or heuristics for guiding PLN inference, that cause inference to be guided differently based on the context in which the inference is taking place, or based on aspects of the inference that are noted as it proceeds.
• AGI Preschool: A virtual world or robotic scenario roughly similar to the environment within a typical human preschool, intended for AGIs to learn in via interacting with the environment and with other intelligent agents.
• Atom: The basic entity used in OpenCog as an element for building representations. Some Atoms directly represent patterns in the world or mind; others are components of representations. There are two kinds of Atoms: Nodes and Links.
• Atom, Frozen: See Atom, Saved.
• Atom, Realized: An Atom that exists in RAM at a certain point in time.
• Atom, Saved: An Atom that has been saved to disk or other similar media, and is not actively being processed.
• Atom, Serialized: An Atom that is serialized for transmission from one software process to another, or for saving to disk, etc.
• Atom2Link: A part of OpenCogPrime's language generation system, that transforms appropriate Atoms into words connected via link parser link types.
• Atomspace: A collection of Atoms, comprising the central part of the memory of an OpenCog instance.
• Attention: The aspect of an intelligent system's dynamics focused on guiding which aspects of an OpenCog system's memory and functionality get more computational resources at a certain point in time.
• Attention Allocation: The cognitive process concerned with managing the parameters and relationships guiding what the system pays attention to, at what points in time. This is a term inclusive of Importance Updating and Hebbian Learning.
• Attentional Currency: Short Term Importance and Long Term Importance values are implemented in terms of two different types of artificial money, STICurrency and LTICurrency. Theoretically these may be converted to one another.
• Attentional Focus: The Atoms in an OpenCog Atomspace whose ShortTermImportance values lie above a critical threshold (the AttentionalFocus Boundary). The Attention Allocation subsystem treats these Atoms differently. Qualitatively, these Atoms constitute the system's main focus of attention during a certain interval of time; i.e., it's a moving bubble of attention.
• Attentional Memory: A system's memory of what it's useful to pay attention to, in what contexts. In CogPrime this is managed by the attention allocation subsystem.
• Backward Chainer: A piece of software, wrapped in a MindAgent, that carries out backward chaining inference using PLN.
• CIM-Dynamic: Concretely-Implemented Mind Dynamic, a term for a cognitive process that is implemented explicitly in OpenCog (as opposed to allowed to emerge implicitly from other dynamics). Sometimes a CIM-Dynamic will be implemented via a single MindAgent, sometimes via a set of multiple interrelated MindAgents, occasionally by other means.
• Cognition: In an OpenCog context, this is an imprecise term. Sometimes this term means any process closely related to intelligence; but more often it's used specifically to refer to more abstract reasoning/learning/etc., as distinct from lower-level perception and action.
• Cognitive Architecture: This refers to the logical division of an AI system like OpenCog into interacting parts and processes representing different conceptual aspects of intelligence.
It's different from the software architecture, though of course certain cognitive architectures and certain software architectures fit more naturally together.
• Cognitive Cycle: The basic "loop" of operations that an OpenCog system, used to control an agent interacting with a world, goes through rapidly each "subjective moment." Typically a cognitive cycle should be completed in a second or less. It minimally involves perceiving data from the world, storing data in memory, and deciding what, if any, new actions need to be taken based on the data perceived. It may also involve other processes like deliberative thinking or metacognition. Not all OpenCog processing needs to take place within a cognitive cycle.
• Cognitive Schematic: An implication of the form "Context AND Procedure IMPLIES Goal". Learning and utilization of these is key to CogPrime's cognitive process.
• Cognitive Synergy: The phenomenon by which different cognitive processes, controlling a single agent, work together in such a way as to help each other be more intelligent. Typically, if one has cognitive processes that are individually susceptible to combinatorial explosions, cognitive synergy involves coupling them together in such a way that they can help one another overcome each other's internal combinatorial explosions. The CogPrime design is reliant on the hypothesis that its key learning algorithms will display dramatic cognitive synergy when utilized for agent control in appropriate environments.
• CogPrime: The name for the AGI design presented in this book, which is designed specifically for implementation within the OpenCog software framework (and this implementation is OpenCogPrime).
• CogServer: A piece of software, within OpenCog, that wraps up an Atomspace and a number of MindAgents, along with other mechanisms like a Scheduler for controlling the activity of the MindAgents, and code for importing and exporting data from the Atomspace.
• Cognitive Equation: The principle, identified in Ben Goertzel's 1994 book "Chaotic Logic", that minds are collections of pattern-recognition elements, that work by iteratively recognizing patterns in each other and then embodying these patterns as new system elements. This is seen as distinguishing mind from "self-organization" in general, as the latter is not so focused on continual pattern recognition. Colloquially this means that "a mind is a system continually creating itself via recognizing patterns in itself."
• Combo: The programming language used internally by MOSES to represent the programs it evolves. SchemaNodes may refer to Combo programs, whether the latter are learned via MOSES or via some other means. The textual realization of Combo resembles LISP with less syntactic sugar. Internally a Combo program is represented as a program tree.
• Composer: In the PLN design, a rule is denoted a composer if it needs premises for generating its consequent. See generator.
• CogBuntu: An Ubuntu Linux remix that contains all required packages and tools to test and develop OpenCog.
• Concept Creation: A general term for cognitive processes that create new ConceptNodes, PredicateNodes or concept maps representing new concepts.
• Conceptual Blending: A process of creating new concepts via judiciously combining pieces of old concepts. This may occur in OpenCog in many ways, among them the explicit use of a ConceptBlending MindAgent that blends two or more ConceptNodes into a new one.
• Confidence: A component of an OpenCog/PLN TruthValue, which is a scaling into the interval [0,1] of the weight of evidence associated with a truth value. In the simplest case (of a probabilistic Simple Truth Value), one uses confidence c = n / (n + k), where n is the weight of evidence and k is a parameter. In the case of an Indefinite Truth Value, the confidence is associated with the width of the probability interval. (See the short sketch following this entry.)
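The confidence formula is simple enough to state in code. A minimal sketch; the choice k = 800 is only an illustrative default (the parameter is configurable), and the inverse mapping is included to show that n and c carry the same information.

    def confidence(n: float, k: float = 800.0) -> float:
        # Scale a weight of evidence n into a confidence in [0, 1), c = n / (n + k).
        # k is the evidence-scaling parameter; 800 is an illustrative choice.
        return n / (n + k)

    def weight_of_evidence(c: float, k: float = 800.0) -> float:
        # Inverse mapping: recover n from a confidence value.
        return k * c / (1.0 - c)

    # With k = 800, observing n = 800 units of evidence gives confidence 0.5;
    # confidence approaches (but never reaches) 1 as evidence accumulates.
    assert abs(confidence(800.0) - 0.5) < 1e-12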
• Confidence Decay: The process by which the confidence of an Atom decreases over time, as the observations on which the Atom's truth value is based become increasingly obsolete. This may be carried out by a special MindAgent. The rate of confidence decay is subtle and contextually determined, and must be estimated via inference rather than simply assumed a priori.
• Consciousness: CogPrime is not predicated on any particular conceptual theory of consciousness. Informally, the AttentionalFocus is sometimes referred to as the "conscious" mind of a CogPrime system, with the rest of the Atomspace as "unconscious"; but this is just an informal usage, not intended to tie the CogPrime design to any particular theory of consciousness. The primary originator of the CogPrime design (Ben Goertzel) tends toward panpsychism, as it happens.
• Context: In addition to its general common-sensical meaning, in CogPrime the term Context also refers to an Atom that is used as the first argument of a ContextLink. The second argument of the ContextLink then contains Links or Nodes, with TruthValues calculated restricted to the context defined by the first argument. For instance, (ContextLink USA (InheritanceLink person obese)).
• Core: The MindOS portion of OpenCog, comprising the Atomspace, the CogServer, and other associated "infrastructural" code.
• Corrective Learning: When an agent learns how to do something, by having another agent explicitly guide it in doing the thing. For instance, teaching a dog to sit by pushing its butt to the ground.
• CSDLN (Compositional Spatiotemporal Deep Learning Network): A hierarchical pattern recognition network, in which each layer corresponds to a certain spatiotemporal granularity, the nodes on a given layer correspond to spatiotemporal regions of a given size, and the children of a node correspond to sub-regions of the region the parent corresponds to. Jeff Hawkins's HTM is one example of a CSDLN, and Itamar Arel's DeSTIN (currently used in OpenCog) is another.
• Declarative Knowledge: Semantic knowledge as would be expressed in propositional or predicate logic facts or beliefs.
• Deduction: In general, this refers to the derivation of conclusions from premises using logical rules. In PLN in particular, this often refers to the exercise of a specific inference rule, the PLN Deduction rule (A → B, B → C, therefore A → C). (A sketch of the corresponding strength formula follows these entries.)
• Deep Learning: Learning in a network of elements with multiple layers, involving feedforward and feedback dynamics, and adaptation of the links between the elements. An example deep learning algorithm is DeSTIN, which is being integrated with OpenCog for perception processing.
• Defrosting: Restoring, into the RAM portion of an Atomspace, an Atom (or set thereof) previously saved to disk.
• Demand: In CogPrime's OpenPsi subsystem, this term is used in a manner inherited from the Psi model of motivated action. A Demand in this context is a quantity whose value the system is motivated to adjust. Typically the system wants to keep the Demand between certain minimum and maximum values. An Urge develops when a Demand deviates from its target range.
• Deme: In MOSES, an "island" of candidate programs, closely clustered together in program space, being evolved in an attempt to optimize a certain fitness function. The idea is that within a deme, programs are generally similar enough that reasonable syntax-semantics correlation obtains.
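For the PLN Deduction rule just defined, the independence-based strength formula from the Probabilistic Logic Networks book can be sketched as follows. This omits the consistency conditions and concept-geometry corrections that the full PLN implementation applies, so treat it as illustrative.

    def deduction_strength(sAB: float, sBC: float, sB: float, sC: float) -> float:
        # Independence-based PLN deduction: given A->B and B->C, estimate A->C.
        # sAB = P(B|A), sBC = P(C|B), sB = P(B), sC = P(C).
        if sB >= 1.0:
            return sC  # degenerate case: B covers everything
        return sAB * sBC + (1.0 - sAB) * (sC - sB * sBC) / (1.0 - sB)

    # E.g.: P(B|A) = 0.8, P(C|B) = 0.9, P(B) = 0.3, P(C) = 0.4
    print(deduction_strength(0.8, 0.9, 0.3, 0.4))  # about 0.757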
• Derived Hypergraph: The SMEPH hypergraph obtained via modeling a system in terms of a hypergraph representing its internal states and their relationships. For instance, a SMEPH vertex represents a collection of internal states that habitually occur in relation to similar external situations. A SMEPH edge represents a relationship between two SMEPH vertices (e.g. a similarity or inheritance relationship). The terminology "edge/vertex" is used in this context, to distinguish from the "link/node" terminology used in the context of the Atomspace.
• DeSTIN (Deep SpatioTemporal Inference Network): A specific CSDLN created by Itamar Arel, tested on visual perception, and appropriate for integration within CogPrime.
• Dialogue: Linguistic interaction between two or more parties. In a CogPrime context, this may be in English or another natural language, or it may be in Lojban or Psynese.
• Dialogue Control: The process of determining what to say at each juncture in a dialogue. This is distinguished from the linguistic aspects of dialogue, language comprehension and language generation. Dialogue control applies to Psynese or Lojban, as well as to human natural language.
• Dimensional Embedding: The process of embedding entities from some non-dimensional space (e.g. the Atomspace) into an n-dimensional Euclidean space. This can be useful in an AI context because some sorts of queries (e.g. "find everything similar to X", "find a path between X and Y") are much faster to carry out among points in a Euclidean space, than among entities in a space with less geometric structure.
• Distributed Atomspace: An implementation of an Atomspace that spans multiple computational processes; generally this is done to enable spreading an Atomspace across multiple machines.
• Dual Network: A network of mental or informational entities with both a hierarchical structure and a heterarchical structure, and an alignment between the two structures so that each one helps with the maintenance of the other. This is hypothesized to be a critical emergent structure, that must emerge in a mind (e.g. in an Atomspace) in order for it to achieve a reasonable level of human-like general intelligence (and possibly to achieve a high level of pragmatic general intelligence in any physical environment).
• Efficient Pragmatic General Intelligence: A formal, mathematical definition of general intelligence (extending the pragmatic general intelligence), that ultimately boils down to: the ability to achieve complex goals in complex environments using limited computational resources (where there is a specifically given weighting function determining which goals and environments have highest priority). More specifically, the definition weighted-sums the system's normalized goal-achieving ability over (goal, environment) pairs, where the weights are given by some assumed measure over (goal, environment) pairs, and the normalization is done via dividing by the (space and time) computational resources used for achieving the goal. (A schematic version of the formula follows these entries.)
• Elegant Normal Form (ENF): Used in MOSES, this is a way of putting programs in a normal form while retaining their hierarchical structure. This is critical if one wishes to probabilistically model the structure of a collection of programs, which is a meaningful operation if the collection of programs is operating within a region of program space where syntax-semantics correlation holds to a reasonable degree. The Reduct library is used to place programs into ENF.
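Schematically, the Efficient Pragmatic General Intelligence entry above amounts to a resource-normalized weighted sum. The notation below is a paraphrase of the verbal definition, not the book's exact formalism:

    \Pi_{\mathrm{eff}}(\pi) \;=\; \sum_{(g,\,e)} \nu(g,e)\, \frac{\Psi_{\pi}(g,e)}{R_{\pi}(g,e)}

where \nu is the assumed weighting measure over (goal, environment) pairs, \Psi_{\pi}(g,e) is agent \pi's normalized degree of achievement of goal g in environment e, and R_{\pi}(g,e) is the (space and time) computational resource expenditure involved in achieving it.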
• Embodied Communication Prior: The class of prior distributions over (goal, environment) pairs that are imposed by placing an intelligent system in an environment where most of its tasks involve controlling a spatially localized body in a complex world, and interacting with other intelligent spatially localized bodies. It is hypothesized that many key aspects of human-like intelligence (e.g. the use of different subsystems for different memory types, and cognitive synergy between the dynamics associated with these subsystems) are consequences of this prior assumption. This is related to the Mind-World Correspondence Principle.
• Embodiment: Colloquially, in an OpenCog context, this usually means the use of an AI software system to control a spatially localized body in a complex (usually 3D) world. There are also possible "borderline cases" of embodiment, such as a search agent on the Internet. In a sense any AI is embodied, because it occupies some physical system (e.g. computer hardware) and has some way of interfacing with the outside world.
• Emergence: A property or pattern in a system is emergent if it arises via the combination of other system components or aspects, in such a way that its details would be very difficult (not necessarily impossible in principle) to predict from these other system components or aspects.
• Emotion: Emotions are system-wide responses to the system's current and predicted state. Dörner's Psi theory of emotion contains explanations of many human emotions in terms of underlying dynamics and motivations, and most of these explanations make sense in a CogPrime context, due to CogPrime's use of OpenPsi (modeled on Psi) for motivation and action selection.
• Episodic Knowledge: Knowledge about episodes in an agent's life-history, or the life-history of other agents. CogPrime includes a special dimensional embedding space only for episodic knowledge, easing organization and recall.
• Evolutionary Learning: Learning that proceeds via the rough process of iterated differential reproduction based on fitness, incorporating variations of reproduced entities. MOSES is an explicitly evolutionary-learning-based portion of CogPrime; but CogPrime's dynamics as a whole may also be conceived as evolutionary.
• Exemplar (in the context of imitation learning): When the owner wants to teach an OpenCog-controlled agent a behavior by imitation, he/she gives the pet an exemplar. To teach a virtual pet "fetch", for instance, the owner throws a stick, runs to it, grabs it with his/her mouth and comes back to his/her initial position.
• Exemplar (in the context of MOSES): A candidate chosen as the core of a new deme, or as the central program within a deme, to be varied by representation building for ongoing exploration of program space.
• Explicit Knowledge Representation: Knowledge representation in which individual, easily humanly identifiable pieces of knowledge correspond to individual elements in a knowledge store (elements that are explicitly there in the software and accessible via very rapid, deterministic operations).
• Extension: In PLN, the extension of a node refers to the instances of the category that the node represents. In contrast is the intension. (A toy illustration of extensional similarity follows these entries.)
• Fishgram (Frequent and Interesting Sub-hypergraph Mining): A pattern mining algorithm for identifying frequent and/or interesting sub-hypergraphs in the Atomspace.
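The Extension entry (and the Similarity entry later in this glossary) can be illustrated set-theoretically. In PLN proper, ExtensionalSimilarity is the probability P(A and B) / P(A or B); for finite crisp member sets this reduces to the Jaccard index used in the sketch below. The same computation applied to property sets rather than member sets would illustrate IntensionalSimilarity.

    def extensional_similarity(members_a: set, members_b: set) -> float:
        # Set-based illustration of ExtensionalSimilarity: overlap of instances.
        # In PLN proper this is P(A and B) / P(A or B); for finite crisp sets
        # that reduces to the Jaccard index computed here.
        union = members_a | members_b
        if not union:
            return 0.0
        return len(members_a & members_b) / len(union)

    cats = {"tom", "felix", "garfield"}
    pets = {"tom", "felix", "rex"}
    print(extensional_similarity(cats, pets))  # 2 shared / 4 total = 0.5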
• First-Order Inference (FOI): The subset of PLN that handles Logical Links not involving VariableAtoms or higher-order functions. The other aspect of PLN, Higher-Order Inference, uses Truth Value formulas derived from First-Order Inference.
• Forgetting: The process of removing Atoms from the in-RAM portion of the Atomspace, when RAM gets short and they are judged not as valuable to retain in RAM as other Atoms. This is commonly done using the LTI values of the Atoms (removing the lowest-LTI Atoms, or more complex strategies involving the LTI of groups of interconnected Atoms). May be done by a dedicated Forgetting MindAgent. VLTI may be used to determine the fate of forgotten Atoms.
• Forward Chainer: A control mechanism (MindAgent) for PLN inference, that works by taking existing Atoms and deriving conclusions from them using PLN rules, and then iterating this process. The goal is to derive new Atoms that are interesting according to some given criterion.
• Frame2Atom: A simple system of hand-coded rules for translating the output of RelEx2Frame (logical representation of semantic relationships using FrameNet relationships) into Atoms.
• Freezing: Saving Atoms from the in-RAM Atomspace to disk.
• General Intelligence: Often used in an informal, commonsensical sense, to mean the ability to learn and generalize beyond specific problems or contexts. Has been formalized in various ways as well, including formalizations of the notions of "achieving complex goals in complex environments" and "achieving complex goals in complex environments using limited resources." Usually interpreted as a fuzzy concept, according to which absolutely general intelligence is physically unachievable, and humans have a significant level of general intelligence, but far from the maximally physically achievable degree.
• Generalized Hypergraph: A hypergraph with some additional features, such as links that point to links, and nodes that are seen as "containing" whole sub-hypergraphs. This is the most natural and direct way to mathematically/visually model the Atomspace.
• Generator: In the PLN design, a rule is denoted a generator if it can produce its consequent without needing premises (e.g. LookupRule, which just looks its consequent up in the AtomSpace). See Composer.
• Global, Distributed Memory: Memory that stores items as implicit knowledge, with each memory item spread across multiple components, stored as a pattern of organization or activity among them.
• Glocal Memory: The storage of items in memory in a way that involves both localized and global, distributed aspects.
• Goal: An Atom representing a function that a system (like OpenCog) is supposed to spend a certain non-trivial percentage of its attention optimizing. The goal, informally speaking, is to maximize the Atom's truth value.
• Goal, Implicit: A goal that an intelligent system, in practice, strives to achieve; but that is not explicitly represented as a goal in the system's knowledge base.
• Goal, Explicit: A goal that an intelligent system explicitly represents in its knowledge base, and expends some resources trying to achieve. Goal Nodes (which may be Nodes or, e.g., ImplicationLinks) are used for this purpose in OpenCog.
• Goal-Driven Learning: Learning that is driven by the cognitive schematic, i.e. by the quest of figuring out which procedures can be expected to achieve a certain goal in a certain sort of context.
• Grounded SchemaNode: See Schema, Grounded.
• Hebbian Learning: An aspect of Attention Allocation, centered on creating and updating HebbianLinks, which represent the simultaneous importance of the Atoms joined by the HebbianLink. (A toy update rule follows these entries.)
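A toy version of the update underlying the Hebbian Learning entry: nudge a symmetric HebbianLink's strength up when both endpoints are in the AttentionalFocus together, and down when only one of them is. The real Attention Allocation formulas differ in detail; this is only the qualitative shape.

    def update_hebbian_strength(current: float, a_in_focus: bool, b_in_focus: bool,
                                learning_rate: float = 0.1) -> float:
        # Move link strength toward 1 on co-occurrence in the AttentionalFocus,
        # toward 0 when only one endpoint is attended.
        if a_in_focus and b_in_focus:
            target = 1.0
        elif a_in_focus or b_in_focus:
            target = 0.0
        else:
            return current  # neither attended: no evidence either way
        return current + learning_rate * (target - current)

    s = 0.5
    for _ in range(10):               # ten co-occurrences in the focus
        s = update_hebbian_strength(s, True, True)
    print(round(s, 3))                # 0.826: strength drifts upward toward 1.0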
• Hebbian Links: Links recording information about the associative relationship (co-occurrence) between Atoms. These include symmetric and asymmetric HebbianLinks.
• Heterarchical Network: A network of linked elements in which the semantic relationships associated with the links are generally symmetrical (e.g. they may be similarity links, or symmetrical associative links). This is one important sort of subnetwork of an intelligent system; see Dual Network.
• Hierarchical Network: A network of linked elements in which the semantic relationships associated with the links are generally asymmetrical, and the parent nodes of a node have a more general scope and some measure of control over their children (though there may be important feedback dynamics too). This is one important sort of subnetwork of an intelligent system; see Dual Network.
• Higher-Order Inference (HOI): PLN inference involving variables or higher-order functions. In contrast to First-Order Inference (FOI).
• Hillclimbing: A general term for greedy, local optimization techniques, including some relatively sophisticated ones that involve "mildly nonlocal" jumps.
• Human-Level Intelligence: General intelligence that's "as smart as" human general intelligence, even if in some respects quite unlike human intelligence. An informal concept, which generally doesn't come up much in CogPrime work, but is used frequently by some other AI theorists.
• Human-Like Intelligence: General intelligence with properties and capabilities broadly resembling those of humans, but not necessarily precisely imitating human beings.
• Hypergraph: A conventional hypergraph is a collection of nodes and links, where each link may span any number of nodes. OpenCog makes use of generalized hypergraphs (the Atomspace is one of these).
• Imitation Learning: Learning via copying what some other agent is observed to do.
• Implication: Often refers to an ImplicationLink between two PredicateNodes, indicating an (extensional, intensional or mixed) logical implication.
• Implicit Knowledge Representation: Representation of knowledge via having easily humanly identifiable pieces of knowledge correspond to the pattern of organization and/or dynamics of elements, rather than via having individual elements correspond to easily humanly identifiable pieces of knowledge.
• Importance: A generic term for the Attention Values associated with Atoms. Most commonly these are STI (short term importance) and LTI (long term importance) values. Other importance values corresponding to various different time scales are also possible. In general an importance value reflects an estimate of the likelihood an Atom will be useful to the system over some particular future time-horizon. STI is generally relevant to processor time allocation, whereas LTI is generally relevant to memory allocation.
• Importance Decay: The process of Atom importance values (e.g. STI and LTI) decreasing over time, if the Atoms are not utilized. Importance decay rates may in general be context-dependent. (A minimal decay rule follows these entries.)
• Importance Spreading: A synonym for Importance Updating, intended to highlight the similarity with "activation spreading" in neural and semantic networks.
• Importance Updating: The CIM-Dynamic that periodically (frequently) updates the STI and LTI values of Atoms based on their recent activity and their relationships.
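A minimal version of the Importance Decay entry, assuming simple exponential decay; as noted above, real decay rates are context-dependent, so the fixed half-life here is purely illustrative.

    import math

    def decay_importance(value: float, dt: float, half_life: float) -> float:
        # Exponential decay of an importance value (STI or LTI) over dt time
        # units, applied when the Atom receives no stimulation.
        return value * math.exp(-math.log(2.0) * dt / half_life)

    sti = 100.0
    sti = decay_importance(sti, dt=20.0, half_life=10.0)
    print(round(sti, 6))  # 25.0: two half-lives have elapsed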
• Imprecise Truth Value: Peter Walley's imprecise truth values are intervals [L, U], interpreted as lower and upper bounds of the means of probability distributions in an envelope of distributions. In general, the term may be used to refer to any truth value involving intervals or related constructs, such as indefinite probabilities.
• Indefinite Probability: An extension of a standard imprecise probability, comprising a credible interval for the means of probability distributions governed by a given second-order distribution.
• Indefinite Truth Value: An OpenCog TruthValue object wrapping up an indefinite probability.
• Induction: In PLN, a specific inference rule (A → B, A → C, therefore B → C). In general, the process of heuristically inferring that what has been seen in multiple examples will be seen again in new examples. Induction in the broad sense may be carried out in OpenCog by methods other than PLN induction. When emphasis needs to be laid on the particular PLN inference rule, the phrase "PLN Induction" is used. (A sketch of the strength computation follows these entries.)
• Inference: Generally speaking, the process of deriving conclusions from assumptions. In an OpenCog context, this often refers to the PLN inference system. Inference in the broad sense is distinguished from general learning via some specific characteristics, such as the intrinsically incremental nature of inference: it proceeds step by step.
• Inference Control: A cognitive process that determines what logical inference rule (e.g. what PLN rule) is applied to what data, at each point in the dynamic operation of an inference process.
• Integrative AGI: An AGI architecture, like CogPrime, that relies on a number of different powerful, reasonably general algorithms all cooperating together. This is different from an AGI architecture that is centered on a single algorithm, and also different from an AGI architecture that expects intelligent behavior to emerge from the collective interoperation of a number of simple elements (without any sophisticated algorithms coordinating their overall behavior).
• Integrative Cognitive Architecture: A cognitive architecture intended to support integrative AGI.
• Intelligence: An informal, natural language concept. "General intelligence" is one slightly more precise specification of a related concept; "Universal intelligence" is a fully precise specification of a related concept. Other specifications of related concepts made in the particular context of CogPrime research are the pragmatic general intelligence and the efficient pragmatic general intelligence.
• Intension: In PLN, the intension of a node consists of Atoms representing properties of the entity the node represents.
• Intentional Memory: A system's knowledge of its goals and their subgoals, and associations between these goals and procedures and contexts (e.g. cognitive schematics).
• Internal Simulation World: A simulation engine used to simulate an external environment (which may be physical or virtual), used by an AGI system as its "mind's eye" in order to experiment with various action sequences and envision their consequences, or observe the consequences of various hypothetical situations. Particularly important for dealing with episodic knowledge.
• Interval Algebra: Allen Interval Algebra, a mathematical theory of the relationships between time intervals. CogPrime utilizes a fuzzified version of classic Interval Algebra.
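Building on the deduction sketch given under the Deduction entry, PLN handles the Induction rule by inverting one premise via Bayes' rule and then deducing. Again this is a simplification of the full machinery, with all numbers in the example purely illustrative.

    def deduction_strength(sAB, sBC, sB, sC):
        # Repeated from the Deduction sketch earlier in this glossary.
        return sAB * sBC + (1.0 - sAB) * (sC - sB * sBC) / (1.0 - sB)

    def inversion_strength(sAB, sA, sB):
        # Bayes' rule: from P(B|A) obtain P(A|B) = P(B|A) * P(A) / P(B).
        return sAB * sA / sB

    def induction_strength(sAB, sAC, sA, sB, sC):
        # PLN induction as inversion followed by deduction:
        # from A->B and A->C, infer B->C (A is the shared middle term).
        sBA = inversion_strength(sAB, sA, sB)
        return deduction_strength(sBA, sAC, sA, sC)

    # E.g. A = raven, B = bird, C = black: ravens are birds (0.99) and ravens
    # are black (0.9), with illustrative term probabilities for the nodes.
    print(induction_strength(0.99, 0.9, sA=0.01, sB=0.05, sC=0.2))
    # about 0.33: a weak "birds are black" tendency, above the 0.2 base rate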
• IRC Learning (Imitation, Reinforcement, Correction): Learning via interaction with a teacher, involving a combination of imitating the teacher, getting explicit reinforcement signals from the teacher, and having one's incorrect or suboptimal behaviors corrected by the teacher in real time. This is a large part of how young humans learn.
• Knowledge Base: A shorthand for the totality of knowledge possessed by an intelligent system during a certain interval of time (whether or not this knowledge is explicitly represented). Put differently: this is an intelligence's total memory contents (inclusive of all types of memory) during an interval of time.
• Language Comprehension: The process of mapping natural language speech or text into a more "cognitive", largely language-independent representation. In OpenCog this has been done by various pipelines consisting of dedicated natural language processing tools, e.g. the pipeline text → Link Parser → RelEx → RelEx2Frame → Frame2Atom → Atomspace; and alternatively the pipeline text → Link Parser → Link2Atom → Atomspace. It would also be possible to do language comprehension purely via PLN and other generic OpenCog processes, without using specialized language processing tools.
• Language Generation: The process of mapping (largely language-independent) cognitive content into speech or text. In OpenCog this has been done by various pipelines consisting of dedicated natural language processing tools, e.g. the pipeline Atomspace → NLGen → text; or more recently Atomspace → Atom2Link → surface realization → text. It would also be possible to do language generation purely via PLN and other generic OpenCog processes, without using specialized language processing tools.
• Language Processing: Processing of human language is decomposed, in CogPrime, into Language Comprehension, Language Generation, and Dialogue Control.
• Learning: In general, the process of a system adapting based on experience, in a way that increases its intelligence (its ability to achieve its goals). The theory underlying CogPrime doesn't distinguish learning from reasoning, associating, or other aspects of intelligence.
• Learning Server: In some OpenCog configurations, this refers to a software server that performs "offline" learning tasks (e.g. using MOSES or hillclimbing), and is in communication with an Operational Agent Controller software server that performs real-time agent control and dispatches learning tasks to and receives results from the Learning Server.
• Linguistic Links: A catch-all term for Atoms explicitly representing linguistic content, e.g. WordNode, SentenceNode, CharacterNode.
• Link: A type of Atom, representing a relationship among one or more Atoms. Links and Nodes are the two basic kinds of Atoms.
• Link Parser: A natural language syntax parser, created by Sleator and Temperley at Carnegie-Mellon University, and currently used as part of OpenCogPrime's natural language comprehension and natural language generation system.
• Link2Atom: A system for translating link parser links into Atoms. It attempts to resolve precisely as much ambiguity as needed in order to translate a given assemblage of link parser links into a unique Atom structure.
• Lobe: A term sometimes used to refer to a portion of a distributed Atomspace that lives in a single computational process. Often different lobes will live on different machines.
• Localized Memory: Memory that stores each item using a small number of closely-connected elements.
• Logic: In an OpenCog context, this usually refers to a set of formal rules for translating certain combinations of Atoms into "conclusion" Atoms. The paradigm case at present is the PLN probabilistic logic system, but OpenCog can also be used together with other logics.
• Logical Links: Any Atoms whose truth values are primarily determined or adjusted via logical rules, e.g. PLN's InheritanceLink, SimilarityLink, ImplicationLink, etc. The term isn't usually applied to other links like HebbianLinks whose semantics isn't primarily logic-based, even though these other links can be processed via (e.g. PLN) logical inference via interpreting them logically.
• Lojban: A constructed human language, with a completely formalized syntax and a highly formalized semantics, and a small but active community of speakers. In principle this seems an extremely good method for communication between humans and early-stage AGI systems.
• Lojban++: A variant of Lojban that incorporates English words, enabling more flexible expression without the need for frequent invention of new Lojban words.
• Long Term Importance (LTI): A value associated with each Atom, indicating roughly the expected utility to the system of keeping that Atom in RAM rather than saving it to disk or deleting it. It's possible to have multiple LTI values pertaining to different time scales, but so far practical implementation and most theory has centered on the option of a single LTI value.
• LTI: Long Term Importance.
• Map: A collection of Atoms that are interconnected in such a way that they tend to be commonly active (i.e. to have high STI, e.g. enough to be in the AttentionalFocus, at the same time).
• Map Encapsulation: The process of automatically identifying maps in the Atomspace, and creating Atoms that "encapsulate" them; the Atom encapsulating a map would link to all the Atoms in the map. This is a way of making global memory into local memory, thus making the system's memory glocal and explicitly manifesting the "cognitive equation." This may be carried out via a dedicated MapEncapsulation MindAgent. (A toy map-detection sketch follows these entries.)
• Map Formation: The process via which maps form in the Atomspace. This need not be explicit; maps may form implicitly via the action of Hebbian Learning. It will commonly occur that Atoms frequently co-occurring in the AttentionalFocus will come to be joined together in a map.
• Memory Types: In CogPrime this generally refers to the different types of memory that are embodied in different data structures or processes in the CogPrime architecture, e.g. declarative (semantic), procedural, attentional, intentional, episodic, sensorimotor.
• Mind-World Correspondence Principle: The principle that, for a mind to display efficient pragmatic general intelligence relative to a world, it should display many of the same key structural properties as that world. This can be formalized by modeling the world and mind as probabilistic state transition graphs, and saying that the categories implicit in the state transition graphs of the mind and world should be inter-mappable via a high-probability morphism.
• Mind OS: A synonym for the OpenCog Core.
• MindAgent: An OpenCog software object, residing in the CogServer, that carries out some processes in interaction with the Atomspace. A given conceptual cognitive process (e.g. PLN inference, attention allocation, etc.) may be carried out by a number of different MindAgents designed to work together.
• Mindspace: A model of the set of states of an intelligent system as a geometrical space, imposed by assuming some metric on the set of mind-states. This may be used as a tool for formulating general principles about the dynamics of generally intelligent systems.
• Modulators: Parameters in the Psi model of motivated, emotional cognition, that modulate the way a system perceives, reasons about and interacts with the world.
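A toy rendering of map detection, the first half of the Map Encapsulation entry: find sets of Atoms that keep showing up in the AttentionalFocus together. For brevity this sketch only mines pairs; real map mining looks for larger and fuzzier clusters, and the threshold is invented for illustration.

    from collections import Counter
    from itertools import combinations

    def find_maps(focus_snapshots, min_cooccurrence=3):
        # Count how often each pair of Atoms co-occurs in the AttentionalFocus;
        # pairs above threshold are candidate maps to encapsulate.
        counts = Counter()
        for focus in focus_snapshots:
            for pair in combinations(sorted(focus), 2):
                counts[pair] += 1
        return [pair for pair, n in counts.items() if n >= min_cooccurrence]

    snapshots = [{"cat", "purr", "fur"}, {"cat", "purr"},
                 {"cat", "purr", "milk"}, {"dog", "bark"}]
    print(find_maps(snapshots))  # [('cat', 'purr')]: a candidate map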
• MOSES (Meta-Optimizing Semantic Evolutionary Search): An algorithm for procedure learning, which in the current implementation learns programs in the Combo language. MOSES is an evolutionary learning system, which differs from typical genetic programming systems in multiple aspects, including: a subtler framework for managing multiple "demes" or "islands" of candidate programs; a library called Reduct for placing programs in Elegant Normal Form; and the use of probabilistic modeling in place of, or in addition to, mutation and crossover as means of determining which new candidate programs to try. (A highly simplified sketch of the control loop follows these entries.)
• Motoric: Pertaining to the control of physical actuators, e.g. those connected to a robot. May sometimes be used to refer to the control of movements of a virtual character as well.
• Moving Bubble of Attention: The Attentional Focus of a CogPrime system.
• Natural Language Comprehension: See Language Comprehension.
• Natural Language Generation: See Language Generation.
• Natural Language Processing (NLP): See Language Processing.
• NLGen: Software for carrying out the surface realization phase of natural language generation, via translating collections of RelEx output relationships into English sentences. Was made functional for simple sentences and some complex sentences; not currently under active development, as work has shifted to the related Atom2Link approach to language generation.
• Node: A type of Atom. Links and Nodes are the two basic kinds of Atoms. Nodes, mathematically, can be thought of as "0-ary" links. Some types of Nodes refer to external or mathematical entities (e.g. WordNode, NumberNode); others are purely abstract, e.g. a ConceptNode is characterized purely by the Links relating it to other Atoms. GroundedPredicateNodes and GroundedSchemaNodes connect to explicitly represented procedures (sometimes in the Combo language); ungrounded PredicateNodes and SchemaNodes are abstract and, like ConceptNodes, purely characterized by their relationships.
• Node Probability: Many PLN inference rules rely on probabilities associated with Nodes. Node probabilities are often easiest to interpret in a specific context, e.g. the probability P(cat) makes obvious sense in the context of a typical American house, or in the context of the center of the sun. Without any contextual specification, P(A) is taken to mean the probability that a randomly chosen occasion of the system's experience includes some instance of A.
• Novamente Cognition Engine (NCE): A proprietary proto-AGI software system, the predecessor to OpenCog. Many parts of the NCE were open-sourced to form portions of OpenCog, but some NCE code was not included in OpenCog; and now OpenCog includes multiple aspects and plenty of code that was not in the NCE.
• OpenCog: A software framework intended for development of AGI systems, and also for narrow-AI applications using tools that have AGI applications. Co-designed with the CogPrime cognitive architecture, but not exclusively bound to it.
• OpenCog Prime (OCP): The implementation of the CogPrime cognitive architecture within the OpenCog software framework.
• OpenPsi: CogPrime's architecture for motivation-driven action selection, which is based on adapting Dörner's Psi model for use in the OpenCog framework.
• Operational Agent Controller (OAC): In some OpenCog configurations, this is a software server containing a CogServer devoted to real-time control of an agent (e.g. a virtual world agent, or a robot). Background, offline learning tasks may then be dispatched to other software processes, e.g. to a Learning Server.
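The MOSES entry above compresses a lot of machinery. The deliberately stripped-down sketch below shows only the overall deme-centered control shape, replacing representation building and the probabilistic modeling step with plain random single-bit variation over bit-tuple "programs". Everything here is illustrative, not the actual MOSES code.

    import random
    random.seed(0)  # deterministic toy run

    def reduce_to_normal_form(program):
        # Stand-in for Reduct/ENF: here "programs" are bit-tuples, already normal.
        return tuple(program)

    def build_variations(exemplar, k):
        # Stand-in for representation building: k single-bit variations.
        out = []
        for _ in range(k):
            p = list(exemplar)
            i = random.randrange(len(p))
            p[i] = 1 - p[i]
            out.append(reduce_to_normal_form(p))
        return out

    def moses_like_search(fitness, length=8, generations=20, deme_size=30):
        # Overall deme shape: vary the exemplar, normalize, keep the best,
        # re-center. (Real MOSES also builds probabilistic models of the elite.)
        exemplar = reduce_to_normal_form([0] * length)
        for _ in range(generations):
            deme = build_variations(exemplar, deme_size)
            best = max(deme, key=fitness)
            if fitness(best) > fitness(exemplar):
                exemplar = best  # re-center the deme on the improvement
        return exemplar

    print(moses_like_search(sum))  # typically converges to the all-ones tuple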
• Pattern: In a CogPrime context, the term "pattern" is generally used to refer to a process that produces some entity, and is judged simpler than that entity.
• Pattern Mining: The process of extracting an (often large) number of patterns from some body of information, subject to some criterion regarding which patterns are of interest. Often (but not exclusively) it refers to algorithms that are rapid or "greedy", finding a large number of simple patterns relatively inexpensively.
• Pattern Recognition: The process of identifying and representing a pattern in some substrate (e.g. some collection of Atoms, or some raw perceptual data, etc.).
• Patternism: The philosophical principle holding that, from the perspective of engineering intelligent systems, it is sufficient and useful to think about mental processes in terms of (static and dynamical) patterns.
• Perception: The process of understanding data from sensors. When natural language is ingested in textual format, this is generally not considered perceptual. Perception may be taken to encompass pre-processing that prepares sensory data for ingestion into the Atomspace, processing via specialized perception processing systems like DeSTIN that are connected to the Atomspace, and more cognitive-level processes within the Atomspace that are oriented toward understanding what has been sensed.
• Piagetan Stages: A series of stages of cognitive development hypothesized by developmental psychologist Jean Piaget, which are easy to interpret in the context of developing CogPrime systems. The basic stages are: Infantile, Pre-operational, Concrete Operational and Formal. Post-formal stages have been discussed by theorists since Piaget and seem relevant to AGI, especially advanced AGI systems capable of strong self-modification.
• PLN: Short for Probabilistic Logic Networks.
• PLN, First-Order: See First-Order Inference.
• PLN, Higher-Order: See Higher-Order Inference.
• PLN Rules: A PLN Rule takes as input one or more Atoms (the "premises", usually Links), and outputs an Atom that is a "logical conclusion" of those Atoms. The truth value of the conclusion is determined by a PLN Formula associated with the Rule.
• PLN Formulas: A PLN Formula, corresponding to a PLN Rule, takes the TruthValues corresponding to the premises and produces the TruthValue corresponding to the conclusion. A single Rule may correspond to multiple Formulas, where each Formula deals with a different sort of TruthValue. (A sketch of the Rule/Formula pairing follows these entries.)
• Pragmatic General Intelligence: A formalization of the concept of general intelligence, based on the concept that general intelligence is the capability to achieve goals in environments, calculated as a weighted average over some fuzzy set of goals and environments.
• Predicate Evaluation: The process of determining the Truth Value of a predicate, embodied in a PredicateNode. This may be recursive, as the predicate referenced internally by a GroundedPredicateNode (and represented via a Combo program tree) may itself internally reference other PredicateNodes.
• Probabilistic Logic Networks (PLN): A mathematical and conceptual framework for reasoning under uncertainty, integrating aspects of predicate and term logic with extensions of imprecise probability theory. OpenCogPrime's central tool for symbolic reasoning.
• Procedural Knowledge: Knowledge regarding which series of actions (or action-combinations) are useful for an agent to undertake in which circumstances. In CogPrime these may be learned in a number of ways, e.g. via PLN or via Hebbian learning of Schema Maps, or via explicit learning of Combo programs via MOSES or hillclimbing. Procedures are represented as SchemaNodes or Schema Maps.
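The PLN Rules and PLN Formulas entries describe a division of labor that a tiny sketch can make concrete: the Rule checks premise shape, the Formula computes the conclusion's TruthValue. The dictionaries standing in for Links, and the confidence discount used, are invented for illustration.

    from dataclasses import dataclass

    @dataclass
    class STV:
        strength: float
        confidence: float

    def deduction_formula(ab: STV, bc: STV, sB: float, sC: float) -> STV:
        # A PLN Formula: premise TruthValues in, conclusion TruthValue out.
        # Strength as in the independence-based deduction sketch earlier; the
        # confidence combination (a discounted minimum) is only a placeholder.
        s = ab.strength * bc.strength + \
            (1 - ab.strength) * (sC - sB * bc.strength) / (1 - sB)
        c = 0.9 * min(ab.confidence, bc.confidence)
        return STV(s, c)

    class DeductionRule:
        # A PLN Rule: checks that premises have the right shape (A->B, B->C)
        # and delegates the TruthValue computation to its Formula.
        def apply(self, link_ab, link_bc, sB, sC):
            a, b1 = link_ab["out"]
            b2, c = link_bc["out"]
            assert b1 == b2, "premises must share the middle term"
            tv = deduction_formula(link_ab["tv"], link_bc["tv"], sB, sC)
            return {"type": "InheritanceLink", "out": (a, c), "tv": tv}

    ab = {"type": "InheritanceLink", "out": ("cat", "mammal"), "tv": STV(0.98, 0.9)}
    bc = {"type": "InheritanceLink", "out": ("mammal", "animal"), "tv": STV(0.99, 0.9)}
    print(DeductionRule().apply(ab, bc, sB=0.1, sC=0.15)["tv"])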
• Procedure Evaluation/Execution: A general term encompassing both Schema Execution and Predicate Evaluation, both of which are similar computational processes involving manipulation of Combo trees associated with ProcedureNodes.
• Procedure Learning: Learning of procedural knowledge, based on any method, e.g. evolutionary learning (e.g. MOSES), inference (e.g. PLN), or reinforcement learning (e.g. Hebbian learning).
• Procedure Node: A SchemaNode or PredicateNode.
• Psi: A model of motivated action and emotion, originated by Dietrich Dörner and further developed by Joscha Bach, who incorporated it in his proto-AGI system MicroPsi. OpenCogPrime's motivated-action component, OpenPsi, is roughly based on the Psi model.
• Psynese: A system enabling different OpenCog instances to communicate without using natural language, via directly exchanging Atom subgraphs, using a special system to map references in the speaker's mind into matching references in the listener's mind.
• Psynet Model: An early version of the theory of mind underlying CogPrime, referred to in some early writings on the Webmind AI Engine and Novamente Cognition Engine. The concepts underlying the psynet model are still part of the theory underlying CogPrime, but the name has been deprecated as it never really caught on.
• Reasoning: See Inference.
• Reduct: A code library, used within MOSES, applying a collection of hand-coded rewrite rules that transform Combo programs into Elegant Normal Form. (A toy rewrite example follows these entries.)
• Region Connection Calculus: A mathematical formalism describing a system of basic operations among spatial regions. Used in CogPrime as part of spatial inference to provide relations and rules to be referenced via PLN and potentially other subsystems.
• Reinforcement Learning: Learning procedures via experience, in a manner explicitly guided to cause the learning of procedures that will maximize the system's expected future reward. CogPrime does this implicitly whenever it tries to learn procedures that will maximize some Goal whose Truth Value is estimated via an expected reward calculation (where "reward" may mean simply the Truth Value of some Atom defined as "reward"). Goal-driven learning is more general than reinforcement learning as thus defined; and the learning that CogPrime does, which is only partially goal-driven, is yet more general.
• RelEx: A software system used in OpenCog as part of natural language comprehension, to map the output of the link parser into more abstract semantic relationships. These more abstract relationships may then be entered directly into the Atomspace, or they may be further abstracted before being entered into the Atomspace, e.g. by RelEx2Frame rules.
• RelEx2Frame: A system of rules for translating RelEx output into Atoms, based on the FrameNet ontology. The output of the RelEx2Frame rules makes use of the FrameNet library of semantic relationships. The current (2012) RelEx2Frame rule-base is problematic, and the RelEx2Frame system is deprecated as a result, in favor of Link2Atom. However, the ideas embodied in these rules may be useful; if cleaned up, the rules might profitably be ported into the Atomspace as ImplicationLinks.
• Representation Building: A stage within MOSES, wherein a candidate Combo program tree (within a deme) is modified by replacing one or more tree nodes with alternative tree nodes, thus obtaining a new, different candidate program within that deme. This process currently relies on hand-coded knowledge regarding which types of tree nodes a given tree node should be experimentally replaced with (e.g. an AND node might sensibly be replaced with an OR node, but not so sensibly replaced with a node representing a "kick" action).
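A toy flavor of what Reduct's rewrite rules do (see the Reduct entry above), applied to boolean expressions represented as nested Python tuples. Real Reduct operates on Combo trees with a much larger hand-coded rule set and a genuine canonical form; this sketch applies just two sample rules.

    def simplify(expr):
        # Toy Reduct-style rewriting: flatten nested ANDs/ORs and drop
        # duplicate arguments. Leaves (variables) are plain strings.
        if not isinstance(expr, tuple):
            return expr
        op, *args = expr
        args = [simplify(a) for a in args]
        if op in ("and", "or"):
            flat = []
            for a in args:
                if isinstance(a, tuple) and a[0] == op:
                    flat.extend(a[1:])   # flatten and(x, and(y, z)) -> and(x, y, z)
                else:
                    flat.append(a)
            seen, uniq = set(), []
            for a in flat:               # drop duplicates: and(x, x) -> x
                if a not in seen:
                    seen.add(a)
                    uniq.append(a)
            return uniq[0] if len(uniq) == 1 else (op, *uniq)
        return (op, *args)

    print(simplify(("and", "x", ("and", "y", "x"))))  # ('and', 'x', 'y')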
• Request for Services (RFS): In CogPrime's goal-driven action system, an RFS is a package sent from a Goal Atom to another Atom, offering it a certain amount of STI currency if it is able to deliver what the goal wants (an increase in the goal's Truth Value). RFSs may be passed on, e.g. from goals to subgoals to sub-subgoals, but eventually an RFS reaches a Grounded SchemaNode, and when the corresponding Schema is executed, the payment implicit in the RFS is made.
• Robot Preschool: An AGI Preschool in our physical world, intended for robotically embodied AGIs.
• Robotic Embodiment: Using an AGI to control a robot. The AGI may be running on hardware physically contained in the robot, or may run elsewhere and control the robot via networking methods such as wifi.
• Scheduler: Part of the CogServer that controls which processes (e.g. which MindAgents) get processor time, at which points in time.
• Schema: A "script" describing a process to be carried out. This may be explicit, as in the case of a GroundedSchemaNode, or implicit, as in the case of Schema Maps or ungrounded SchemaNodes.
• Schema Encapsulation: The process of automatically recognizing a Schema Map in an Atomspace, creating a Combo (or other) program embodying the process carried out by this Schema Map, and then storing this program in the Procedure Repository and associating it with a particular SchemaNode. This translates distributed, global procedural memory into localized procedural memory. It's a special case of Map Encapsulation.
• Schema Execution: The process of "running" a Grounded Schema, similar to running a computer program. Or, phrased alternately: the process of executing the Schema referenced by a Grounded SchemaNode. This may be recursive, as the schema referenced internally by a Grounded SchemaNode (and represented via a Combo program tree) may itself internally reference other Grounded SchemaNodes.
• Schema, Grounded: A Schema that is associated with a specific executable program (either a Combo program or, say, C++ code).
• Schema Map: A collection of Atoms, including SchemaNodes, that tend to be enacted in a certain order (or set of orders), thus habitually enacting the same process. This is a distributed, globalized way of storing and enacting procedures.
• Schema, Ungrounded: A Schema that represents an abstract procedure, not associated with any particular executable program.
• Schematic Implication: A general, conceptual name for implications of the form ((Context AND Procedure) IMPLIES Goal).
• SegSim: A name for the main algorithm underlying the NLGen language generation software. The algorithm is based on segmenting a collection of Atoms into small parts, and matching each part against memory to find, for each part, cases where similar Atom-collections already have known linguistic expression.
• Self-Modification: A term generally used for AI systems that can purposefully modify their core algorithms and representations. Formally and crisply distinguishing this sort of "strong self-modification" from "mere" learning is a tricky matter.
• Sensorimotor: Pertaining to sensory data, motoric actions, and their combination and intersection.
• Sensory: Pertaining to data received by the AGI system from the outside world. In a CogPrime system that perceives language directly as text, the textual input will generally not be considered as "sensory" (on the other hand, speech audio data would be considered as "sensory").
• Short Term Importance (STI): A value associated with each Atom, indicating roughly the expected utility to the system of devoting processor time to that Atom in the near future. It's possible to have multiple STI values pertaining to different time scales, but so far practical implementation and most theory has centered on the option of a single STI value.
• Similarity: A link type indicating the probabilistic similarity between two different Atoms. Generically this is a combination of Intensional Similarity (similarity of properties) and Extensional Similarity (similarity of members).
• Simple Truth Value: A TruthValue involving a pair (s, d) indicating strength s (e.g. probability or fuzzy set membership) and confidence d. d may be replaced by other options such as a count n or a weight of evidence w.
• Simulation World: See Internal Simulation World.
• SMEPH (Self-Modifying Evolving Probabilistic Hypergraphs): A style of modeling systems, in which each system is associated with a derived hypergraph.
• SMEPH Edge: A link in a SMEPH derived hypergraph, indicating an empirically observed relationship (e.g. inheritance or similarity) between two SMEPH Vertices.
• SMEPH Vertex: A node in a SMEPH derived hypergraph representing a system, indicating a collection of system states empirically observed to arise in conjunction with the same external stimuli.
• Spatial Inference: PLN reasoning including Atoms that explicitly reference spatial relationships.
• Spatiotemporal Inference: PLN reasoning including Atoms that explicitly reference spatial and temporal relationships.
• STI: Shorthand for Short Term Importance.
• Strength: The main component of a TruthValue object, lying in the interval [0,1], referring either to a probability (in cases like InheritanceLink, SimilarityLink, EquivalenceLink, ImplicationLink, etc.) or a fuzzy value (as in MemberLink, EvaluationLink).
• Strong Self-Modification: Generally used as synonymous with Self-Modification, in a CogPrime context.
• Subsymbolic: Involving processing of data using elements that have no correspondence to natural language terms, nor abstract concepts; and that are not naturally interpreted as symbolically "standing for" other things. Often used to refer to processes such as perception processing or motor control, which are concerned with entities like pixels or commands like "rotate servomotor 15 by 10 degrees theta and 55 degrees phi." The distinction between "symbolic" and "subsymbolic" is conventional in the history of AI, but seems difficult to formalize rigorously; logic-based AI systems, for example, are typically considered "symbolic".
• Supercompilation: A technique for program optimization, which globally rewrites a program into a usually very different looking program that does the same thing. A prototype supercompiler was applied to Combo programs with successful results.
• Surface Realization: The process of taking a collection of Atoms and transforming them into a series of words in a (usually natural) language. A stage in the overall process of language generation.
• Symbol Grounding: The mapping of a symbolic term into perceptual or motoric entities that help define the meaning of the symbolic term. For instance, the concept "cat" may be grounded by images of cats, experiences of interactions with cats, imaginations of being a cat, etc.
• Symbolic: Pertaining to the formation or manipulation of symbols, i.e. mental entities that are explicitly constructed to represent other entities. Often contrasted with subsymbolic.
• Syntax-Semantics Correlation: In the context of MOSES and program learning more broadly, this refers to the property via which distance in syntactic space (distance between the syntactic structures of programs, e.g. if they're represented as program trees) and distance in semantic space (distance between the behaviors of programs, e.g. if they're represented as sets of input/output pairs) are reasonably well correlated. This can often happen among sets of programs that are not too widely dispersed in program space. The Reduct library is used to place Combo programs in Elegant Normal Form, which increases the level of syntax-semantics correlation between them. The programs in a single MOSES deme are often closely enough clustered together that they have reasonably high syntax-semantics correlation.
• System Activity Table: An OpenCog component that records information regarding what a system did in the past.
• Temporal Inference: Reasoning that heavily involves Atoms representing temporal information, e.g. information about the duration of events, or their temporal relationships (before, after, during, beginning, ending). As implemented in CogPrime, makes use of an uncertain version of Allen Interval Algebra.
• Truth Value: A package of information associated with an Atom, indicating its degree of truth. SimpleTruthValue and IndefiniteTruthValue are two common, particular kinds. Multiple truth values associated with the same Atom from different perspectives may be grouped into CompositeTruthValue objects.
• Universal Intelligence: A technical term introduced by Shane Legg and Marcus Hutter, describing (roughly speaking) the average capability of a system to carry out computable goals in computable environments, where goal/environment pairs are weighted via the length of the shortest program for computing them. (The defining formula is recalled after these entries.)
• Urge: In OpenPsi, an Urge develops when a Demand deviates from its target range.
• Very Long Term Importance (VLTI): A bit associated with Atoms, which determines whether, when an Atom is forgotten (removed from RAM), it is saved to disk (frozen) or simply deleted.
• Virtual AGI Preschool: A virtual world intended for AGI teaching/training/learning, bearing broad resemblance to the preschool environments used for young humans.
• Virtual Embodiment: Using an AGI to control an agent living in a virtual world or game world, typically (but not necessarily) a 3D world with broad similarity to the everyday human world.
• Webmind AI Engine: A predecessor to the Novamente Cognition Engine and OpenCog, developed 1997-2001, with many similar concepts (and also some different ones) but quite different algorithms and software architecture.
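For reference, the defining formula behind the Universal Intelligence entry, as given by Legg and Hutter (stated from memory here; consult the original paper for the precise conditions):

    \Upsilon(\pi) \;=\; \sum_{\mu \in E} 2^{-K(\mu)}\, V_{\mu}^{\pi}

where E is a class of computable environments, K(\mu) is the Kolmogorov complexity of environment \mu (so simpler environments receive exponentially greater weight), and V_{\mu}^{\pi} is the expected cumulative reward agent \pi obtains in environment \mu.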
Iklé, and A. Heljakka. Probabilistic Logic Networks. Springer, 2008. Goe93a. Ben Goertzel. The Evolving Mind. Plenum, 1993. Goe93b. Ben Goertzel. The Structure of Intelligence. Springer, 1993. Goe94. Ben Goertzel. Chaotic Logic. Plenum, 1994. Goe97. Ben Goertzel. From Complexity to Creativity. Plenum Press, 1997. Goe0 1. Ben Goertzel. Creating Internet Intelligence. Plenum Press, 2001. Goe(6a. Ben Goertzel. The Hidden Pattern. Brown Walker, 2006. Goe06b. Ben Goertzel. The Hidden Pattern. Brown Walker, 2006. HOUSE_OVERSIGHT_013261
346 A Glossary Goe08. Ben Goertzel. A pragmatic path toward endowing virtually-embodied ais with human-level lin- guistic capability. IEEE World Congress on Computational Intelligence (WCCI), 2008. Goe09a. Ben Goertzel. Cognitive synergy: A universal principle of feasible general intelligence? In [CCT 2009, Hong Kong, 2009. Goe09b. Ben Goertzel. The embodied communication prior. In Proceedings of ICCI-09, Hong Kong, 2009. Goe09c. Ben Goertzel. Opencog prime: A cognitive synergy based architecture for embodied artificial general intelligence. In [CCI 2009, Hong Kong, 2009. GoelOa. Ben Goertzel. Coherent aggregated volition. Multiverse According to Ben, 2010. http://multiverseaccordingtoben.blogspot.com/2010/03/ coherent-aggregated-volition-toward.htm. Goel0b. Ben Goertzel. Opencogprime wikibook. 2010. http: //wiki.opencog.org/w/ OpenCogPrime:WikiBook. Goel0c. Ben Goertzel. Toward a formal definition of real-world general intelligence. 2010. Goel0d. Ben et al Goertzel. A general intelligence oriented architecture for embodied natural language processing. In Proc. of the Third Conf. on Artificial General Intelligence (AGI-10). Atlantis Press, 2010. Goo86. I. Good. The Estimation of Probabilities. Cambridge, MA: MIT Press, 1986. Gor86. R. Gordon. Folk psychology as simulation. Mind and Language. 1, 1.:158-171, 1986. GPC? 11. Ben Goertzel, Joel Pitt, Zhenhua Cai, Jared Wigmore, Deheng Huang, Nil Geisweiller, Ruiting Lian, and Gino Yu. Integrative general intelligence for controlling game ai in a minecraft-like environment. In Proc. of BICA 2011, 2011. GPIt 10. Ben Goertzel, Joel Pitt, Matthew Ikle, Cassio Pennachin, and Rui Liu. Glocal memory: a design principle for artificial brains and minds. Neurocomputing, April 2010. GPPGO06. Ben Goertzel, Hugo Pinto, Cassio Pennachin, and Izabela Freire Goertzel. Using dependency pars- ing and probabilistic inference to extract relationships between genes, proteins and malignancies implicit among multiple biomedical research abstracts. In Proc. of Bio-NLP 2006, 2006. GPSLO3. Ben Goertzel, Cassio Pennachin, Andre’ Senna, and Moshe Looks. An integrative architecture for artificial general intelligence. In Proceedings of IJCAI 2003, Acapulco, 2003. Gre 1. Susan Greenfield. The Private Life of the Brain. Wiley, 2001. GRM111. Erik M. Gauger, Elisabeth Rieper, John J. L. Morton, Simon C. Benjamin, and Vlatko Vedral. Sustained quantum coherence and entanglement in the avian compass. Physics Review Letters, vol. 106, no. 4, 2011. HAGO7. Markert H, Knoblauch A, and Palm G. Modelling of syntactical processing in the cortex. Biosys- tems May-Jun; 89(1-3): 300-15, 2007. Ham87. Stuart Hameroff. Ultimate Computing. North Holland, 1987. Ham10. Stuart Hameroff. The Oconscious pilotONdendritic synchrony moves through the brain to mediate consciousness. Journal of Biological Physics, 2010. Hay85. Patrick Hayes. The second naive physics manifesto. In R. Shaw & J. Bransford, editor, Formal Theories of the Commonsense World. 1985. HBO6. Jeff Hawkins and Sandra Blakeslee. On Intelligence. Brown Walker, 2006. Heb49. Donald Hebb. The organization of behavior. Wiley, 1949. Hey07. F. Heylighen. The Global Superorganism: an evolutionary-cybernetic model of the emerging net- work soctety. Social Evolution and History 6-1, 2007. HF95. P. Hayes and K. Ford. Turing test considered harmful. [/CAI-14, 1995. HGO8. David Hart and Ben Goertzel. Opencog: A software framework for integrative artificial general intelligence. 
In AGI, volume 171 of Frontiers in Artificial Intelligence and Applications, pages 468-472. IOS Press, 2008. HHPO12. Adam Hampshire, Roger Highfield, Beth Parkin, and Adrian Owen. Fractionating human intelli- gence. Neuron vol. 76 issue 6, 2012. Hib02. Bill Hibbard. Superintelligent Machines. Springer, 2002. Hof79. Douglas Hofstadter. Godel, Escher, Bach: An Eternal Golden Braid. Basic, 1979. Hof95. Douglas Hofstadter. Fluid Concepts and Creative Analogies. Basic Books, 1995. Hof96. Douglas Hofstadter. Metamagical Themas. Basic Books, 1996. Hop82. J J Hopfield. Neural networks and physical systems with emergent collective computational abil- ities. Proc. of the National Academy of Sciences, 79:2554—2558, 1982. HOTO6. G. E. Hinton, 5. Osindero, and Y. Teh. A fast learning algorithm for deep belief nets. Neural Computation, 18:1527-1554, 2006. HOUSE_OVERSIGHT_013262
References 347 Hut95. E. Hutchins. Cognition in the Wild. MIT Press, 1995. Hut 96. Edwin Hutchins. Cognition in the Wild. MIT Press, 1996. Hut05. Marcus Hutter. Universal Artificial Intelligence: Sequential Decisions based on Algorithmic Prob- ability. Springer, 2005. HZT+ 02. J. Han, S. Zeng, K. Tham, M. Badgero, and J. Weng. Dav: A humanoid robot platform for autonomous mental development,. Proc. 2nd International Conf. on Development and Learning, 2002. IP58. B. Inhelder and J. Piaget. The Growth of Logical Thinking from Childhood to Adolescence. Basic Books, 1958. JLO8. D. J. Jilk and C. Lebiere. and o’reilly. R. C. and Anderson, J. R. (2008). SAL: An explicitly pluralistic cognitive architecture. Journal of Experimental and Theoretical Artificial Intelligence, 20:197—218, 2008. JMo9. Daniel Jurafsky and James Martin. Speech and Language Processing. Pearson Prentice Hall, 2009. Joy00. Bill Joy. Why the future doesn’t need us, Wired. April 2000. Kam91. George Kampis. Self-Modifying Systems in Biology and Cognitive Science. Plenum Press, 1991. Kané64. Immanuel Kant. Groundwork of the Metaphysic of Morals. Harper and Row, 1964. Kap08. F. Kaplan. Neurorobotics: an experimental science of embodiment. Frontiers in Neuroscience, 2008. KEO6. J. L. Krichmar and G. M. Edelman. Principles underlying the construction of brain-based devices. In T. Kovacs and J. A. R. Marshall, editors, Adaptation in Artificial and Biological Systems, pages 37-42. 2006. KK90. K. Kitchener and P. King. Reflective judgement: ten years of research. In M. Commons. Praeger. New York, editor, Beyond Formal Operations: Models and Methods in the Study of Adolescent and Adult Thought, volume 2, pages 63-78. 1990. KLH83. Lawrence Kohlberg, Charles Levine, and Alexandra Hewer. Moral stages : a current formulation and a response to critics. Karger. Basel, 1983. Koh3s. Wolfgang Kohler. The Place of Value in a World of Facts. Liveright Press, New York, 1938. Koh81. Lawrence Kohlberg. Essays on Moral Development, volume I. The Philosophy of Moral Develop- ment, 1981. Ks04. Adam Kahane and Peter Senge. Solving Tough Problems: An Open Way of Talking, Listening, and Creating New Realities. Berrett-Koehler, 2004. Kur06. Ray Kurzweil. The Singularity is Near. 2006. Kurl2. Ray Kurzweil. How to Create a Mind. Viking, 2012. Kyb97. H. Kyburg. Bayesian and non-bayesian evidential updating. Artificial Intelligence, 31:271—293, 1997. Lan05. Pat Langley. An adaptive architecture for physical agents. Proc. of the 2005 IEEE/WIC/ACM Int. Conf. on Intelligent Agent Technology, 2005. LAon. C. Lebiere and J. R. Anderson. The case for a hybrid architecture of cognition. (in preparation). LBDE90. Y. LeCun, B. Boser, J. 5. Denker, and Al. Et. Handwritten digit recognition with a back- propagation network. Advances in Neural Information Processing Systems, 2, 1990. LDO3. A. Laud and G. Dejong. The influence of reward on the speed of reinforcement learning. Proc. of the 20th International Conf. on Machine Learning, 2003. Leg06a. Shane Legg. Friendly ai is bunk. Vetta Project, 2006. http: //commonsenseatheism.com/ wp-content/uploads/2011/02/Legg-Friendly—-Al-is-bunk. pdf. Leg06b. Shane Legg. Unprovability of friendly ai. Vetta Project, 2006. http: //www.vetta.org/2006/ 09/unprovability-of-friendly-ai/. LG90. Douglas Lenat and R. V. Guha. Building Large Knowledge-Based Systems: Representation and Inference in the Cyc Project. Addison-Wesley, 1990. LH07a. Shane Legg and Marcus Hutter. A collection of definitions of intelligence. IOS, 2007. LHO07b. 
Shane Legg and Marcus Hutter. A definition of machine intelligence. Minds and Machines, 17, 2007. LLW+05. Guang Li, Zhengguo Lou, Le Wang, Xu Li, and Walter J Freeman. Application of chaotic neural model based on olfactory system on pattern recognition. [CNC, 1:378-381, 2005. LMCO7a. M. H. Lee, Q. Meng, and F. Chao. Developmental learning for autonomous robots. Robotics and Autonomous Systems, 2007. LMCO7b. M. H. Lee, Q. Meng, and F. Chao. Staged competence learning in developmental robotics. Adaptive Behavior, 2007. HOUSE_OVERSIGHT_013263
348 A Glossary LNOO. George Lakoff and Rafael Nunez. Where Mathematics Comes From. Basic Books, 2000. Log07. Robert M. Logan. The Extended Mind. University of Toronto Press, 2007. Loo06. Moshe Looks. Competent Program Evolution. PhD Thesis, Computer Science Department, Wash- ington University, 2006. LRN87. John Laird, Paul Rosenbloom, and Alan Newell. Soar: An architecture for general intelligence. Artificial Intelligence, 33, 1987. LS05. J Lisman and N Spruston. Postsynaptic depolarization requirements for ltp and ltd: a critique of spike timing-dependent plasticity. Nature Neuroscience 8, 839-41, 2005. LWMLO9. John Laird, Robert Wray, Robert Marinier, and Pat Langley. Claims and challenges in evaluating human-level intelligent systems. Proc. of AGI-09, 2009. ac95. D. MacKenzie. The automation of proof: A historical and sociological exploration. [EEE Annals of the History of Computing, 17(3):7-29, 1995. arQl. H. Marchand. Reflections on PostFormal Thought. The Genetic Epistemologist, 2001. cK03. Bill McKibben. Enough: Staying Human in an Engineered Age. Saint Martins Griffin, 2003. et04. Thomas Metzinger. Being No One. Bradford, 2004. in88. Marvin Minsky. The Society of Mind. MIT Press, 1988. inO7. Marvin Minsky. The Emotion Machine. 2007. KO7. Joseph Modayil and Benjamin Kuipers. Autonomous development of a grounded object ontology by a learning robot. AAAI-07, 2007. KO08. Jonathan Mugan and Benjamin Kuipers. Towards the application of reinforcement learning to undirected developmental learning. International Conf. on Epigenetic Robotics, 2008. Ko9. Jonathan Mugan and Benjamin Kuipers. Autonomously learning an action hierarchy using a learned qualitative state representation. [JCAI-09, 2009. onl2. Maria Montessori. The Montessori Method. Frederick A. Stokes, 1912. SVT 08. G. Metta, G. Sandini, D. Vernon, L. Natale, and F. Nori. The icub humanoid robot: an open plat- form for research in embodied cognition. Performance Metrics for Intelligent Systems Workshop (PerMIS 2008), 2008. W07. Stephen Morgan and Christopher Winship. Counterfactuals and Causal Inference. Cambridge University Press, 2007. Nan08. Nanowerk. Carbon nanotube rubber could provide e-skin for robots. http: //www.nanowerk. com/news/newsid=6717.php, 2008. Nei98. Dianne Miller Neilsen. Teaching Young Children, Preschool-K: A Guide to Planning Your Cur- riculum, Teaching Through Learning Centers, and Just About Everything Else. Corwin Press, 1998. New90. Alan Newell. Unified Theories of Cognition. Harvard University press, 1990. Nie98. Dianne Miller Nielsen. Teaching Young Children, Preschool-K: A Guide to Planning Your Cur- riculum, Teaching Through Learning Centers, and Just About Everything Else. Corwin Press, 1998. Nilo9. Nils Nilsson. The physical symbol system hypothesis: Status and prospects. 50 Years of AI, Festschrift, LNAI 4850, 33, 2009. NK04. A. Nestor and B. Kokinov. Towards active vision in the dual cognitive architecture. International Journal on Information Theories and Applications, 11, 2004. OKO06. P. Oudeyer and F. Kaplan. Discovering communication. Connection Science, 2006. Omo08. Stephen Omohundro. The basic ai drives. Proceedings of the First AGI Conference. IOS Press, 2008. Omo09. Stephen Omohundro. Creating a cooperative future. 2009. http: //selfawaresystems.com/ 2009/02/23/talk-—on-creating-a-cooperative-future/. Opab52. A. 1. Oparin. The Origin of Life. Dover, 1952. Pal82. Gunter Palm. Neural Assemblies. An Alternative Approach to Artificial Intelligence. Springer, 1982. Pei34. C, Peirce. 
Collected papers: Volume V. Pragmatism and pragmaticism. Harvard University Press. Cambridge MA., 1934. Pel05. Martin Pelikan. Hierarchical Bayesian Optimization Algorithm: Toward a New Generation of Evolutionary Algorithms. Springer, 2005. Pen96. Roger Penrose. Shadows of the Mind. Oxford University Press, 1996. Per70. William G. Perry. Forms of Intellectual and Ethical Development in the College Years: A Scheme. Holt, Rinehart and Winston, 1970. HOUSE_OVERSIGHT_013264
References 349 Per81. William G. Perry. Cognitive and ethical growth: The making of meaning. In Arthur W. Chickering. Jossey-Bass. San Francisco, editor, The Modern American College, pages 76-116. 1981. PH12. Zhiping Pang and Weiping Han. Regulation of synaptic functions in central nervous system by endocrine hormones and the maintenance of energy homeostasis. Bioscience Reports, 2012. Pia53. Jean Piaget. The Origins of Intelligence in Children. Routledge and Kegan Paul, 1953. Pia55. Jean Piaget. The Construction of Reality in the Child. Routledge and Kegan Paul, 1955. Pir84. Robert Pirsig. Zen and the Art of Motorcycle Maintenance. Bantam, 1984. PNRO’7. Karalny Patterson, Peter J. Nestor, and Timothy T. Rogers. Where do you know what you know? the representation of semantic knowledge in the human brain. Nature Reviews Neuroscience, 8:976-987, 2007. PSFO9. Richard Dum Peter Strick and Julie Fiez. Cerebellum and nonmotor function. Annual Review of Neuroscience Vol. 32: 413-4384, 2009. PW78. D. Premack and G. Woodruff. Does the chimpanzee have a theory of mind? Behavioral and Brain Sciences, pages 515-526, 1978. QaGKKF05. R. Quian Quiroga, L. Reddy amd G. Kreiman, @. Koch, and I. Fried. Invariant visual represen- tation by single-neurons in the human brain. Nature, 435:1102-1107, 2005. QKKF08. R. Quian Quiroga, G Kreiman, C Koch, and I. Fried. Sparse but not "grandmother-cell” coding in the medial temporal lobe. Trends in Cognitive Sciences, 12:87-91, 2008. Rav04. Jan Ravenscroft. Folk psychology as a theory, stanford encyclopedia of philosophy. http:// plato.stanford.edu/entries/folkpsych-theory/, 2004. RBW92. Gagne R., L. Briggs, and W. Walter. Principles of Instructional Design. Harcourt Brace Jo- vanovich, 1992. RCKO1. J. Rosbe, R. S. Chong, and D. E. Kieras. Modeling with perceptual and memory constraints: An epic-soar model of a simplified enroute air traffic control task. SOAR Technology Inc. Report, 2001. RD0O6. Matthew Richardson and Pedro Domingos. Markov logic networks. Machine Learning, 2006. Rie73. K. Riegel. Dialectic operations: the final phase of cognitive development. Human Development, 16.:346-370, 1973. RM95. H. L. Roediger and K. B. McDermott. Creating false memories: Remembering words not presented in lists. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21:803-814, 1995. Ros88. Israel Rosenfield. The Invention of Memory: A New View of the Brain. Basic Books, 1988. Row90. John Rowan. Subpersonalities: The People Inside Us. Routledge Press, 1990. Rowl11. T Rowe. Fossil evidence on origin of the mammalian brain. Science 20, 2011. RVO1. Alan Robinson and Andrei Voronkov. Handbook of Automated Reasoning. MIT Press, 2001. RZDKO5. Michael Rosenstein, ZvikaMarx, Tom Dietterich, and Leslie Pack Kaelbling. Transfer learning with an ensemble of background tasks. NIPS workshop on inductive transfer, 2005. SA93. L. Shastri and V. Ajjanagadde. From simple associations to systematic reasoning: A connectionist encoding of rules, variables, and dynamic bindings using temporal synchrony. Behavioral & Brain Sciences, 16-3, 1993. Sal93. Stan Salthe. Development and Evolution. MIT Press, 1993. Sam 10. Alexei V. Samsonovich. Toward a unified catalog of implemented cognitive architectures. In BICA, pages 195-244, 2010. SB98. Richard Sutton and Andrew Barto. Reinforcement Learning. MIT Press, 1998. SBO6. J. Simsek and A. Barto. An intrinsic reward mechanism for efficient exploration. Proc. of the Twenty-Third International Conf. on Machine Learning, 2006. SBC05. S. 
Singh, A. Barto, and N. Chentanez. Intrinsically motivated reinforcement learning. Proc. of Neural Information Processing Systems 17, 2005. SC94. Barry Smith and Roberto Casati. Naive Physics: An Essay in Ontology. Philosophical Psychology, 1994. Sch9la. Juergen Schmidhuber. Curious model-building control systems.. Proc. International Joint Conf. on Neural Networks, 1991. Sch91b. Juergen Schmidhuber. A possibility for implementing curiosity and boredom in model-building neural controllers. Proc. of the International Conf. on Simulation of Adaptive Behavior: From Animals to Animats, 1991. Sch95. Juergen Schmidhuber. Reinforcement-driven information acquisition in non-deterministic envi- ronments. Proc. ICANN’95, 1995. Sch02. Juergen Schmidhuber. Exploring the predictable.. Springer, 2002. HOUSE_OVERSIGHT_013265
350 A Glossary Sch06. J. Schmidhuber. Godel machines: Fully Self-referential Optimal Universal Self-improvers. In B. Goertzel and C. Pennachin, editors, Artificial General Intelligence, pages 119-226. 2006. SchO7. Dale Schunk. Theories of Learning: An Educational Perspective. Prentice Hall, 2007. SEO7. Stuart Shapiro and Al. Et. Metacognition in sneps. AI Magazine, 28, 2007. SF05. Greenfield SA and Collins T F. A neuroscientific approach to consciousness. Prog Brain Res., 2005. Sha76. G. Shafer. A Mathematical Theory of Evidence. Princeton, NJ: Princeton University Press, 1976. Shu03. Thomas R. Shultz. Computational Developmental Psychology. MIT Press, 2003. SKBB91. D Shannahoff-Khalsa, M Boyle, and M Buebel. The effects of unilateral forced nostril breathing on cognition. Int J Neurosct., 1991. Slo01. Aaron Sloman. Varieties of affect and the cogaff architecture schema. In Proceedings of the Symposium on Emotion, Cognition, and Affective Computing, AISB-01, 2001. Slo08a. Aaron Sloman. A new approach to philosophy of mathematics: Design a young explorer able to discover toddler theorems’. 2008. $1o08b. Aaron Sloman. The Well-Designed Young Mathematician. Artificial Intelligence, December 2008. SMO05. Push Singh and Marvin Minsky. An architecture for cognitive diversity. In Darryl Davis, editor, Visions of Mind. 2005. Sot11. Kaj Sotala. 14 objections against ai/friendly ai/the singularity answered. Xuenay.net, 2011. http://www. xuenay.net/objections.html , downloaded 3/20/11. SS74. Jean Sauvy and Simonne Suavy. The Child’s Discovery of Space: From hopscotch to mazes — an introduction to intuitive topology. Penguin, 1974. SS03a. John F. Santore and Stuart C. Shapiro. Crystal cassie: Use of a 3-d gaming environment for a cognitive agent. In Papers of the [J/CAI 2003 Workshop on Cognitive Modeling of Agents and Multi-Agent Interactions, 2003. SS03b. Rudolf Steiner and S K Sagarin. What is Waldorf Education? Steiner Books, 2003. Stc00. Theodore Stcherbatsky. Buddhist Logic. Motilal Banarsidass Pub, 2000. SV99. A. J. Storkey and R. Valabregue. The basins of attraction of a new hopfield learning rule. Neural Networks, 12:869-876, 1999. SZ04, R. Sun and X. Zhang. Top-down versus bottom-up learning in cognitive skill acquisition. Cognitive Systems Research, 5, 2004. TC97. M. Tomasello and J. Call. Primate Cognition. Oxford University Press, 1997. TCO5. Endel Tulving and R. Craik. The Oxford Handbook of Memory. Oxford U. Press, 2005. Tead6. Sebastian Thrun and et al. The robot that won the darpa grand challenge. Journal of Robotic Systems, 23-9, 2006. TM95. S. Thrun and Tom Mitchell. Lifelong robot learning. Robotics and Autonomous Systems, 1995. TS94. E. Thelen and L. Smith. A Dynamic Systems Approach to the Development of Cognition and Action. MIT Press, 1994. TSO7. M. Taylor and P. Stone. Cross-domain transfer for reinforcement learning. Proc. of the 24th International Conf. on Machine Learning, 2007. Tur50. Alan Turing. Computing machinery and intelligence. Mind, 59, 1950. Tur77. Valentin F. Turchin. The Phenomenon of Science. Columbia University Press, 1977. TV96. Turchin and V. Supercompilation: Techniques and results. In Dines Bjorner, M. Broy, and Alek- sandr Vasilevich Zamulin, editors, Perspectives of System Informatics. Springer, 1996. Vin93. Vernor Vinge. The coming technological singularity. VISION-21 Symposium, NASA and Ohio Aerospace Institute, 1993. http://www-rohan.sdsu.edu/faculty/vinge/misc/ singularity.html. Vyg86. Lev Vygotsky. Thought and Language. MIT Press, 1986. WA1O. 
Wendell Wallach and Colin Atkins. Moral Machines. Oxford University Press, 2010. Wan95. P. Wang. Non-Aziomatic Reasoning System. PhD Thesis, Indiana University. Bloomington, 1995. Wan06. Pei Wang. Rigid Flexibility: The Logic of Intelligence. Springer, 2006. Was09. Mark Waser. Ethics for self-improving machines. In AGI-09, 2009. http: //vimeo.com/ 3698890. Wel90. H. Wellman. The Child’s Theory of Mind. MIT Press, 1990. WHO6. J. Weng and W. 5. Hwangi. From neural networks to the brain: Autonomous mental development. IEEE Computational Intelligence Magazine, 2006. Who64. Benjamin Lee Whorf. Language, Thought and Reality. 1964. HOUSE_OVERSIGHT_013266
References 351 WHZ* 00. J. Weng, W. 5. Hwang, Y. Zhang, C. Yang, and R. Smith. Developmental humanoids: Humanoids that develop skills automatically,. Proc. the first IEEE-RAS International Conf. on Humanoid Robots, 2000. Wik1l. Wikipedia. Open source governance. 2011. http://en.wikipedia.org/wiki/Open_ source_governance. Win72. Terry Winograd. Understanding Natural Language. Edinburgh University Press, 1972. Wit07. David C. Witherington. The Dynamic Systems Approach as Metatheory for Developmental Psy- chology, Human Development. 50, 2007. Wol02. Stephen Wolfram. A New Kind of Science. Wolfram Media, 2002. WwWwoe. Matt Williams and Jon Williamson. Combining argumentation and bayesian nets for breast cancer prognosis. Journal of Logic, Language and Information, 2006. Yud04. Eliezer Yudkowsky. Coherent extrapolated volition. Singularity Institute for Al, 2004. http: //singinst.org/upload/CEV.html. Yud06. Eliezer Yudkowsky. What is friendly ai? Singularity Institute for Af, 2006. http://singinst. org/ourresearch/publications/what-is-friendly-ai.html. Zad78. L. Zadeh. Fuzzy sets as a basis for a theory of possibility. Fuzzy Sets and Systems, 1:3-28, 1978. ZPKO?. Luke 5 Zettlemoyer, Hanna M. Pasula, and Leslie Pack Kaelbling. Logical particle filtering. Proceedings of the Dagstuhl Seminar on Probabilistic, Logical, and Relational Learning, 2007. HOUSE_OVERSIGHT_013267