Fundamentals of the approach to building universal intelligence. Part 1
From universal intelligence to strong AI. Prospects for creating a strong artificial intelligence
The field of artificial intelligence (AI) has produced many remarkable practical results in automating human activity across various domains, gradually changing the face of our civilization. However, its ultimate goal, the creation of truly intelligent machines (strong AI), has not yet been achieved. At the same time, few scientists seriously doubt that strong AI in one form or another can be created. Such objections as are voiced tend to be religious in character, appealing to the presence of a non-material soul in humans. But even these radical views attribute to the non-material only such conceptually difficult phenomena as free will, creativity, or feelings, without denying the possibility of endowing a machine with behavior almost indistinguishable from a human's.
Artificial intelligence as a field has gone through different periods. The initial period, often described as romantic, promised thinking machines within a couple of decades. Unmet expectations led to a more pragmatic attitude and oriented many researchers toward weak AI: non-universal intelligent systems capable of solving narrow classes of practical problems. This trend peaked with expert systems (ES), which promised not machine intelligence but effective commercial solutions to complex applied problems. Here, too, expectations were disappointed. Expert systems, although applied quite successfully in practice, did not become a breakthrough technology revolutionizing world business, and the investment flowing into the area decreased markedly [McCarthy, 2005]. In the USA, the “AI winter” set in.
AI research did not fade away, however. The many subfields that emerged from AI, such as computer vision, text analysis, and speech recognition, continued to bear fruit that was not sensational but increasingly significant. Business interest in weak AI systems revived. The extraordinary future importance of the field of AI for all of humanity was proclaimed once again [Nilsson, 2005a]. And again the idea was voiced that the field needed to “officially” restore its ultimate goal: the creation of truly intelligent machines [Brachman, 2005].
At the same time, in strictly academic circles scientists stopped naming dates for the possible creation of strong AI. Nevertheless, a number of prominent specialists in the field are once again naming horizons of several decades, sometimes even one [Hall, 2008]. And this time such expert expectations are supported by independent evidence. One piece of evidence is that, at least by some estimates, computing power comparable to the computing resources of the human brain will be achievable by the 2030s (by other estimates it is achievable already [Hall, 2008]). Given that the lack of computing power was, as is now clear, one of the objective reasons the early forecasts of the creation of AI proved unrealistic, the probable removal of this obstacle in the near future is encouraging.
But computing power is only a prerequisite for creating strong AI. Beyond it, many substantive problems in the theory of AI remain unresolved. Will they be solved in the coming decades? Some confidence here comes from forecasts related to the technological singularity (see, for example, [Kurzweil, 2005]). The concept of the singularity rests on the accelerating growth of complexity of technical (and, before them, biological) systems. At each stage of global evolution the complexity of systems grows exponentially (Moore's law is a particular example), and at each transition between stages the exponent steepens, i.e., the doubling time of complexity decreases (for example, the doubling time of the informational capacity of DNA is measured in hundreds of millions of years, while the doubling time of computing performance is measured in a couple of years).
Extrapolating the curve of growing complexity does not allow placing the onset of the singularity later than 2050 (and usually places it earlier), and the appearance of some superhuman mind should presumably be one of the subsequent stages of this growth. Of course, the possibility of reaching a true singularity can be disputed: the data on growing complexity are objective, but they can be extrapolated in different ways; still, the intervals to the next stages (metasystem transitions) should not suddenly begin to stretch out too long. This concept, then, also supports the possibility of creating strong AI within the coming decades, which makes the problem pressing, though it leaves open the question of how to approach its solution.
At the same time, leading experts note the impossibility of achieving strong AI within short-term projects [McCarthy, 2005], by creating highly specialized intelligent systems [Nilsson, 2005b], or even by steadily improving systems that solve isolated cognitive tasks such as learning or natural language understanding [Brachman, 2005]. The problem of creating strong AI must be posed and solved as such, even if no commercial results are expected for the first ten or more years.
In the academic environment, matters are so far limited to an entirely natural call for uniting the subfields of AI [Bobrow, 2005; Brachman, 2005; Cassimatis et al., 2006], each of which has by now acquired its own deep specificity. The progress achieved in each subfield gives hope that combining the results will allow building intelligent systems substantially more powerful than those constructed at the dawn of the computer age in the first attempts to create thinking machines. On the other hand, such unification should give much to the subfields themselves: the tasks solved within them are often considered AI-complete. Thus, it is hardly possible to create universal systems for pattern recognition, language understanding, or automatic theorem proving without creating strong AI.
The use and study of cognitive architectures, as a way of combining in a single system all the functions necessary for full-fledged intelligence (learning, knowledge representation, reasoning, and so on), has emerged as a new dominant paradigm in the field of AI as a whole [Brachman, 2005]. This paradigm is officially associated with the construction of human-level [Cassimatis et al., 2006; Cassimatis, 2006] or universal [Langley, 2006] artificial intelligence systems.
Such integration studies are necessary, but are they sufficient? The general idea that strong AI should be created as a single system incorporating some basic cognitive functionality is quite obvious and has been expressed for a very long time. Yet to this day there exists neither a minimal necessary list of cognitive functions nor, still less, a justified account of how they should be implemented.
Moreover, there exist not only many substantially different cognitive architectures [Jones and Wray, 2006], but also architectural paradigms alternative to the cognitive one [Langley, 2006]. At the same time, cognitive architectures concentrate mainly on issues of integration, on the interaction of individual functions. But can strong AI be obtained from weak cognitive components? In our opinion, the answer is an unequivocal no. Instead of (or at least in addition to) discussing the methodological issues of combining existing weak components, it is necessary to develop a theory of strong AI from which both the structure of the components of strong AI and the architecture needed to combine them would follow at once.
As rightly noted in [Cohen, 2005], “Poor performance and universal scope are preferred to good performance and narrow scope.” Given that, as indicated above, the creation of effective highly specialized systems brings us hardly any closer to strong AI, it is natural to ask: what do modern cognitive systems lack in terms of universality?
Universality as algorithmic completeness.
Historically, several fundamental directions have taken shape in the field of artificial intelligence, such as search and learning. These directions become clearly visible when intellectual tasks are posed in their most simplified, pure form. Thus, for game problems or theorem proving we can offer a universal solution: complete enumeration of options in the space of possible operations. Of course, with finite computing resources exhaustive search is impossible, but this does not eliminate search as a fundamental component of intelligence. When the search space is not known in advance, the task of learning arises (more precisely, of predicting how particular operations will affect the state of the world and of the agent itself). Here the universal solution is less obvious, but it too has been known almost since the origin of the AI field: R. Solomonoff's universal method of prediction [Solomonoff, 1964], based on algorithmic information theory. This method is likewise inapplicable in practice, since it requires an enormous enumeration of options (and, strictly speaking, requires solving the algorithmically unsolvable halting problem).
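For reference, the core of Solomonoff's method can be written down compactly (the notation is standard for algorithmic information theory and is not used elsewhere in this article). The algorithmic, or universal a priori, probability of a sequence x is

$$ M(x) \;=\; \sum_{p\,:\,U(p)\,=\,x*} 2^{-\ell(p)}, $$

where U is a universal prefix Turing machine, ℓ(p) is the length of program p, and the sum runs over all programs whose output begins with x. Prediction of a continuation y is then the conditional $M(y \mid x) = M(xy)/M(x)$. Both the sum over all programs and the question of whether a given program ever produces output make M incomputable; this is the halting-problem obstacle just mentioned.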
These ideal methods are reference points to be approached under conditions of limited computing resources; only the resource limit separates them from an implementable strong AI. For example, the whole field of heuristic programming and metaheuristic search arose from attempts to solve the search problem with limited resources. Likewise, the problems of machine learning, including, say, transfer learning and concept learning, appear because of limited resources. Yet researchers developing practical methods often do not look back at the ideal toward which one should strive. This leads to weak AI methods with an irreparable defect: they are not Turing-complete, that is, they work within a restricted space of algorithms and in principle cannot go beyond those restrictions. Although for different particular methods the covered domains in the space of algorithms may differ, even their union cannot yield an algorithmically complete space. For machine learning methods, this means the inability to identify an arbitrary regularity that may be present in the data, the inability to build a model of the world that was not provided in advance by the developer.
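The defect is easy to exhibit in miniature. In the sketch below (a toy of our own construction, not any established method), a learner confined to linear hypotheses cannot represent the regularity in the data at all, while enumeration of expressions over a few primitives, a space that becomes algorithmically complete in the limit of unbounded depth, recovers it:

```python
import itertools

# Data generated by a regularity outside the linear class: y = x^2 mod 7.
data = [(x, (x * x) % 7) for x in range(20)]

# 1. A "weak" learner: hypotheses y = a*x + b with small integer a, b.
def fit_linear(data):
    for a in range(-10, 11):
        for b in range(-10, 11):
            if all(a * x + b == y for x, y in data):
                return f"{a}*x + {b}"
    return None  # the regularity simply does not exist in this space

# 2. Depth-bounded enumeration of expressions over a few primitives;
# with unbounded depth this space would be algorithmically complete.
def exprs(depth):
    yield from ("x", "1", "2", "7")
    if depth == 0:
        return
    subs = list(exprs(depth - 1))
    for left, right in itertools.product(subs, repeat=2):
        for op in "+-*%":
            yield f"({left}{op}{right})"

def fit_universal(data, max_depth=2):
    for e in exprs(max_depth):
        try:
            if all(eval(e, {"x": x}) == y for x, y in data):
                return e
        except ZeroDivisionError:
            continue  # e.g. "(1%x)" at x = 0
    return None

print(fit_linear(data))     # None: no union of such restricted classes helps
print(fit_universal(data))  # an expression equivalent to ((x*x)%7)
```

No enlargement of the linear class by further linear-like classes would help here; only removing the restriction on the hypothesis space does.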
Herein lies the answer to why work in the field of cognitive architectures (as an approach to strong AI) is insufficient. It proceeds from the premise that modern methods of search in solution spaces, knowledge representation, and machine learning are adequate, and that all that is missing is their combination, within which the new quality of strong AI would arise. We believe, however, that the universality of intelligence consists precisely in the ability, in principle, to operate with any models from an algorithmically complete space (even though in practice this is, of course, never fully achieved). In this connection it is useful to distinguish the concepts of universal and strong AI. Although they may in fact denote the same thing, the notion of strong AI implicitly carries the desire to create systems outwardly resembling human intelligence, whereas universality is a formally specifiable property.
To secure this property, one can start with some idealized model of strong AI operating under conditions of infinite resources. Since truly autonomous artificial intelligence must be created as an embodied intelligent agent, it is necessary to develop an idealized model of such an agent that could hypothetically solve all the tasks a human can solve.
There have been attempts to create such models (the best known is AIXI [Hutter, 2005]), and we will discuss them later. For now we simply note that consideration of such models leads different researchers to the conclusion that the universality of intelligence is ensured by algorithmic completeness, and that this property should be maintained at least in the limit (see, for example, [Pankov, 2008]).
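For concreteness, it is worth writing out how AIXI chooses actions; the notation follows [Hutter, 2005] and is given here only for reference. At cycle k, with horizon m, the agent takes

$$ a_k \;:=\; \arg\max_{a_k} \sum_{o_k r_k} \cdots \max_{a_m} \sum_{o_m r_m} \big( r_k + \cdots + r_m \big) \sum_{q\,:\,U(q,\,a_1 \ldots a_m)\,=\,o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}, $$

where the $a_i$, $o_i$, $r_i$ are actions, observations, and rewards, ℓ(q) is the length of program q, and U is a universal Turing machine serving as the space of environment models. The rightmost sum is a Solomonoff-style mixture over all environments consistent with the interaction history; this is exactly what makes the model algorithmically complete, and at the same time incomputable.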
Thus, the first methodological principle is to preserve the absence of restrictions on the algorithmic completeness of the set of models (patterns, concepts, representations) that a universal AI system can derive or use.
Feasibility as a resource constraint.
Models of universal algorithmic intelligence can be a good starting point. But it is equally clear that resource constraints must be taken into account for these models to be implementable. Indeed, it is precisely this limitation that largely determines the specifics of our cognitive processes.
Indeed, judging by their “cognitive operations,” models of universal intelligence have almost nothing in common with real intelligence. Such models will not explicitly build a system of concepts, will not carry out planning, will not have attention, and so on. It is extremely difficult to say whether they would have anything like “understanding” or self-awareness. Here one can draw an (incomplete) analogy with a chess program that, thanks to unlimited resources, carries out complete search. Such a program is extremely simple. Its only fundamental operation is search. It contains no description of chess positions in any derived terms, nothing resembling understanding. Yet within chess it behaves perfectly. In the same way one can try to imagine an ideal embodied intelligence acting in the real world.
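To make the analogy concrete, here is a minimal sketch of such a player (our own, in Python) for a deliberately trivial game: take 1 or 2 counters, whoever takes the last one wins. The program plays perfectly, and its entire content is one recursive enumeration; nothing in it corresponds to concepts, plans, or understanding:

```python
def value(n):
    """Game value for the player to move with n counters left."""
    if n == 0:
        return -1  # the opponent just took the last counter: loss
    return max(-value(n - m) for m in (1, 2) if m <= n)

def best_move(n):
    """Pure enumeration: pick the move whose resulting position is worst
    for the opponent. No derived description of positions anywhere."""
    return max((m for m in (1, 2) if m <= n), key=lambda m: -value(n - m))

print(best_move(7))  # prints 1, leaving the opponent a losing position
```

For chess the same two functions would suffice in principle; only the intractable branching factor forces real programs (and real minds) to be organized differently.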
The absence of most cognitive functions in such an ideal intellect can mean one of two things. Either these functions are a consequence of limited resources (for some of them, such as attention, this is obviously so). Or intelligence is something entirely different from what is usually meant by the term (and what is usually meant is a means of solving problems, chief among them survival). Perhaps the second alternative is not so senseless (and not so contradictory to the first) if intelligence is taken to be not just any way of solving problems but some distinguished way, that is, if what matters in intelligence is not so much the functionality as how it is achieved, while with infinite computing resources reasonable behavior can be attained by far simpler means. Fortunately, thanks to the hypothetical nature of such a system, we need not debate whether to call it intelligent when it implements ideally adequate behavior by “brute computing force” rather than by “intelligence” (some structural complexity of the processes of “thinking”). The only thing that needs to be discussed is whether this system would really have all the capabilities natural intelligence has. If any doubt remains on this point, it will have to be overcome either by substantiating the feasibility of the corresponding capabilities or by refining the model.
The idea of limited resources as a fundamental property of strong AI, one that defines its architecture, has already been voiced [Wang, 2007]. But this idea alone is not sufficient to be guided by either, as will be discussed below. For now we simply note that taking limited resources into account must not violate the (algorithmic) universality of intelligence. Roughly speaking, real intelligence is an “anytime” method that tends toward ideal intelligence as computing resources grow without bound.
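A classical template for such an “anytime” organization is Levin-style universal search, which time-shares all candidate programs with budgets weighted by program length: longer programs get exponentially smaller shares, and doubling the total budget extends the reachable space. The sketch below is our own simplified rendering of the idea, not code from any cited work; the `programs` and `run` interfaces are hypothetical:

```python
def levin_search(programs, run, is_solution, max_phase=20):
    """Levin-style search: in the phase with budget 2**phase, a program
    of length L gets about 2**(phase - L) steps. `programs()` yields
    (code, length) pairs; `run(code, steps)` returns a result or None
    if the step budget ran out. Both interfaces are hypothetical."""
    for phase in range(max_phase):
        for code, length in programs():
            steps = 2 ** phase // 2 ** length
            if steps == 0:
                continue          # program too long for this phase
            result = run(code, steps)
            if result is not None and is_solution(result):
                return code, phase
    return None                   # budget exhausted: the anytime cutoff

# Toy demo: "program" k needs k steps to output k; we want output 6.
progs = lambda: ((k, k.bit_length()) for k in range(1, 64))
run = lambda k, steps: k if steps >= k else None
print(levin_search(progs, run, lambda r: r == 6))  # -> (6, 6)
```

Interrupted at any phase, the procedure has simply searched a smaller space; given unbounded phases, nothing expressible is excluded, which is exactly the limit property the text demands.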
Developers of models of universal algorithmic intelligence also agree on the need to introduce resource constraints (see, for example, [Schmidhuber, 2007], [Hutter, 2007]). Attempts to introduce resource constraints into these models can be regarded as the second step toward universal AI, although it is hard to judge how significant this step is: these models are often “too universal,” in the sense that their authors try to build into them only a minimal bias toward the world in which they are to function.
Thus, the second methodological principle is to build the architecture of real universal intelligence by introducing resource constraints into the model of ideal universal intelligence.
A priori information about the world as the main content of the phenomenon of intelligence.
Embodied intelligence is limited not only in the number of computational operations performed when solving problems of induction and deduction, but also in the number of actions performed in the physical world. The second type of limitation is fundamentally irreducible to the first, although the two are related: performing an action can eliminate the need to reason, and, conversely, thinking can reduce the number of trial actions in the physical world. It is precisely this type of constraint that is not taken into account in models of ideal algorithmic intelligence with limited computing resources.
On a global scale, the effectiveness of the actions taken is increased primarily through the accumulation of information about the external world. One can imagine a model of ideal intelligence with a minimum of a priori information. Such an intellect would be able to learn anything (including the efficient use of its own computing resources) and in the limit would be as effective as a specialized intelligence, but this would take far too much time. And, of course, such an intellect could not survive autonomously through its primary education.
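A toy calculation (entirely our own construction) shows the scale of the problem. Suppose an agent must find out which of N hypotheses about the world is true, and each test is a costly action in the physical world. A priori information here takes the form of a testing order: with none, the expected number of actions grows linearly with N, while a prior that merely narrows attention to a plausible subset cuts the cost by the corresponding factor:

```python
import random

# Illustrative numbers only; no claim about any concrete agent or world.
N = 10_000
truth = 4242                      # index of the true hypothesis

def actions_to_identify(order):
    """World-actions spent before the true hypothesis gets tested."""
    for t, h in enumerate(order, start=1):
        if h == truth:
            return t

# No a priori information: hypotheses tested in an arbitrary order.
uniform_order = random.sample(range(N), N)

# A priori information: the designer ranks a small "physically plausible"
# subset (here, an arbitrary congruence class containing the truth) first.
plausible = [h for h in range(N) if h % 7 == truth % 7]
rest = [h for h in range(N) if h % 7 != truth % 7]
informed_order = plausible + rest

print(actions_to_identify(uniform_order))   # about N/2 on average
print(actions_to_identify(informed_order))  # at most len(plausible)
```

When each test is an action on which survival may depend, the difference between the two orders is the difference between a viable agent and one that perishes during its primary education.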
Moreover, a priori information for real intelligence can take the most diverse forms, in particular the form of abilities such as imitation. An ideal intellect can indeed be expected to master imitation without possessing this ability in advance, but to do so it would first have to accumulate far too much information. If the ability is available immediately, it can significantly accelerate the optimization of the agent's own actions in the physical world. It is worth noting that models of robot learning by imitation are now widely studied (as are mirror neurons in neurophysiology). The problem, however, is that this mechanism (like every other additional a priori mechanism) must be made consistent with the universality of intelligence. Linguistic abilities, similarly, should to some extent be embedded a priori: not because universal intelligence could not in principle acquire them on its own, but because this acquisition might take too much time.
Explaining a number of cognitive abilities as a priori information about the external world (both physical and social) that accelerates the development of intelligence (which essentially comes down to the accumulation and processing of information) is natural enough. However, this explanation has not been used to determine the structure of universal intelligence. What interests us is the minimum amount of a priori information, and the form of its representation, that would allow a real AI to develop no more slowly than a human. The fundamental issue here is how a priori information is to be embedded in the structure of universal AI.
The importance of this point can be seen in the flexibility of the architecture of natural intelligence. For example, the human brain does not know in advance that linguistic information will be transmitted through speech. In the formation of proto-concepts, mechanisms related to conditioned reflexes are at work. If the ability to form true concepts is laid down a priori, it is nevertheless not tied to a sensory modality. This kind of universality must be preserved when a priori elements are introduced into the structure of an AI. In current models of concept learning, by contrast, not only is the separation into semantic and linguistic channels fixed a priori, but a binding to modality is made as well. A similar conclusion can be drawn for the modeling of other cognitive mechanisms that reflect a priori information. The most striking example here is expert systems, in which a priori information is embedded in a rigid form that leaves no room for universality.
On the other hand, it is precisely the volume of a priori information needed by real intelligence, and the variety of its forms (information about the most diverse aspects of the external world, as well as heuristics for the optimal use of one's own resources), that makes creating AI so difficult. In this sense, simple models of universal intelligence by themselves bring us little closer to its creation. Practically applied cognitive architectures could be more useful here, were it not that they require a complete redesign in any attempt to make them universal. Instead of adding the property of universality to existing systems originally composed of weak components, it will be more productive to start with a universal but impractical system and add to it, in a consistent way, the heuristics accumulated in the field of classical AI.
The substantive complexity of intelligence, its cognitive architecture, is what allows it to act in the existing world under limited resources and without excessively long training. But this means that the main complexity of our intellect is tied to its optimization for the surrounding world. The structure of such an intellect cannot be deduced theoretically within universal models of intelligence; it must be obtained empirically, either by the universal intellect itself or by its developers. Naturally, we still want to make this intelligence as universal as possible. More precisely, it can remain exactly as universal as the simplest models mentioned above; the difference will consist only in a shift of preferences, a bias toward our world.
Nor is any loss of universality admissible here, since our world is itself a “universal environment.” In this regard, it is entirely possible to begin building a real AI from universal “unbiased” models. Heuristics related to the features of our world can then be introduced into them gradually, starting with the most general, until the AI can act on its own (including optimizing itself) effectively enough.
Thus, the third methodological principle is to introduce a priori information into universal intelligence so as to reduce the amount of data the agent must acquire in ontogenesis for autonomous functioning in the real world, provided that subsequent universal induction and deduction remain consistent with this a priori information.
Part 2.
Literature.
(McCarthy, 2005) McCarthy J. The Future of AI - A Manifesto // AI Magazine. 2005. V. 26. No. 4. P. 39.
(Nilsson, 2005a) Nilsson N.J. Reconsiderations // AI Magazine. 2005. V. 26. No. 4. P. 36–38.
(Nilsson, 2005b) Nilsson N.J. Human-Level Artificial Intelligence? Be Serious! // AI Magazine. 2005. V. 26. No. 4. P. 68–75.
(Brachman, 2005) Brachman R. Getting Back to “The Very Idea” // AI Magazine. 2005. V. 26. No. 4. P. 48–50.
(Bobrow, 2005) Bobrow D.G. AAAI: It's Time for Large-Scale Systems // AI Magazine. 2005. V. 26. No. 4. P. 40–41.
(Cassimatis et al., 2006) Cassimatis N., Mueller E.T., Winston P.H. Achieving Human-Level Intelligence through Integrated Systems and Research // AI Magazine. 2006. V. 27. No. 2. P. 12–14.
(Langley, 2006) Langley P. Cognitive Architectures and General Intelligent Systems // AI Magazine. 2006. V. 27. No. 2. P. 33–44.
(Cassimatis, 2006) Cassimatis N.L. A Cognitive Substrate for Achieving Human-Level Intelligence // AI Magazine. 2006. V. 27. No. 2. P. 45–56.
(Jones and Wray, 2006) Jones R.M., Wray R.E. Comparative Analysis of Frameworks for Knowledge-Intensive Intelligent Agents // AI Magazine. 2006. V. 27. No. 2. P. 57–70.
(Cohen, 2005) Cohen P.R. If Not Turing's Test, Then What? // AI Magazine. 2005. V. 26. No. 4. P. 61–67.
(Hall, 2008) Hall J.S. Engineering Utopia // Frontiers in Artificial Intelligence and Applications (Proc. 1st AGI Conference). 2008. V. 171. P. 460–467.
(Pankov, 2008) Pankov S. A Computational Approximation to the AIXI Model // Frontiers in Artificial Intelligence and Applications (Proc. 1st AGI Conference). 2008. V. 171. P. 256–267.
(Duch et al., 2008) Duch W., Oentaryo R.J., Pasquier M. Cognitive Architectures: Where Do We Go from Here // Frontiers in Artificial Intelligence and Applications (Proc. 1st AGI Conference). 2008. V. 171. P. 122–136.
(Yudkowsky, 2011) Yudkowsky E. Complex Value Systems in Friendly AI // Proc. Artificial General Intelligence - 4th International Conference, AGI 2011, Mountain View, CA, USA, August 3–6, 2011. Lecture Notes in Computer Science 6830. Springer, 2011. P. 388–393.
(Kurzweil, 2005) Kurzweil R. The Singularity Is Near. Viking, 2005.
(Solomonoff, 1964) Solomonoff R.J. A Formal Theory of Inductive Inference: Parts 1 and 2 // Information and Control. 1964. V. 7. P. 1–22, 224–254.
(Schmidhuber, 2003) Schmidhuber J. The New AI: General & Sound & Relevant for Physics. Technical Report TR IDSIA-04-03, Version 1.0, cs.AI/0302012 v1, IDSIA. 2003.
(Hutter, 2001) Hutter M. Towards a Universal Theory of Artificial Intelligence Based on Algorithmic Probability and Sequential Decisions // Proc. 12th European Conf. on Machine Learning (ECML-2001). LNAI. V. 2167. Springer, Berlin. 2001.
(Hutter, 2005) Hutter M. Universal Artificial Intelligence: Sequential Decisions Based on Algorithmic Probability. Springer, 2005. 278 p.
(Wang, 2007) Wang P. The Logic of Intelligence // Artificial General Intelligence. Cognitive Technologies / B. Goertzel and C. Pennachin (Eds.). Springer, 2007. P. 31–62.
(Schmidhuber, 2007) Schmidhuber J. Gödel Machines: Fully Self-Referential Optimal Universal Self-Improvers // Artificial General Intelligence. Cognitive Technologies / B. Goertzel and C. Pennachin (Eds.). Springer, 2007. P. 199–226.
(Hutter, 2007) Hutter M. Universal Algorithmic Intelligence: A Mathematical Top→Down Approach // Artificial General Intelligence. Cognitive Technologies / B. Goertzel and C. Pennachin (Eds.). Springer, 2007. P. 227–290.
(Goertzel and Pennachin, 2007) Goertzel B., Pennachin C. The Novamente Artificial Intelligence Engine // Artificial General Intelligence. Cognitive Technologies / B. Goertzel and C. Pennachin (Eds.). Springer, 2007. P. 63–130.
(Garis, 2007) de Garis H. Artificial Brains // Artificial General Intelligence. Cognitive Technologies / B. Goertzel and C. Pennachin (Eds.). Springer, 2007. P. 159–174.
(Red'ko, 2007) Red'ko V.G. The Natural Way to Artificial Intelligence // Artificial General Intelligence. Cognitive Technologies / B. Goertzel and C. Pennachin (Eds.). Springer, 2007. P. 327–352.