The Bitter Lesson

Original author: Rich Sutton
About the author: Richard Sutton is a professor of computer science at the University of Alberta. He is considered one of the founders of modern computational reinforcement learning.

The biggest lesson from 70 years of AI research is that general methods that leverage computation are ultimately the most effective, and by a wide margin. The underlying reason is Moore's law, or rather its generalization: the exponentially falling cost of computation.

Most AI research has been conducted as if the computation available to the agent were constant. In that case, leveraging human knowledge is one of the only ways to improve performance. But a typical research project is short-lived, and over only a few years, vastly more computing power inevitably becomes available.

Seeking short-term improvement, researchers try to build in human knowledge of the domain, but in the long run only the leveraging of computation matters. These two trends need not contradict each other, but in practice they do. Time spent on one is time lost to the other. There are psychological commitments to investing in one approach or the other. And building in domain knowledge tends to complicate a system in ways that make it less suited to general computational methods. There have been many examples of researchers learning this bitter lesson too late, and it is instructive to review some of the most prominent.

In computer chess, the system that defeated world champion Kasparov in 1997 was based on massive, deep search. At the time, most computer-chess researchers looked on these methods with dismay, because their own approaches exploited human understanding of the special structure of chess. When a simpler, search-based approach with specialized hardware and software proved vastly more effective, these researchers refused to concede defeat. They said brute-force search may have won this once, but it was not a general strategy, and in any case it was not how people played chess. These researchers wanted methods based on human understanding of the game to win, and they were disappointed.

A similar pattern was seen in research on the game of Go, only delayed by a further 20 years. Enormous initial effort went into avoiding search by exploiting human knowledge of the game's special features, but all of it proved irrelevant once deep search with massive parallel computation was applied effectively. Also important was learning a value function through self-play, as in many other games and even in chess, although this played little role in the 1997 program that first defeated a world champion. Learning by self-play, and learning in general, are like search in that they enable massive parallel computation to be applied. Search and learning are the two most important classes of techniques for harnessing massive amounts of computation in AI research. As in computer chess, researchers' initial efforts in Go were directed toward exploiting human understanding so that less search was needed, and only much later did far greater success come from embracing search and learning.
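To make the relationship between search and a learned value function concrete, here is a minimal sketch, not the algorithm of any particular chess or Go program: a depth-limited minimax search that consults a learned evaluator at the frontier instead of hand-coded game knowledge. The `legal_moves`, `apply_move`, and `learned_value` callables are hypothetical stand-ins for a real game implementation and a trained value function.

```python
# A minimal sketch (not any specific engine's method): depth-limited
# minimax search that defers to a learned value estimate at the leaves.
# legal_moves, apply_move, and learned_value are hypothetical stand-ins.

def minimax(state, depth, maximizing, legal_moves, apply_move, learned_value):
    moves = legal_moves(state)
    if depth == 0 or not moves:
        # At the search frontier, consult the learned value function
        # rather than hand-crafted domain knowledge.
        return learned_value(state)
    child_values = (
        minimax(apply_move(state, m), depth - 1, not maximizing,
                legal_moves, apply_move, learned_value)
        for m in moves
    )
    return max(child_values) if maximizing else min(child_values)
```

Deeper search buys accuracy with computation, and a better-trained `learned_value` buys accuracy with learning; both axes scale with available compute, which is the point of the lesson.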

In the 1970s, DARPA sponsored a speech-recognition competition. Entrants proposed many special methods that exploited domain knowledge: knowledge of words, phonemes, the human vocal tract, and so on. On the other side were newer methods that were more statistical in nature and did much more computation, based on hidden Markov models (HMMs). Once again, the statistical methods triumphed over the knowledge-based ones. This led to a major change in all of natural language processing, where statistics and computation gradually came to dominate over the decades. The recent rise of deep learning in speech recognition is the latest step in this direction. Deep-learning methods rely even less on human knowledge and use even more computation, together with learning on huge training sets, and they have dramatically improved speech-recognition systems. As in games, researchers always tried to build systems that worked the way they thought their own minds worked: they tried to put their domain knowledge into their systems. But ultimately this proved counterproductive and a colossal waste of time once Moore's law made massive computation available and tools were developed to put it to effective use.
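As an illustration of the statistical approach described above, here is a minimal sketch of the forward algorithm, the core computation for scoring an observation sequence under an HMM. The toy model parameters below are invented for illustration and do not come from any real recognizer.

```python
# A minimal sketch of the HMM forward algorithm: computes the probability
# of an observation sequence by summing over all hidden-state paths.
# The toy matrices below are illustrative, not from any real system.

def forward(obs, pi, A, B):
    """obs: observation indices; pi: initial state probabilities;
    A[i][j]: transition prob i -> j; B[i][k]: prob of emitting symbol k in state i."""
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    for t in range(1, len(obs)):
        alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                 for j in range(n)]
    return sum(alpha)

# Toy two-state model with two observation symbols.
pi = [0.6, 0.4]
A  = [[0.7, 0.3], [0.4, 0.6]]
B  = [[0.9, 0.1], [0.2, 0.8]]
print(forward([0, 1, 0], pi, A, B))  # likelihood of the sequence
```

Nothing in this computation encodes words, phonemes, or the vocal tract; the knowledge lives entirely in parameters that can be estimated from data, which is why the approach scaled with computation.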

In computer vision the picture is similar. Early methods treated vision as a search for edges, generalized cylinders, or SIFT features. But today all of this has been discarded. Modern deep-learning neural networks use only the notions of convolution and certain kinds of invariance, and they perform far better.
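For concreteness, the convolution operation mentioned above can be sketched in a few lines. This is a toy, hand-rolled version; in a real network the kernel weights are learned from data, and the 3x3 kernel below is an arbitrary example rather than a hand-designed feature detector.

```python
# A minimal sketch of the 2D convolution at the core of modern vision
# networks: one pass of a small filter over an image, with no
# hand-designed edge detectors or SIFT-style features built in.

def conv2d(image, kernel):
    kh, kw = len(kernel), len(kernel[0])
    h, w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(w)]
            for i in range(h)]

# Toy 5x5 image and an arbitrary 3x3 kernel; in a trained network
# these weights would be learned, not chosen by hand.
image  = [[float((i + j) % 4) for j in range(5)] for i in range(5)]
kernel = [[0.0, 1.0, 0.0], [1.0, -4.0, 1.0], [0.0, 1.0, 0.0]]
print(conv2d(image, kernel))
```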

This is a big lesson. As a field, we still have not fully absorbed it, and we keep making the same kinds of mistakes. To resist them effectively, we need to understand what makes these mistakes attractive. We have to learn the bitter lesson that building in how we think we think does not work in the long run. The bitter lesson rests on several historical observations:

  1. AI researchers have often tried to build their knowledge into their agents.
  2. This always helps in the short term and is personally satisfying to the researcher, but
  3. in the long run it hits a ceiling and even slows further progress, and
  4. breakthrough progress eventually arrives by the opposing approach, based on scaling computation through search and learning.

The eventual success is tinged with bitterness and often incompletely digested, because it is a victory over a favored, human-centric approach.

One thing that should be learned from this bitter experience is the great power of general methods that continue to scale with increasing computation, even when the computation required becomes enormous. The two methods that seem to scale arbitrarily in this way are search and learning.

The second general point to be learned from the bitter lesson is that the actual contents of minds are tremendously, irredeemably complex. We should stop trying to find simple ways to represent the contents of minds, such as simple models of space, objects, or multiple agents. All of these are part of an arbitrarily, intrinsically complex outside world. They cannot be modeled directly, because their complexity is endless. Instead, we should build meta-methods that can find and capture this arbitrary complexity. What matters is that these methods can find good approximations, but the search for them should be carried out by the methods themselves, not by us. We want AI agents that can discover as we do, not ones that merely contain what we have already discovered. Building human knowledge into an AI system only makes it harder for the system to learn.
