Everything you need to know about AI in a few minutes

Greetings, Habr readers. This is a translation of the article "Everything you need to know about AI - in under 8 minutes." It is aimed at people who are not familiar with the field of AI and want a general picture of it, perhaps before going deeper into one of its subfields.

Knowing a little about everything is sometimes more useful than knowing a lot about one thing, at least for beginners trying to find their way around popular technical fields.

Many people think they are a little familiar with AI. But this area is so young and growing so fast that breakthroughs are made almost every day. There is so much to be discovered in this scientific area that specialists from other areas can quickly join AI research and achieve significant results.

This article is for them. My goal was to create a short reference that lets technically educated people quickly understand the terminology and tools used to develop AI. I hope it will be useful to most people who are interested in AI but are not experts in the field.

Introduction

Artificial Intelligence (AI), machine learning, and neural networks are terms used to describe powerful, machine learning-based technologies that can solve many real-world problems.

Machines are still far from matching the human brain at thinking, making decisions, and so on (not that humans are perfect at these either), but several important discoveries have recently been made in AI technologies and the associated algorithms. The growing number of large data sets available for training AI plays an important role.

The field of AI overlaps with many other areas, including mathematics, statistics, probability theory, physics, signal processing, machine learning, computer vision, psychology, linguistics, and neuroscience. Questions about social responsibility and the ethics of creating AI also draw in people working in philosophy.

The motivation for developing AI technologies is that tasks which depend on many variable factors require very complex solutions that are hard to understand and hard to turn into algorithms by hand.

Corporations, researchers, and ordinary people increasingly hope that machine learning will produce solutions to problems without requiring a person to spell out specific algorithms. A lot of attention is paid to the “black box” approach. Programming the algorithms used to model and solve problems involving large amounts of data takes developers a very long time. Even when we manage to write code that processes a large amount of varied data, it is often very cumbersome, difficult to maintain, and hard to test (because the tests themselves need a large amount of data).

Modern machine learning and AI technologies, together with correctly selected and prepared “training” data, can allow us to teach computers to “program” for us.

Overview

Intelligence is the ability to perceive information and retain it as knowledge for building adaptive behavior within an environment or context.

This definition of intelligence from (English-language) Wikipedia can be applied to both the organic brain and the machine. The presence of intelligence does not imply the presence of consciousness. The belief that it does is a common misconception popularized by science fiction writers.

Try searching the Internet for examples of AI and you will probably get at least one link to IBM Watson, a system that uses a machine learning algorithm and became famous after winning the quiz show “Jeopardy!” in 2011. Since then it has undergone some changes and has been used as a template for many different commercial applications. Apple, Amazon, and Google are actively working to bring similar systems into our homes and pockets.

Natural language processing and speech recognition were the first examples of commercial use of machine learning. They were followed by other automated recognition tasks (text, audio, images, video, faces, etc.). The range of applications is constantly growing and includes self-driving vehicles, medical diagnostics, computer games, search engines, spam filters, crime fighting, marketing, robot control, computer vision, transportation, music recognition, and much more.

AI is so tightly integrated into the technologies we use that many people do not even think of it as “AI”; that is, they do not distinguish it from conventional computer technology. Ask any passerby whether there is artificial intelligence in their smartphone, and they will probably answer “No”. But AI algorithms are everywhere, from predictive text input to camera autofocus. Many believe that AI is something that will arrive in the future. But it arrived some time ago and is already here.

The term "AI" is quite broad. Most research now focuses on the narrower field of neural networks and deep learning.

How our brain works

The human brain is a complex carbon-based computer that performs, by rough estimates, a billion billion operations per second (1,000 petaflops) while consuming 20 watts of power. The Chinese supercomputer Tianhe-2 (the fastest in the world at the time of writing) performs 33,860 trillion operations per second (33.86 petaflops) and consumes 17,600,000 watts (17.6 megawatts). We have some way to go before our silicon computers can compare with the carbon ones produced by evolution.
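To put those figures side by side, here is a minimal Python sketch that works out operations per watt for both machines. It uses only the rough estimates quoted above; the variable names are my own, and the result is only as reliable as those estimates.

```python
# Rough figures quoted in the text above (estimates, not measurements).
brain_ops_per_s = 1e18         # "a billion billion" operations per second
brain_watts = 20.0             # power drawn by the brain
tianhe2_ops_per_s = 33.86e15   # 33.86 petaflops
tianhe2_watts = 17.6e6         # 17.6 megawatts

# Energy efficiency: operations per second for each watt consumed.
brain_eff = brain_ops_per_s / brain_watts
tianhe2_eff = tianhe2_ops_per_s / tianhe2_watts
ratio = brain_eff / tianhe2_eff

print(f"brain:    {brain_eff:.2e} operations per second per watt")
print(f"Tianhe-2: {tianhe2_eff:.2e} operations per second per watt")
print(f"ratio:    {ratio:.2e}")
```

On these figures the brain comes out roughly 26 million times more energy-efficient than Tianhe-2.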

The exact mechanism our brain uses to “think” is a subject of debate and ongoing research (I personally like the theory that the brain's operation involves quantum effects, but that is a topic for a separate article). However, the workings of parts of the brain are usually modeled using the concepts of neurons and neural networks. The brain is estimated to contain approximately 100 billion neurons.

Neurons interact with each other through special channels that allow them to exchange information. The signals from individual neurons are weighted and combined before they activate other neurons. This processing of transmitted signals, their combination, and the activation of further neurons is repeated across different layers of the brain. Given that there are 100 billion neurons in our brain, the totality of weighted combinations of these signals is rather complicated. And that's putting it mildly.

But it does not end there. Each neuron applies a function, or transformation, to the weighted input signals before checking whether its activation threshold has been reached. This transformation of the input signal can be linear or non-linear.

Initially, input signals come from a variety of sources: our senses, the body's internal monitoring of its own functioning (blood oxygen level, stomach contents, etc.), and others. A single neuron can receive hundreds of thousands of input signals before deciding how to respond.

Thinking (or information processing) and the resulting instructions sent to our muscles and other organs are the product of input signals being transformed and passed between neurons in different layers of the neural network. But the neural networks in the brain can also change and update themselves, including changing how the signals passed between neurons are weighted. This happens through learning and experience.

This model of the human brain was used as a template for reproducing the brain's capabilities in a computer simulation: an artificial neural network.
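The weighting, combining, transforming, and threshold check described above can be sketched in a few lines of Python. This is a toy illustration, not a model of a real biological neuron: the function name neuron_fires, the weights, the thresholds, and the choice of tanh as the non-linear transformation are all assumptions made for the example.

```python
import math

def neuron_fires(inputs, weights, threshold, transform=math.tanh):
    """Weight and combine the inputs, transform the sum, then check the threshold."""
    combined = sum(x * w for x, w in zip(inputs, weights))
    return transform(combined) >= threshold

# Illustrative values only: three incoming signals and their weights.
signals = [1.0, 0.5, -1.0]
weights = [0.8, 0.4, 0.3]

# Non-linear transformation (tanh) with two different thresholds.
fires_low = neuron_fires(signals, weights, threshold=0.5)
fires_high = neuron_fires(signals, weights, threshold=0.65)

# Linear transformation (identity) with the higher threshold.
fires_linear = neuron_fires(signals, weights, threshold=0.65,
                            transform=lambda s: s)

print(fires_low, fires_high, fires_linear)
```

The same weighted sum can lead to a different decision depending on the transformation applied to it, which is why the choice of function matters.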

Artificial Neural Networks (ANN)

Artificial Neural Networks are mathematical models created by analogy with biological neural networks. ANNs are capable of modeling and processing non-linear relationships between input and output signals. Adaptive weighting of signals between artificial neurons is achieved through a learning algorithm that reads the observed data and tries to improve the output.

Various optimization techniques are used to improve the performance of an ANN. Optimization is considered successful if the ANN can solve the given task within the established time limit (time frames, of course, vary from task to task).

An ANN is modeled using several layers of neurons. The structure of these layers is called the model architecture. Neurons are separate computational units that can receive input data and apply some mathematical function to it to determine whether to pass the data on.
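As one hedged illustration of what such a learning algorithm can look like, the sketch below trains a single artificial neuron with the classic perceptron learning rule. That rule is my choice for the example (the article does not name a specific algorithm), and the function names, learning rate, and the logical AND data set are likewise assumptions.

```python
def predict(weights, bias, inputs):
    """Fire (1) if the weighted sum plus bias is positive, otherwise stay silent (0)."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0

def train_neuron(samples, epochs=20, rate=0.1):
    """Perceptron rule: nudge each weight in proportion to the prediction error."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
            bias += rate * error
    return weights, bias

# Observed data: the logical AND of two inputs.
and_gate = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_neuron(and_gate)

for inputs, target in and_gate:
    print(inputs, "->", predict(weights, bias, inputs), "expected", target)
```

Nobody writes the rule "output 1 only when both inputs are 1" into the program; the weights drift toward it as the errors are corrected.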

In a simple three-layer model, the first layer is the input layer, followed by a hidden layer, followed by the output layer. Each layer contains at least one neuron.
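A three-layer model of this kind can be sketched as follows. This is a minimal illustration under stated assumptions: the layer sizes (2 inputs, 3 hidden neurons, 1 output), the made-up weights and biases, the sigmoid function, and the names layer_forward and forward are all mine, not taken from the article.

```python
import math

def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def layer_forward(inputs, weights, biases):
    """Compute one layer's outputs; each row of `weights` belongs to one neuron."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def forward(inputs, hidden_w, hidden_b, output_w, output_b):
    """Input layer -> hidden layer -> output layer."""
    hidden = layer_forward(inputs, hidden_w, hidden_b)
    return layer_forward(hidden, output_w, output_b)

# Made-up parameters: 2 inputs, 3 hidden neurons, 1 output neuron.
hidden_w = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]]
hidden_b = [0.0, 0.1, -0.1]
output_w = [[0.7, -0.5, 0.2]]
output_b = [0.05]

result = forward([1.0, 0.0], hidden_w, hidden_b, output_w, output_b)
print(result)
```

Adding layers or neurons only means adding rows and matrices of weights; the forward pass itself stays the same.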

As the model structure grows more complex through additional layers and neurons, the potential of an ANN to solve tasks increases. However, if a model is too “big” for a given task, it may be impossible to optimize it to the desired level: it fits the training data too closely and generalizes poorly to new data. This is called overfitting.

The architecture, tuning, and choice of data processing algorithms are the main components of building an ANN. Together they determine the performance and efficiency of the model.

Models are often characterized by their so-called activation function. It is used to convert a neuron's weighted input into its output (if the neuron decides to pass data on, this is called its activation). Many different transformations can be used as activation functions.

ANNs are powerful tools for solving problems. However, although the mathematical model of a small number of neurons is quite simple, a neural network model becomes rather hard to follow as the number of its constituent parts grows. Because of this, the use of ANNs is sometimes called the “black box” approach. The choice of an ANN for solving a problem should be considered carefully, since in many cases the resulting solution cannot be taken apart to analyze why it came out the way it did.
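For illustration, here are four commonly used activation functions written in plain Python. The article does not say which ones it has in mind, so this selection (step, sigmoid, tanh, ReLU) and the sample input are assumptions for the example.

```python
import math

def step(x, threshold=0.0):
    """Binary activation: fire fully or not at all."""
    return 1.0 if x >= threshold else 0.0

def sigmoid(x):
    """Smooth activation with outputs between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))

def relu(x):
    """Rectified linear unit: pass positive values through, block negative ones."""
    return max(0.0, x)

activations = {"step": step, "sigmoid": sigmoid, "tanh": math.tanh, "relu": relu}

# Apply each function to the same weighted input to compare the results.
weighted_input = -0.5
outputs = {name: fn(weighted_input) for name, fn in activations.items()}

for name, value in outputs.items():
    print(f"{name:8s} {value:.4f}")
```

Each function turns the same weighted input into a different output, which changes what the next layer of neurons receives.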

Deep learning

The term deep learning is used to describe neural networks, and the algorithms used in them, that accept raw data (from which some useful information needs to be extracted). This data is processed by passing it through the layers of the neural network to obtain the desired output.

Unsupervised learning is an area in which deep learning techniques perform well. A properly configured ANN can automatically identify the main features of the input data (whether text, images, or other data) and produce a useful result from processing it. Without deep learning, the search for important information often falls on the shoulders of the programmer developing the data processing system. A deep learning model can find, on its own, a way of processing the data that extracts useful information from it. Once the system is trained (that is, once it has found such a way to extract useful information from the input data), the computing power, memory, and energy needed to keep the model running are reduced.

Simply put, learning algorithms make it possible to use specially prepared data to “train” a program to perform a specific task.

Deep learning is used to solve a wide range of tasks and is considered one of the innovative AI technologies. There are also other types of learning, such as supervised learning and semi-supervised learning, which differ in that they introduce additional human oversight of the intermediate results of training the neural network (to help steer the system in the right direction).

Shadow learning is a term used to describe a simplified form of deep learning in which the key features of the data are first selected by a person, using knowledge specific to the domain the data comes from, before being fed into the system. Such models are more “transparent” (in the sense of how results are obtained) and perform better, at the cost of more time invested in designing the system.
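To give a feel for what "unsupervised" means, here is a deliberately simple sketch. It is not deep learning: it uses one-dimensional k-means clustering, a much simpler technique that I have swapped in only because it fits in a few lines. The function name kmeans_1d, the sample readings, and the starting centers are all assumptions for the example. What it shares with the description above is that the data carries no labels, and the grouping is found from the data alone.

```python
def kmeans_1d(values, centers, iterations=10):
    """Group unlabeled numbers around the nearest center, then re-center each group."""
    groups = []
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            # Assign each value to the index of its closest center.
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

# Unlabeled readings: nobody tells the program there are "low" and "high" values.
readings = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]
centers, groups = kmeans_1d(readings, centers=[0.0, 5.0])

print(centers)
print(groups)
```

The program ends up with one group of small readings and one group of large readings without ever being told that such a split exists.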

Conclusion

AI is a powerful data processing tool and can find solutions to complex problems faster than traditional algorithms written by programmers. ANNs and deep learning techniques can help solve a wide variety of problems. The downside is that the most optimized models often work as “black boxes”, making it impossible to examine why they chose a particular solution. This can lead to ethical issues related to the transparency of information.
