What do neural networks really hide?

    A few days ago, an article appeared on the hub: What do neural networks hide? . It is a loose retelling of the English article The Flaw Lurking In Every Deep Neural Net , which in turn discusses a specific study of certain properties of neural networks ( Intriguing properties of neural networks ).

    In the article describing the study, the authors took a somewhat sensational approach to presenting the material and wrote a text in the spirit of "a serious problem has been found in neural networks" and "we cannot trust neural networks in security-related problems". Many of my acquaintances shared a link to the Habr post; several discussions on the topic started on Facebook at once. At the same time, I got the impression that over the two retellings some of the information from the original study was lost, and that many questions arose about neural networks that the original text did not consider. It seems to me there is a need to describe in more detail what was actually done in the study, and at the same time to try to answer those questions. The Facebook format is not suited to texts this long at all, hence this post.

    Original article content

    The original article, entitled "Intriguing properties of neural networks," was written by a group of seven researchers, three of whom work in Google's neural network research department. The article discusses two non-obvious properties of neural networks:

    • It is commonly believed that if, for a particular neuron in a deep layer of a neural network, we select input images that activate that neuron, then the selected images will share some common semantic attribute. The researchers showed that the same statement holds if we consider not the activation of a single neuron, but a linear combination of the outputs of several neurons.
    • For every element of the network's training set, one can construct a visually very similar example that will be classified incorrectly; the researchers call these blind spots.

    Let's try to understand in more detail what these two properties mean.

    The value of specific neurons

    Let's deal with the first property quickly.

    There is an assumption, popular among neural network enthusiasts, that the network internally decomposes the source data into separate, understandable properties, and that in the deep layers of the network each neuron is responsible for some specific property of the original object.

    This statement is usually checked by visual inspection:

    1. A neuron in a trained network is selected.
    2. Images from a test sample that activate this neuron are selected.
    3. The selected images are inspected by a person, who concludes that all of them share some common property.
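The inspection procedure above can be sketched on toy data. Everything here is made up for illustration: the "deep layer" is just a random linear map and the "images" are random vectors; the point is only the mechanics of ranking images by a single unit's activation versus by a linear combination of units, as in the paper's first property.

```python
import random

random.seed(0)

# Toy stand-in for a deep layer: a random linear map from 64-"pixel"
# images to 16 hidden activations (hypothetical, for illustration only).
n_pix, n_hidden, n_imgs = 64, 16, 100
W = [[random.gauss(0, 1) for _ in range(n_pix)] for _ in range(n_hidden)]
images = [[random.gauss(0, 1) for _ in range(n_pix)] for _ in range(n_imgs)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Activation of every hidden unit for every image.
acts = [[dot(w, img) for w in W] for img in images]

# 1) The five images that most strongly activate one single unit (unit 0).
top_single = sorted(range(n_imgs), key=lambda i: acts[i][0])[-5:]

# 2) The five images that most strongly activate a random *linear
#    combination* of units -- the direction the paper inspects instead.
v = [random.gauss(0, 1) for _ in range(n_hidden)]
top_combo = sorted(range(n_imgs), key=lambda i: dot(acts[i], v))[-5:]

print(len(top_single), len(top_combo))
```

A human inspector would then look for a common semantic property in each of the two five-image selections; the paper's observation is that the second selection looks no less "semantic" than the first.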

    What the researchers did in the article: instead of looking at individual neurons, they considered linear combinations of neurons, searched for images that activate a given combination, and then looked for common semantic properties among them. The authors succeeded, and from this they conclude that knowledge about the subject area is stored not in specific neurons but in the overall configuration of the network.

    Generally speaking, I don’t really want to seriously discuss this part of the article, because it relates more to the field of religion than to science.

    The initial claim that specific neurons are responsible for specific features comes from the very abstract reasoning that a neural network should resemble the human brain in how it works. Where the assertion that the extracted features should be understandable to a person comes from, I could not find anywhere at all. Checking this statement is a very strange exercise, because it is easy to find a common feature in a small selection of arbitrary images, given only the desire. And a statistically significant check on a large volume of images is impossible, since the process cannot be automated. The result is therefore unsurprising: if you can find common features in one set of images, you can do the same in any other.

    An example of images with the same property from the original article.

    The general conclusion in this part of the article does look sound: knowledge about the subject area really is contained in the entire architecture of the neural network and the parameters of all its neurons, rather than in each specific neuron individually.

    Network blind spots

    The researchers conducted the following experiment: they set out to find objects that are classified incorrectly by the network and lie as close as possible to objects of the training set. To search for them, the authors developed a special optimization algorithm that moved away from the original image in the direction that degraded the network's responses, until the classification of the object broke.
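The original study used a box-constrained L-BFGS search over deep networks; a much cruder sketch of the same idea, on a hypothetical toy logistic classifier (all numbers made up), simply walks the input against the gradient of the correct response until the predicted label flips:

```python
import math
import random

random.seed(1)

# A toy "trained network": logistic regression on 10 features.
# The weights are random stand-ins; the paper worked with deep nets
# and a box-constrained L-BFGS search, not this plain gradient walk.
w = [random.gauss(0, 1) for _ in range(10)]

def prob(x):
    """P(class = 1 | x) for the toy model."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    return 1 / (1 + math.exp(-z))

# Start from a correctly classified "image" of class 1.
x = [0.3 * wi for wi in w]        # guaranteed on the positive side
assert prob(x) > 0.5

# Walk in the direction that degrades the correct response
# until the classification breaks -- the "blind spot" search.
adv = list(x)
steps = 0
while prob(adv) > 0.5:
    p = prob(adv)
    # The gradient of P w.r.t. the input is p * (1 - p) * w; step against it.
    adv = [ai - 0.05 * p * (1 - p) * wi for ai, wi in zip(adv, w)]
    steps += 1

dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(adv, x)))
print(steps, round(dist, 3))
```

The interesting empirical finding is how small `dist` can be for real networks on real images while still flipping the label.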

    The experiment resulted in the following:

    • For any object of the training set, there is always a picture that a person cannot distinguish from the original by eye, but on which the neural network breaks.
    • Pictures with such introduced defects are poorly recognized by the network even when its architecture is changed or when it is trained on a different subset of the training set.

    It is mainly these blind spots that everyone is talking about, so let's try to answer the questions that arise around them. But first, let's look at a few standard objections from people reading a description of the study:

    • “The study uses very simple neural networks, no one uses them anymore” - no, the study used 6 different types of networks, from simple single-layer ones to large deep networks, all of them taken from well-known works of the last 2-3 years. Not all experiments were performed on all types of networks, but all of the main conclusions in the article are independent of network type.
    • “The researchers use a neural network that receives raster images as input rather than features extracted from them - this is inherently inefficient” - the article actually never explicitly says what exactly is fed to the networks' input. At the same time, their networks show good quality on large image datasets, so it is hard to accuse the original setup of inefficiency.
    • “The researchers took a badly overfitted network - naturally they got poor results outside the training set” - no, the results they report show that the trained networks were not overfitted. In particular, the article includes results of the networks on the original sample with random noise added, with no drop in quality. An overfitted system would not behave this way.
    • “The distortions added to the images are very special and cannot occur in real life” - not quite. On the one hand, these distortions are not random; on the other, they change the picture very slightly, on average an order of magnitude less than random noise that is invisible to the human eye (the article gives the corresponding numbers). So I would not claim that such distortions cannot occur in reality: the probability is small, but such possibilities cannot be ruled out.

    What is the real news here?

    The fact that a neural network may have blind spots near the objects of the training set is not really big news. The point is that no one ever promised local accuracy for neural networks.

    There are classification methods (for example, Support Vector Machines) that, by construction, maximize the separation between the objects of the training set and the boundaries between classes. Neural networks have no requirement of this kind; moreover, due to their complexity, the final partition of the input space usually does not lend itself to meaningful interpretation and study. So the existence of areas of local instability in networks is not news, but confirmation of a fact that was already well known.
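For contrast, the margin that an SVM maximizes is easy to state concretely: the smallest distance from a training point to the separating boundary. A minimal sketch on made-up 2-D data (the separating line here is hand-picked for illustration, not SVM-trained; an SVM would choose the line that makes this minimum as large as possible):

```python
import math

# Toy 2-D training set, linearly separable (values made up).
pos = [(2.0, 2.0), (3.0, 1.5), (2.5, 3.0)]
neg = [(-2.0, -1.0), (-1.5, -2.5), (-3.0, -2.0)]

# A separating line w.x + b = 0 (hand-picked; an SVM would pick
# w, b so that the margin computed below is maximal).
w, b = (1.0, 1.0), 0.0

def dist(p):
    """Distance from point p to the line w.x + b = 0."""
    return abs(w[0] * p[0] + w[1] * p[1] + b) / math.hypot(*w)

# The margin: the distance from the closest training point to the line.
margin = min(dist(p) for p in pos + neg)
print(round(margin, 3))  # -> 2.121
```

Nothing in standard neural network training optimizes a quantity like `margin`, which is exactly why points arbitrarily close to the decision boundary - blind spots - are unsurprising.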

    What really is news here is that the distortions that lead to errors retain their properties when switching to a different network architecture and when changing the training sample. This is indeed a very unexpected discovery, and I hope the authors will find an explanation for it in future work.

    Are neural networks really a dead end?

    No, neural networks are not a dead end. They are a very powerful and convenient tool that solves a certain set of very specific tasks.

    The popularity of neural networks is based on two ideas:

    • Rosenblatt's perceptron convergence theorem - for any training set, one can choose the architecture and weights of a neural network with one hidden layer such that the training set is classified with 100% accuracy.
    • Almost all processes in the training of a neural network (recently including the selection of architecture) are fully automated.
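Strictly speaking, the convergence theorem concerns the linearly separable case: the perceptron learning rule - on every mistake, nudge the weights toward the misclassified example - is guaranteed to stop making mistakes after finitely many updates. A minimal demonstration on made-up separable data:

```python
import random

random.seed(2)

# A linearly separable toy set: label = sign of x1 + x2, with a thin
# band around the boundary removed (data made up for illustration).
pts = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(80)]
data = [((x1, x2), 1 if x1 + x2 > 0 else -1)
        for x1, x2 in pts if abs(x1 + x2) > 0.1]

# Rosenblatt's learning rule: on each mistake, add y * x to the weights.
# On separable data the theorem guarantees finitely many updates.
w, b = [0.0, 0.0], 0.0
converged = False
for epoch in range(1000):
    errors = 0
    for (x1, x2), y in data:
        if y * (w[0] * x1 + w[1] * x2 + b) <= 0:
            w[0] += y * x1
            w[1] += y * x2
            b += y
            errors += 1
    if errors == 0:       # a full pass with no mistakes: 100% accuracy
        converged = True
        break

accuracy = sum(1 for (x1, x2), y in data
               if y * (w[0] * x1 + w[1] * x2 + b) > 0) / len(data)
print(converged, accuracy)
```

The second idea - automation - is what made this practical: the same loop runs unchanged on any training set, with no hand-crafted features.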

    Therefore, a neural network is a means of quickly obtaining acceptable solutions to very complex recognition problems. Nothing more was ever promised for neural networks (although there have been many attempts). The key words here are “quickly” and “complex problems”:
    • If you want to learn to reliably distinguish cats from dogs on YouTube within a year of work, then besides neural networks you currently have no tools of comparable quality and convenience; inventing features for simpler classifiers and tuning them would take much longer. But you will have to accept that the black box of a neural network will sometimes make mistakes that look strange to a person and are difficult to correct.
    • And if you want to recognize text or distinguish positive reviews from negative ones, take a simpler classifier: you will have much more control over what is happening, although it may take longer to get the first results.

    Can you trust neural networks?

    The main conclusion of the article discussing the original study was: “Until this happens, we cannot rely on neural networks where safety is critical...” Then, in the ensuing discussions, the Google self-driving car kept popping up for some reason (apparently because of the authors' employer and the picture of a car in the article).

    In fact, you can trust neural networks, and there are several reasons for this:

    1. What matters to the user (as opposed to the researcher) of a neural network is not where exactly it makes mistakes, but how often. Believe me, it will make absolutely no difference to you whether your self-driving car failed to recognize a truck that was in its training set or one it had never seen before. The entire study is devoted to finding errors in specific areas near the training sample; the overall quality of neural networks (and the methods for evaluating it) is not called into question.
    2. No recognition system ever works 100% of the time; there are always errors. One of the first principles of robotics is that you should never act on a single sensor reading; you always take a sliding window of values and discard the outliers. The same is true of any critical system: in any real task there is a stream of data, so even if the system fails at some moment, the neighboring data will correct the situation.
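The "sliding window" idea from point 2 can be sketched in a few lines: never act on a single reading, smooth a window of recent values and let the outliers drop out. A median filter is the simplest such scheme (the signal values here are made up):

```python
from statistics import median

def filtered(readings, k=5):
    """Median over a sliding window of the last k readings."""
    out = []
    for i in range(len(readings)):
        window = readings[max(0, i - k + 1): i + 1]
        out.append(median(window))
    return out

# A steady signal with one spurious spike (e.g. one misclassification).
signal = [1.0, 1.0, 1.0, 9.0, 1.0, 1.0, 1.0]
print(filtered(signal))  # the lone spike never survives the median
```

A single wrong reading - whether from a noisy sensor or a neural network hitting a blind spot - never reaches the control logic, which is exactly the point being made.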

    So, in any critical system a neural network should be treated as just another type of sensor: on the whole it gives correct data, but it sometimes makes mistakes, and its errors must be accounted for.

    What is important in this article?

    It might seem that if the article contains no great revelations, why was it written at all?

    In my opinion, the article has one main result: a well-thought-out way to significantly improve the quality of a neural network during training. When training recognition systems, a standard trick is to train not only on the original objects of the training set but also on the same objects with added noise.

    The authors of the article showed that one can instead use objects with the distortions that cause the network to err, thereby eliminating errors on those distortions and at the same time improving the quality of the whole network on the test set. This is an important result for practical work with neural networks.
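A minimal sketch of this training trick on a hypothetical toy logistic model rather than the paper's deep networks (all data and parameters made up): for each training example, also train on a copy perturbed in the direction that hurts the current model.

```python
import math
import random

random.seed(3)

def prob(w, x):
    """P(class = 1 | x) for a toy linear-logistic model."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    z = max(-30.0, min(30.0, z))   # avoid overflow in exp
    return 1 / (1 + math.exp(-z))

# Toy separable data: label is 1 iff x1 + x2 > 0 (features made up).
data = []
for _ in range(40):
    x = (random.uniform(-1, 1), random.uniform(-1, 1))
    data.append((x, 1 if x[0] + x[1] > 0 else 0))

w = [0.0, 0.0]
eps, lr = 0.1, 0.5
for epoch in range(200):
    for x, y in data:
        # Hypothetical adversarial augmentation: shift x by eps in the
        # direction that degrades the current model's correct response
        # (sign of the input gradient), then train on both copies.
        sign = 1 if y == 0 else -1          # push the score the wrong way
        x_adv = tuple(xi + sign * eps * (1 if wi >= 0 else -1)
                      for xi, wi in zip(x, w))
        for xt in (x, x_adv):
            p = prob(w, xt)
            w = [wi + lr * (y - p) * xi for wi, xi in zip(w, xt)]

acc = sum(1 for x, y in data if (prob(w, x) > 0.5) == (y == 1)) / len(data)
print(round(acc, 2))
```

The augmented copies force the decision boundary away from the training points, which is the mechanism behind both the error elimination and the test-set improvement the authors report.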

    In the end, I can only recommend not reading articles with “sensational” headlines; it is better to find the source and read it - everything there is much more interesting.
