The logic of thinking. Part 10. Spatial self-organization
This series of articles describes a wave model of the brain that differs substantially from traditional models. I strongly recommend that those who have just joined start reading from the first part.
We proceed from the premise that phenomena of the outside world affect our senses, causing certain flows of signals in nerve cells. In the process of learning, the cortex acquires the ability to detect certain combinations of signals. Detectors are neurons whose synaptic weights are tuned to the activity patterns corresponding to the detected phenomena. Cortical neurons monitor their local environment, which forms their local receptive field. Information reaches the receptive fields of neurons either through a topographic projection or through propagating waves of identifiers, which carry unique patterns corresponding to already identified features. Detector neurons that respond to the same combination of features form detector patterns. The layouts of these patterns define unique waves of identifiers.
In 1952, Alan Turing published a paper entitled "The Chemical Basis of Morphogenesis" (Turing AM, 1952), devoted to the self-organization of matter. The basic principle he formulated states that global order is determined by local interaction. That is, to obtain a structural organization of the whole system, it is not necessary to have some global plan; it suffices to set the rules of close-range interaction between the elements that make up the system.
Neurons are trained not in isolation but with regard to the activity of their environment. The rules for taking this activity into account determine the self-organization of the cortex. Self-organization means that in the course of training not only are detector neural patterns formed, but these patterns also arrange themselves into spatial structures that carry a meaning of their own.
The most obvious form of self-organization is structuring by proximity. Neurons can be trained in such a way that patterns corresponding to concepts that are close in some sense also end up close together in the space of the cortex. Later we will see that such placement is necessary for implementing many important functions inherent in the brain. For now, let us simply examine the mechanisms that can provide such an organization.
How to measure the closeness of concepts is a rather ambiguous question. One approach relies on the fact that any concept can be matched with a certain description. For example, the description of an event may be a vector whose components show how strongly particular features are expressed in that event. The set of features in which the description is maintained forms a descriptive basis. A reasonable measure of proximity is then the proximity of descriptions: the closer the descriptions of two phenomena, the more similar those phenomena are to each other. Depending on the problem subsequently solved with the proximity measure, one or another computation algorithm can be chosen. A proximity measure is closely related to the notion of distance between objects; one can be converted into the other and vice versa.
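As a toy illustration of a descriptive proximity measure and its conversion to a distance, here is a sketch using cosine similarity; the function names and the choice of cosine are our own assumptions, not something prescribed by the model:

```python
import math

def cosine_similarity(a, b):
    """Proximity of two feature-vector descriptions:
    the cosine of the angle between them (1 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def cosine_distance(a, b):
    """The same proximity measure converted into a distance:
    close descriptions give a distance near 0, orthogonal ones give 1."""
    return 1.0 - cosine_similarity(a, b)
```

Two events whose feature vectors point in the same direction (for example, one is a scaled copy of the other) get distance 0, while descriptions with no features in common get distance 1.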
Another approach is the temporal proximity of phenomena. If two phenomena occur simultaneously but have different descriptions, we can still speak of a certain proximity between them. The more often phenomena occur together, the greater their proximity can be considered.
In fact, these two approaches do not contradict but complement each other. If two phenomena often happen together, we must conclude that there is a phenomenon that realizes this co-occurrence. This new phenomenon can itself serve as a feature describing both of the initial phenomena. The two phenomena now have something in common in their descriptions and, consequently, proximity in the descriptive sense. Proximity that takes both the descriptive and the temporal components into account we will call generalized proximity.
Imagine that you are looking at an object from different angles. These views may be completely different in descriptive terms, yet their proximity in some other sense cannot be denied. Temporal proximity is a possible manifestation of that other sense. A little later we will develop this topic much more deeply; for now we simply note the need to take both approaches into account when determining the proximity of concepts.
The neural network that is simplest to understand and that well illustrates self-organization by descriptive proximity is the Kohonen self-organizing map (figure below).
Suppose the input is given by a vector x. There is a two-dimensional lattice of neurons. Each neuron j is connected to the input vector; this connection is defined by a set of weights w_j. First, we initialize the network with small random weights. When an input signal is supplied, the activity level of each neuron can be computed as a linear adder. We take the neuron that shows the highest activity and call it the winner neuron. Then we shift its weights toward the image it turned out to resemble, and we perform a similar shift for all its neighbors, weakening the shift as we move away from the winner neuron:
w_j(n+1) = w_j(n) + η(n)·h_{j,i(x)}(n)·(x − w_j(n)),

where η(n) is the learning rate, which decreases with time, and h_{j,i(x)}(n) is the amplitude of the topological neighborhood (its dependence on n indicates that it, too, decreases with time).
The amplitude of the neighborhood can be chosen, for example, as a Gaussian function:
h_{j,i}(n) = exp(−d²_{j,i} / (2σ²(n))),

where d_{j,i} is the distance between the corrected neuron j and the winning neuron i, and σ(n) is the effective width of the neighborhood.
Gaussian function
As training proceeds, zones corresponding to the distribution of the training images emerge on such a self-organizing map. That is, the network itself determines which images in the input stream are similar to one another and creates adjacent representations for them on the map. The more the images differ, the farther apart their representations are placed. As a result, if we color the training result appropriately, it will look something like the figure below.
Result of Kohonen map training
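The training loop just described can be sketched in a few dozen lines of Python. This is a minimal illustration of the Kohonen algorithm, not the article's code; the grid size, the linear decay schedules for the learning rate and neighborhood width, and the name train_som are our own arbitrary choices:

```python
import numpy as np

def train_som(data, grid_w=10, grid_h=10, epochs=20, seed=0):
    """Train a Kohonen self-organizing map on `data` (n_samples x dim)."""
    rng = np.random.default_rng(seed)
    dim = data.shape[1]
    # Small random initial weights: one weight vector per grid node.
    weights = rng.normal(0.0, 0.1, size=(grid_h, grid_w, dim))
    # Grid coordinates, used to measure topological (on-map) distance.
    ys, xs = np.mgrid[0:grid_h, 0:grid_w]
    coords = np.stack([ys, xs], axis=-1).astype(float)
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.permutation(data):
            t = step / n_steps
            eta = 0.5 * (1.0 - t)                    # learning rate, decays with time
            sigma = max(grid_w / 2 * (1.0 - t), 0.5) # neighborhood width, decays too
            # Winner: the node whose weight vector is closest to the input.
            d2 = ((weights - x) ** 2).sum(axis=-1)
            wy, wx = np.unravel_index(np.argmin(d2), d2.shape)
            # Gaussian neighborhood amplitude around the winner.
            grid_d2 = ((coords - [wy, wx]) ** 2).sum(axis=-1)
            h = np.exp(-grid_d2 / (2.0 * sigma ** 2))
            # Move every node's weights toward x, scaled by eta * h.
            weights += eta * h[..., None] * (x - weights)
            step += 1
    return weights
```

Each input pulls the winner's weight vector, and more weakly its grid neighbors' vectors, toward itself; this shared pull is exactly what makes similar inputs end up represented by nearby nodes.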
Kohonen maps use the amplitude function of a topological neighborhood, which suggests that neurons, besides interacting through synapses, can exchange additional information about the nature of the surrounding activity, and that this information can affect the course of their synaptic learning. The need to transmit such additional information arose earlier in our model. When describing the training of extrasynaptic receptors, we introduced rules based on neurons' knowledge of certain kinds of surrounding activity. For example, knowledge of the overall activity level allowed decisions both to begin training and to refuse it.
In order to keep the model within the bounds of biological plausibility, let us try to show which mechanisms in the real cortex could be responsible for computing and transmitting additional information that is not encoded in axonal signals.
About 40 percent of the brain's volume is occupied by glial cells. Their total number exceeds the number of neurons by roughly an order of magnitude. Traditionally, many service functions are attributed to glial cells. They create a three-dimensional framework that fills the space between neurons. They participate in maintaining the homeostasis of the medium and, during the development of the nervous system, in forming the topology of the brain. Schwann cells and oligodendrocytes are responsible for the myelination of large axons, which speeds up the transmission of nerve impulses many times over.
Since glial cells do not generate action potentials, they do not directly participate in information interaction. But this does not mean that they are completely devoid of information functions. For example, plasma astrocytes are located in gray matter and have numerous highly branching processes. These processes encircle the surrounding synapses and affect their work (figure below).
Astrocyte and synapse (Fields, 2004)
For example, the following mechanism has been described (RD Fields, B. Stevens-Graham, 2002). Activation of a neuron leads to the release of ATP molecules from its axon. ATP (adenosine triphosphate) is a nucleotide that plays an extremely important role throughout the body; its main function is to support energy processes. But in addition, ATP can act as a signaling substance. Under its influence, the flow of calcium into an astrocyte is initiated. This, in turn, causes the astrocyte to release ATP of its own, so the state is transmitted to neighboring astrocytes, which pass it on even further. Meanwhile, the uptake of calcium by an astrocyte causes it to start affecting the synapses it contacts. Astrocytes can both enhance a synapse's response, by releasing the corresponding neurotransmitter, and weaken it, by absorbing the neurotransmitter or releasing neurotransmitter-binding proteins. In addition, astrocytes can secrete signaling molecules that regulate the release of the mediator by the axon. This conception of signal transmission between neurons, which takes the influence of astrocytes into account, is called the tripartite synapse.
This interaction of astrocytes and neurons does not transmit specific informational images, but it is very suitable for the role of a mechanism that ensures the spread of the “activity field”, which can control the learning of synapses and thus set the spatial coordinates for new detector patterns.
Besides astrocytes, the intercellular matrix also affects the behavior of synapses. The matrix is the multitude of molecules, produced by brain cells, that fills the intercellular space. The article (Dityatev A., Schachner M., Sonderegger P., 2010) showed that a change in the composition of the matrix affects the nature of synaptic plasticity, that is, the training of neurons.
The impulse activity of neurons creates a dynamically changing point-like spatial pattern that encodes information flows. This activity changes the state of the glial and matrix environment in such a way that something like a field of generalized activity is created (figure below).
Point activity and the field of activity

The field of activity, on the one hand, blurs point activity, creating a region that extends beyond the boundaries of the active neurons; on the other hand, it has inertia and persists for some time after impulse activity ceases.
If we create a detector of the currently presented image inside this activity field, it will find itself in the vicinity of similar detectors. The similarity here can be either similarity of receptive fields or similarity arising from the co-occurrence of events in time.
In the real brain, spatial organization is best studied for the primary visual cortex. Due to transformations that begin even in the retina, a signal is received on the primary cortex, in which the main information is the lines describing the contours of objects in the original image. The neurons of the primary visual cortex see mainly small fragments of these lines passing through their receptive fields. It is not surprising that a significant part of the neurons in this zone are detectors of lines running at different angles.
It was experimentally revealed that neurons located vertically one under the other respond to the same stimulus. A group of such neurons is usually called a cortical mini-column. Vernon Mountcastle (W. Mountcastle, J. Edelman, 1981) hypothesized that for the brain, the cortical column is the main structural unit of information processing.
Earlier, speaking about patterns of detector neurons, we depicted them as groups of neurons distributed over a certain local area. That was a consequence of the fact that flat neural networks were used in the modeling and, accordingly, in preparing the illustrations. The real cortex is three-dimensional. Its volume does not affect our reasoning about the origin and propagation of identifier waves: in a three-dimensional cortex, waves propagate exactly as in a flat one. But in a volumetric cortex, nothing prevents the detector neurons forming a single pattern from being arranged vertically, one under another. Such an arrangement is neither worse nor better than any other; the main requirement for a pattern is the randomness of its layout. Since neuron connections are distributed randomly, neurons located vertically in the same cortical column can be considered a random pattern. A vertical arrangement is quite convenient when specifying the location of a detector pattern: instead of delineating a local area, it is enough to indicate the position in which the pattern is to be created. That position can be chosen, for example, where the activity field is maximal among all free columns of a certain neighborhood. It can be assumed that the cortical minicolumns of the real cortex are the detector patterns described in our model.
For traditional models of the cortex that do not take wave signals into account, the fact that all neurons of a minicolumn respond to the same stimulus is difficult to explain. One has to assume either that this is duplication for fault tolerance, or an attempt to gather neurons that respond to one stimulus but are tuned to detect it at different positions of the shared receptive field. The latter idea was played out in the neocognitron through the use of planes of simple cells. The wave model allows a different perspective: we can assume that the task of a cortical minicolumn is to recognize its characteristic stimulus and launch the corresponding wave identifier. The several dozen neurons that form a minicolumn and fire together are then the wave-triggering mechanism, that is, the detector pattern.
The orientation columns of the real visual cortex are organized into so-called "pinwheels". A pinwheel (figure below (B)) has a center, where columns of different orientations converge, and diverging tails, characterized by a smooth change of the preferred stimulus. One pinwheel forms a hypercolumn. The tails of different hypercolumns merge into one another, forming the orientation map of the cortex (figure below (A)).
The distribution of orientation columns in the real cortex obtained by the optical method (Nicholls J., Martin R., Wallas B., Fuchs P., 2003)
Using the example of the visual cortex, it is convenient to compare the actual distribution of orientation columns and simulation results using the activity field.
We will present lines at various angles to a fragment of the cortex. The activity of a detector neuron will be taken as the cosine of the angle between the presented image and the image stored on the detector.
A neuron with index i has coordinates (x_i, y_i) on the cortex. The contribution of this neuron to the activity field at a point with coordinates (x, y) can be represented, for example, by a Gaussian distribution:

f_i(x, y) = A_i exp(−((x − x_i)² + (y − y_i)²) / (2σ²)),

where A_i is the activity of neuron i and σ sets the radius of the field.
At each point of the cortex, the resulting value of the activity field can then be written as:

F(x, y) = Σ_i f_i(x, y)
We will place each new detector in the free position where the value of the activity field is maximal. The result of such training is shown in the figure below.
Detector training result (left), activity field (right)
As you can see, the simulation result is very similar to the actual distribution. It should be noted, though, that the above example is very schematic. In the real cortex, each column is slightly offset relative to its neighbors and watches a slightly different fragment of the image. This creates tails of detectors tuned to the same orientation, yet the stimuli of those detectors are different images based on different receptive fields.
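The experiment just described can be mimicked by a toy simulation: stimuli are lines at random angles, detector activity is the cosine of the orientation difference, and each new detector occupies the free cell where the Gaussian activity field peaks. All parameter values and the name place_detectors are arbitrary assumptions of ours, not the article's actual code:

```python
import numpy as np

def place_detectors(n_stimuli=400, size=30, sigma=2.0, seed=0):
    """Toy placement of orientation detectors on a flat cortex sheet.
    NaN cells are free; a placed cell stores its preferred angle."""
    rng = np.random.default_rng(seed)
    orientation = np.full((size, size), np.nan)
    ys, xs = np.mgrid[0:size, 0:size]
    for _ in range(n_stimuli):
        phi = rng.uniform(0, np.pi)          # stimulus: a line at angle phi
        # Detector activity: |cos| of the orientation difference
        # (period pi, since a line looks the same rotated by 180 degrees).
        act = np.nan_to_num(np.abs(np.cos(orientation - phi)))
        # Activity field: each active detector adds a Gaussian bump around itself.
        field = np.zeros((size, size))
        for y, x in zip(*np.nonzero(act)):
            field += act[y, x] * np.exp(
                -((xs - x) ** 2 + (ys - y) ** 2) / (2 * sigma ** 2))
        free = np.isnan(orientation)
        if not free.any():
            break
        # The detector for this stimulus goes to the free cell with the
        # strongest field (on the empty first step the field is zero
        # everywhere, so argmax simply picks the first free cell).
        masked = np.where(free, field, -np.inf)
        y0, x0 = np.unravel_index(np.argmax(masked), masked.shape)
        orientation[y0, x0] = phi
    return orientation
```

Because the field is strongest near detectors whose preferred orientation resembles the stimulus, new detectors tend to land next to similar ones, and smooth orientation gradients with singular points, qualitatively like pinwheels, emerge on the sheet.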
The spatial self-organization of the cortex can be compared with the self-organization of matter in the surrounding world. Atoms, molecules, objects, planets, stars, galaxies, the universe: all of this is a consequence of the existence of four fundamental interactions, customarily listed as the gravitational, electromagnetic, strong, and weak interactions. Particles of matter create around themselves the fields corresponding to these interactions. Fields of the same kind are summed and form a resulting field, and the resulting fields affect the particles, determining their behavior. Something similar happens in the brain. It appears that the cortex can create several types of fields with different properties, each of which has its own effect on the behavior of neurons, and the totality of these interactions shapes the resulting spatial structure.
It is easy to see that spatial grouping by generalized proximity contains an internal contradiction. A joke comes to mind in this regard. A lion once decided to divide the animals into the beautiful and the smart. And there stands a bewildered monkey: "So what am I to do, tear myself in half?" Difficulties arise when a concept being placed is close to other concepts that are remote from each other. For example, when shelving books in a library, we would like to group them by genre, by author, by original language, and by reader rating all at once. But since each book exists in a single copy, a certain problem arises.
In the process of self-organization of the cortex, it may turn out that one and the same concept is close, in the generalized sense, to several others located in different places of the cortex. It makes no sense to place the detector pattern as a compromise, trying to find some equidistant spot. First, such a spot may not exist; second, this would completely violate the principle of "similar things nearby". There is nothing left to do but create several detector patterns, each close to its own local maximum of the activity field.
At first glance, duplicating the same concept in different places looks ugly. When working with information, one always wants to arrive at consolidation, not fragmentation. But our neural network model allows the arising contradiction to be removed gracefully. Although the detector patterns are created in different places of the cortex, each is trained on the same wave of identifiers. Moreover, in a two-level design, one of the levels fixes on itself a fragment of the wave pattern of the common identifier. This means that the detector patterns scattered across different places are bound into a single concept with a common identifier.
This behavior of concepts is a consequence of the dualism declared earlier. A concept is both a pattern and a wave: patterns generate a wave, and the wave activates patterns. The spatial position of the patterns belonging to one concept characterizes the concept through its proximity to other concepts. The positions of the patterns on the cortex in fact say a lot about the properties and characteristics of the corresponding phenomenon. Whereas in Kohonen maps one area corresponds to one concept, as follows from the "winner takes all" paradigm, in our model we can not only give a richer description but also maintain the unity of the components thanks to this dualism.
Self-organization on the principle of "similar things nearby" is not an end in itself for the cortex. Later we will show that the most important algorithms of brain function require precisely such an organization and are practically unrealizable without it.
And now for a not entirely banal physical analogy. The main idea of quantum physics is that a quantum system cannot be in arbitrary states, but only in certain states allowed for it. Moreover, a quantum system does not take any definite value until a measurement takes place. Only as a result of measurement does the system pass into one of the states allowed to it. Measurement means any external interaction that makes the quantum system manifest itself.
In quantum physics, a complex-valued function describing the pure state of an object is introduced, called the wave function. In the most common, Copenhagen, interpretation this function is associated with the probability of finding the object in one of its pure states (the squared modulus of the wave function is the probability density). A Hamiltonian system is a dynamical system that describes physical processes without dissipation. As long as a quantum system does not interact, it remains Hamiltonian, and its behavior in time and space can be described through the evolution of its wave function. This evolution is governed by the Schrödinger equation:
iħ ∂ψ/∂t = Ĥψ,

where the Hamiltonian Ĥ is the operator of the total energy of the system. Its spectrum is the set of possible states in which the system may end up after measurement.
When a measurement occurs, the quantum system assumes one of the allowed states. This is called the reduction, or collapse, of the wave function. During measurement the system ceases to be isolated, and its energy need not be conserved, since energy is exchanged with the measuring device.
Which state the quantum system takes at the moment of measurement is a matter of chance. But the probabilities of the states are not equal: each is determined by the value of the wave function associated with that state.
In our model, detector patterns are created at those positions of the cortex where a new concept has descriptive or temporal proximity to existing concepts. This forms a set of places that describes the probability of the concept appearing in a given situation, which can be compared with the spectrum of the Hamiltonian of a quantum system. The propagation of identifier waves can be compared with the evolution of the wave function. The description carried by an identifier passes into the activity of patterns. Further on, we will show that in this case not all patterns are activated, but only those that can form a consistent picture corresponding to the context of what is happening. At that moment our system passes into one of the possible allowed states, which strongly resembles the collapse of the wave function.
Continued
Previous parts:
Part 1. Neuron
Part 2. Factors
Part 3. Perceptron, convolutional networks
Part 4. Background activity
Part 5. Brain waves
Part 6. Projection system
Part 7. Human-computer interface
Part 8. Isolation of factors in wave networks
Part 9. Patterns of neuron detectors. Back projection
Alexey Redozubov (2014)