How the eye perceives a picture

    Ever wonder how the eye reads a picture? Why is it that, when we look at a photo, some parts of the image attract the eye so strongly that it is hard to break away and focus on other details? In the 1960s a group of psychologists and physiologists tried to answer this question and created a theory of visual perception. The theory has developed since then: there are now at least three mathematical models that let you simulate the movement of the eye as it reads a picture and relate that movement to the concentration of attention on particular parts of the image.

    Two or three years ago I worked closely on modeling attention during image viewing, and the other day I was asked to demonstrate such a program. I dug into the dusty corner behind the raw archives, unpacked the code, started compiling, decided to fix a couple of bugs in the algorithms and ... got carried away! Here is the fruit of two days of effort: several pictures and two different ways of simulating how a person perceives a picture.

    The pictures are presented as triptychs. The first panel is the original picture. The middle panel is the heat map: the more intense the green glow, the more likely that region is to attract your attention. The last panel shows the dynamics of the gaze: how the gaze glides over the image and where it can go next. The gaze moves more easily from lighter parts to darker ones; conversely, shifting the focus of attention from darker parts to lighter ones will probably take some effort.
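
    The heat map and the gaze dynamics described above can be illustrated with a minimal sketch. This is not the program used for the pictures in this post, whose algorithms are not given here; it swaps in a generic center-surround contrast map and a winner-take-all scan with inhibition of return, two ideas common in attention modeling. The function names (`box_blur`, `saliency_map`, `scanpath`) and all radii are assumptions chosen for illustration.

```python
import numpy as np


def box_blur(img, radius):
    """Mean filter with a (2*radius+1) square window; edges are replicated."""
    padded = np.pad(img, radius, mode="edge")
    # Integral image: integral[i, j] is the sum of padded[:i, :j].
    integral = np.pad(padded, ((1, 0), (1, 0))).cumsum(axis=0).cumsum(axis=1)
    k = 2 * radius + 1
    h, w = img.shape
    total = (integral[k:k + h, k:k + w] - integral[:h, k:k + w]
             - integral[k:k + h, :w] + integral[:h, :w])
    return total / (k * k)


def saliency_map(gray, center_radius=1, surround_radius=4):
    """Heat map in [0, 1]: how much each small patch differs from its surround.

    Assumption: plain intensity contrast stands in for the real model.
    """
    gray = np.asarray(gray, dtype=float)
    contrast = np.abs(box_blur(gray, center_radius)
                      - box_blur(gray, surround_radius))
    peak = contrast.max()
    return contrast / peak if peak > 0 else contrast


def scanpath(saliency, fixations=3, inhibit_radius=3):
    """Simulated gaze: visit the strongest point, suppress it, repeat."""
    remaining = np.array(saliency, dtype=float)
    path = []
    for _ in range(fixations):
        y, x = np.unravel_index(np.argmax(remaining), remaining.shape)
        path.append((int(y), int(x)))
        # Inhibition of return: the visited neighborhood stops attracting.
        y0, x0 = max(0, y - inhibit_radius), max(0, x - inhibit_radius)
        remaining[y0:y + inhibit_radius + 1, x0:x + inhibit_radius + 1] = -1.0
    return path
```

    Given a grayscale image as a 2D array, `saliency_map(image)` gives values that could be drawn as the green glow, and `scanpath(saliency_map(image))` gives a list of `(row, column)` fixations. Like the model in this post, the sketch knows nothing about faces or other recognizable objects.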

    When viewing the pictures, keep in mind that the mathematical model used to simulate where a person focuses attention does not take the psychological aspects of perception into account. The pictures show how the human eye moves when it does not detect recognizable images in the picture.


    The first picture shows how a typical photograph of the “a certain object in the middle of the frame” kind is perceived. It is especially interesting how the gaze rises toward the center but does not reach it. The gaze wanders, as it were, caressing with light touches the central area in which the object is inscribed. In the dynamics panel, all the compositional features are clearly visible as secondary focuses of attention and an upward pull.


    And here is an ordinary landscape. Note that if you start from the bottom of the picture, your gaze is drawn toward the tree trunks, and if you move from the trees upward or from the top down, there is a clear "potential well" in the middle of the sky into which the gaze falls involuntarily.


    A bit about web pages. Which parts of the page do you think attract the most attention? What is the most important thing to show? Advertising, of course!


    A landscape with a claim to composition. You can see right away how that composition is spoiled: move these people down and to the right by just one square and the golden section would have been respected! As it is, attention is concentrated between the edge of the picture and the silhouettes of the people.
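
    For reference, the golden-section lines of a frame are easy to compute. The sketch below only illustrates that arithmetic and is not part of the attention model; the function names and the sample coordinates in the usage note are hypothetical. It reports how far a subject would have to move to land on the nearest intersection of the golden-section lines.

```python
PHI = (1 + 5 ** 0.5) / 2  # golden ratio, about 1.618


def golden_lines(length):
    """Return the two positions that split a side in the golden ratio."""
    far = length / PHI
    return (length - far, far)


def offset_to_golden_point(width, height, x, y):
    """Return (dx, dy) from (x, y) to the nearest golden-section intersection.

    Image coordinates are assumed: positive dx is right, positive dy is down.
    """
    points = [(gx, gy)
              for gx in golden_lines(width)
              for gy in golden_lines(height)]
    gx, gy = min(points, key=lambda p: (p[0] - x) ** 2 + (p[1] - y) ** 2)
    return (gx - x, gy - y)
```

    For example, `offset_to_golden_point(1000, 600, 300, 200)` returns a positive `dx` and a positive `dy`, meaning a subject at that hypothetical position would have to move right and down to sit on a golden-section point.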


    I include this photo only to show how automatic analysis starts to go wrong because of the psychological aspects of perception. When viewing a photo, people pick out faces and unconsciously pay more attention to them. A car, in this case, is also perceived as the face of some outlandish beast. If we had eye-tracking data for this image, we would see that attention peaks “on the forehead” of the Chrysler and on the person's face. By the way, this also applies to the next picture.


    The Mona Lisa, our "everything". Let's forget that faces attract the eye and see how the picture is perceived when we consider it as a whole. The heat map gives us nothing here, but the dynamics show interesting things! It turns out that to the right of the face there is a square pointing to the nose, with sides proportional to the golden ratio. Not only that: if you look closely at the quadrangular polygons covering the eyes, their sides turn out to correspond almost exactly to the harmonic series (the discrepancies fall entirely within the algorithm's margin of error)! After this, how can one not envy Leonardo's knowledge of geometry ...


    Well, let's move on to my favorite impressionists. The heat map immediately shows that either the algorithm is failing or the picture is too complicated to perceive. Is that why so many people find Van Gogh so hard to take in? A mass of extraneous focal points forms a grid of near white noise ... Against this background it is hard to make out the flowers in detail; only the pot lends itself to thoughtful viewing. But the visualization of the dynamics changes everything! It turns out that if you step back from the details of the brushstrokes and take in the picture from afar, you can see a clear diagonal axis with an entry point on the table and an explosive finish in the form of the flowers! The picture is not static; it is alive!
