Yandex learned to recognize and combine series of images
Today Yandex.Pictures took an important step in its own development and in the development of image search. Search results are no longer just a collection of images matching the words you typed: some images in the results can now be grouped together. We call these groups "Series".
"Series" are images that appear together on the Internet and are visually similar. A hierarchical clustering algorithm is responsible for selecting the pictures. It takes all the images from a page and selects a group of similar ones - those that share colors, shapes, details and so on. If there are at least four similar (but not identical) images on a page, they form a Series.
Read on to find out why we came up with the idea of Series, how we arrived at their design, and how we implemented the algorithm.
The usual presentation of image search results, familiar to everyone, is a page filled with a grid of thumbnails. As you scroll down, new thumbnails load automatically. But sometimes one or even two pictures are not enough to answer the question, and more images are needed. It also helps if those images are related to each other.
What is it for? For example, to learn how to fold an origami dragon, you probably want to see a picture of each step. You also need several images if you want to view a car from different angles. And it is probably best if those pictures come from a single review.
To understand what other scenarios series might serve, we conducted in-depth interviews. One participant said that series would help her quickly find three pictures for a photo frame in her kitchen. It was important to her that all of them show spices and share the same style, because she has a special frame for three images.
There was also a student who said that instructions in pictures would help her learn Photoshop: video tutorials are too complicated for a beginner, while pictures with explanations are just right. We also talked with a man who had recently been looking for illustrated instructions on how to fix a leaking tap. According to our data, about 13% of users solve such problems by using a picture to find a page with detailed information.
Thanks to quantitative surveys, we know that 70% of users need to find illustrated instructions from time to time, and 20% have this need every week. Query statistics confirm this: 9% of all searches on Yandex.Pictures are searches for instructions, and they cover completely different topics.
The "Series" project grew out of our discussions of this problem. We began to think about how to select images so that the answer would be not only relevant but also attractive, and would complement the image search results.
Thanks to interviews with invited users, UX testing, many discussions within the team, and data from a beta on Yandex's internal network, we built a more coherent understanding of the feature and formulated some requirements.
For example, the images should come from a single page, so that the user can go to it and find out more: see the whole report, read the explanations, or move on to other sections of the site. This improves the navigational scenario, in which the user looks for a site by way of a picture.
A series is especially useful when it singles out one object, the steps of one master class, or photographs taken in the same style (a specific photo shoot, clothes from the same collection, etc.). But the algorithm builds series only for pages without aggressive advertising and viruses. In in-depth interviews and UX testing, we saw that users react very negatively to excessive advertising and jumping popups. We therefore decided to exclude such pages from the candidates for series.
How Series Are Technically Designed
We group similar pictures into series within the pages on which they appear together. That is, if the page www.example.com contains the pictures
www.example.com/1.jpg,
www.example.com/2.jpg,
www.example.com/3.jpg,
www.example.com/4.jpg,
we try to combine them into a series. A series is a group of pictures that are pairwise visually similar to each other.
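The first step, collecting the pictures that occur together on a page, might look roughly like the sketch below. The input shape (pairs of page URL and image URL) and the names `candidate_pages` and `MIN_SERIES_SIZE` are assumptions made for illustration; only the minimum of four images comes from the description above.

```python
from collections import defaultdict

# From the post: a series needs at least four similar (but not identical) images.
MIN_SERIES_SIZE = 4


def candidate_pages(occurrences):
    """Group image URLs by the page they were found on.

    `occurrences` is an iterable of (page_url, image_url) pairs; this
    input shape is an assumption made for the example.
    """
    by_page = defaultdict(list)
    for page_url, image_url in occurrences:
        if image_url not in by_page[page_url]:  # skip exact repeats
            by_page[page_url].append(image_url)
    # Only pages with enough distinct images can possibly yield a series.
    return {
        page: images
        for page, images in by_page.items()
        if len(images) >= MIN_SERIES_SIZE
    }
```

Pages that pass this filter are only candidates: whether their pictures actually form a series is decided by the visual clustering described next.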
We want to cluster these pictures by visual similarity, that is, find a subgroup of pictures that are sufficiently similar to each other. We use a greedy hierarchical clustering algorithm, known in English as complete-linkage clustering with the nearest-neighbor chain (NN-chain) algorithm. For this to work, we need a similarity metric for pictures such that clustering with it yields groups with the properties we need.
What are these properties?
- The same objects or scenes shot from different angles should be in the same cluster;
- The same objects or scenes shown in different colors should be in the same cluster;
- Photos from the same photo shoot that share enough colors and details should be in the same cluster;
- The images in a cluster should be of approximately the same size, etc.
To achieve this, we chose three types of descriptors:
- a descriptor based on the color layout descriptor;
- the MPEG-7 dominant color descriptor;
- the MPEG-7 edge histogram descriptor.
Descriptions of these descriptors can be found, for example, in the MPEG-7 standard. We use our own fast and efficient implementations.
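To give a feel for what such a descriptor captures, here is a deliberately simplified sketch in the spirit of a color layout descriptor: it averages the color of each cell in a grid laid over the image. It is neither the MPEG-7 descriptor (which goes further, working in YCbCr and applying a DCT to the grid) nor Yandex's implementation. The image representation (rows of RGB tuples), the function names, and the way differences are mapped to a similarity are all assumptions.

```python
def grid_color_descriptor(pixels, grid=8):
    """Average RGB color of each cell in a grid x grid partition.

    `pixels` is a list of rows, each row a list of (r, g, b) tuples
    with channel values in 0..255 (an assumed image representation).
    """
    height, width = len(pixels), len(pixels[0])
    if height < grid or width < grid:
        raise ValueError("image is smaller than the grid")
    descriptor = []
    for gy in range(grid):
        y0, y1 = gy * height // grid, (gy + 1) * height // grid
        for gx in range(grid):
            x0, x1 = gx * width // grid, (gx + 1) * width // grid
            cell = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            n = len(cell)
            descriptor.append(tuple(sum(p[c] for p in cell) / n for c in range(3)))
    return descriptor


def descriptor_similarity(d1, d2):
    """Turn the mean absolute channel difference into a value in [0, 1]."""
    total = sum(
        abs(a - b)
        for cell1, cell2 in zip(d1, d2)
        for a, b in zip(cell1, cell2)
    )
    return 1.0 - total / (255.0 * 3 * len(d1))
```

Because only coarse cell averages are compared, two shots of the same scene with small shifts in framing still score as similar, which is the kind of tolerance the first and third properties ask for.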
From these descriptors we compute visual similarity as the maximum of the similarities given by the individual descriptors. This lets us satisfy the first three requirements above. To take image dimensions into account, we subtract the ratio of the pictures' areas (max / min) from the visual similarity. Clustering is then carried out with this metric.
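Put together, the metric and the clustering step could be sketched as follows. This is an illustration under assumptions, not our production code: the scale of the descriptor similarities, the `area_weight` parameter, and the stopping `threshold` are hypothetical, and the clustering below is the naive greedy form of complete linkage rather than the NN-chain algorithm, which is a faster way to build the same kind of hierarchy.

```python
def pair_similarity(descriptor_scores, area_a, area_b, area_weight=1.0):
    """Best per-descriptor similarity minus an area-ratio penalty.

    `descriptor_scores` holds one similarity per descriptor for a pair
    of images. `area_weight` is a hypothetical knob: the post does not
    say how the two terms are scaled against each other.
    """
    if area_a <= 0 or area_b <= 0:
        raise ValueError("image areas must be positive")
    visual = max(descriptor_scores)
    area_ratio = max(area_a, area_b) / min(area_a, area_b)  # always >= 1
    return visual - area_weight * area_ratio


def complete_linkage_clusters(items, similarity, threshold):
    """Naive greedy agglomerative clustering with complete linkage.

    Two clusters are merged only while their least similar cross pair
    still reaches `threshold`, so every pair inside a final cluster is
    at least `threshold` similar.
    """
    clusters = [[item] for item in items]

    def linkage(a, b):
        # Complete linkage: a cluster pair is as similar as its worst pair.
        return min(similarity(x, y) for x in a for y in b)

    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                score = linkage(clusters[i], clusters[j])
                if best is None or score > best[0]:
                    best = (score, i, j)
        score, i, j = best
        if score < threshold:
            break  # no remaining pair of clusters is similar enough
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters
```

Complete linkage fits the definition of a series as a group of pairwise similar pictures: one weak pair is enough to block a merge. Any resulting cluster with at least four pictures would then be a candidate series.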
Design
In the process, we realized that a series should look like a single block in the search results - this is a new way of presenting Yandex's answer. The most important thing in the design was to make a series of pictures noticeable.
Since the beginning of work on the project, we have tried about a dozen different design options. Three of them were tested both on external users and on our colleagues.
We saw that users liked the look of one design more.
But they understood a different design better, and they interacted with it more.
In the final version, we took the best of both options.
We also noticed that it is important to show people where a series begins and where it ends. That is how the final version, with an information and sharing block, came about.
Where the series will help you
Series will be very convenient for people looking for step-by-step instructions in which pictures matter more than text: how to make an origami dragon, how to draw a cat, a decoupage master class, how to change or find exercises for a trapeze.
Series also make it easier to search for images in a similar style - pictures by one artist or a selection of photos from one photo shoot. They will also help you look at a car, a product, or a landmark from different sides.