
New Yandex search platform with personal results: Kaliningrad
Today we are announcing important changes to Yandex search. From now on, search results and search suggestions are personalized and may differ for each user who sends Yandex a query.
Especially for Habrahabr, we interviewed the people involved in this project and asked them why it is needed, how it works, which factors we take into account, and how we measure its benefits.
Once upon a time, a user's query and the search engine's own index were enough to show a person search results. Both are easy to picture. But over time it became clear that there is another very important thing: the context of the query, that is, who is asking, where, and when.
Three years ago, we began to take the user's region into account when generating search results.
The simplest example of a query where this matters is pizza delivery. People in Moscow and in Volgograd should see links to the delivery companies that operate in their own city. And in cities where such a service does not exist yet, the query "pizza" should return a recipe instead. Since Yandex search began to take a person's location into account, users have been entering queries that explicitly name a region 30% less often.
Last summer, we launched the Reykjavik search platform. It learned users' language preferences by taking into account how often a person opens search results in English. People who look for English-language resources more often began to see more links to such resources, and vice versa.
Now we are introducing our next search platform, Kaliningrad, which gives users personalized search suggestions and personalized search results.
Below we describe each of the two parts in more detail.
Personalized Search Suggestions
Search, as a tool for navigating the Internet, should help a person stay on course and find the shortest route to the goal. Search suggestions have been solving this task for several years. They help to formulate a query correctly and, most importantly, quickly. We have already described how suggestions learned to take into account a large number of factors, including ones that reflect context. One of the most important launches on the way to personalized suggestions was taking the user's previous query into account. For example, if a person has just searched for [Titanic], then after typing the letter “k” in the search box, the first suggestions they see will be [Kate Winslet] and [how Titanic was filmed] rather than [contact] and [metro map]. (In the original Russian, all four of these queries begin with the letter “к”.)
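The idea of ranking suggestions by the previous query can be illustrated with a minimal sketch. It is not Yandex's implementation: the `POPULARITY` and `FOLLOWS` tables, the `boost` parameter, and the `suggest` function are all hypothetical names and toy values, chosen only to show how a context signal can reorder prefix completions.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical global popularity of candidate queries (toy values, not real data).
POPULARITY: Dict[str, float] = {
    "kayak": 0.9,
    "kindle": 0.8,
    "kate winslet": 0.3,
    "keanu reeves": 0.2,
}

# Hypothetical relatedness: how strongly the second query tends to follow the first.
FOLLOWS: Dict[Tuple[str, str], float] = {
    ("titanic", "kate winslet"): 0.7,
}


def suggest(prefix: str, previous: Optional[str] = None,
            boost: float = 2.0, limit: int = 3) -> List[str]:
    """Return completions for `prefix`, boosted by relatedness to `previous`."""
    prefix = prefix.lower()
    candidates = [q for q in POPULARITY if q.startswith(prefix)]

    def score(query: str) -> float:
        related = 0.0
        if previous is not None:
            related = FOLLOWS.get((previous.lower(), query), 0.0)
        # Assumed scoring rule: popularity plus a weighted context bonus.
        return POPULARITY[query] + boost * related

    return sorted(candidates, key=score, reverse=True)[:limit]


without_context = suggest("k")
with_context = suggest("k", previous="Titanic")
```

In this toy setup, the completion related to the previous query moves to the top only when that query is supplied; without it, the ranking falls back to plain popularity.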

Half of all queries that people send to Yandex are related to their previous query. We have learned to extract from user behavior the data that helps predict what users will do next. Search suggestions now take into account the history of your interaction with Yandex: which queries you entered, which links you followed, and how these actions were distributed over time.
Keep in mind that, to Yandex search, a user looks something like this:

That is, we do not store the listed data in explicit form. After processing, a relatively small set of numbers is produced for each person. Each number describes a specific topic that the user is interested in and also reflects how important that topic is to them right now.
Making search suggestions more personal did not work right away. At first it seemed that we could identify clusters of users, for example with the k-means method. But such mechanical methods turned out not to work very well, so we took a different route and identified semantic topics instead. It turned out that at least 400,000 of them are needed. The breadth of human interests surprised us, too.
Along the way, we also realized how quickly such interests go out of date. Even if a person is generally interested in functional programming, right now they may be preoccupied with renovating their apartment. It was important for us to understand that what a person considers their interests and what actually occupies them at the moment can be different things. For development, this meant organizing data delivery and processing so that the data does not have time to go stale for that particular user.
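One simple way to picture a per-user set of topic numbers that favors current interests is an exponentially decayed count of events per topic. The sketch below is only an illustration under that assumption: the article does not say how the numbers are computed, and `build_profile`, the 7-day half-life, and the topic names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

HALF_LIFE_DAYS = 7.0  # assumed value: an event loses half its weight every 7 days


def build_profile(events: Iterable[Tuple[str, float]], now: float,
                  half_life: float = HALF_LIFE_DAYS) -> Dict[str, float]:
    """Turn (topic, day) events into normalized topic weights.

    Recent events count more: each event contributes 0.5 ** (age / half_life).
    """
    weights: Dict[str, float] = defaultdict(float)
    for topic, day in events:
        age = max(0.0, now - day)
        weights[topic] += 0.5 ** (age / half_life)
    total = sum(weights.values())
    if total == 0.0:
        return {}
    return {topic: weight / total for topic, weight in weights.items()}


# Toy history: a long-standing interest two months ago, a new one this week.
events = [("functional programming", float(day)) for day in range(5)]
events += [("apartment renovation", 59.0), ("apartment renovation", 60.0)]
profile = build_profile(events, now=60.0)
```

With these toy events, the two recent renovation events outweigh the five older programming events, which mirrors the point that the profile should reflect what matters to the user now.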
To understand whether we had achieved our goal and whether the suggestions could really be called personal, we used two methods. First, we checked that everything works on historical data. We have a set of actions that users performed in the past, and we used them to try to predict each following action. We looked at the character at which the system guesses the query, with and without personalization. Second, we enabled the new option first for 5% and then for 10% of users and compared how they interact with suggestions against a control sample of the same size that kept the old version of suggestions. As you can imagine, 5-10% of Yandex users is millions of people. The experiment showed that we could already roll the new system out to everyone: users liked it.
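The historical-data check can be sketched as a replay: for a query the user actually entered, type it one character at a time and record the first character at which it shows up among the top suggestions. This is a minimal sketch under assumptions; `chars_until_hit`, `make_suggester`, and the toy rankings are hypothetical and only stand in for real suggestion systems.

```python
from typing import Callable, List, Optional


def chars_until_hit(target: str, suggest: Callable[[str], List[str]],
                    top_k: int = 3) -> Optional[int]:
    """Number of typed characters after which `target` appears in the top_k
    suggestions, or None if it never appears."""
    for n in range(1, len(target) + 1):
        if target in suggest(target[:n])[:top_k]:
            return n
    return None


def make_suggester(ranked_queries: List[str]) -> Callable[[str], List[str]]:
    """Toy suggester: prefix-filter a fixed ranking (a stand-in for a real system)."""
    def suggester(prefix: str) -> List[str]:
        return [q for q in ranked_queries if q.startswith(prefix)]
    return suggester


# Hypothetical rankings without and with personalization for one user.
baseline = make_suggester(["kayak", "kindle", "keanu reeves", "kate winslet"])
personalized = make_suggester(["kate winslet", "kayak", "kindle", "keanu reeves"])

baseline_chars = chars_until_hit("kate winslet", baseline, top_k=1)
personalized_chars = chars_until_hit("kate winslet", personalized, top_k=1)
```

Averaging this number over many logged queries for both systems gives one comparable figure per system; a lower value means users would need to type less before seeing their query.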
Personalized Search Results
The second part of the changes we are announcing today is personalized search results. Previously, as we said, search results could differ depending on the city or place a person was in; now every person can get results tailored to them personally.
In effect, there are now as many versions of the results as there are Yandex search users. They take into account what we know about the person: their interests, the sites they prefer, and much more.
In practice, this means that the response to a query such as [Northern Lights] will differ from person to person. A traveler will see an answer about the natural phenomenon, a Muscovite interested in shopping will see the shopping center, and a movie buff will see links to information about the film.
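The [Northern Lights] example can be pictured as re-ranking the same candidate documents with a per-user interest profile. The sketch below is an assumption-laden illustration, not the actual ranking formula: the documents, their base relevance values, the topic labels, and the `alpha` weight are all hypothetical.

```python
from typing import Dict, List, Tuple

# (title, base relevance, topic weights); all values are toy data.
Doc = Tuple[str, float, Dict[str, float]]

DOCS: List[Doc] = [
    ("Northern Lights: the natural phenomenon", 0.50, {"travel": 1.0}),
    ("Northern Lights shopping center", 0.45, {"shopping": 1.0}),
    ("Northern Lights (film)", 0.40, {"movies": 1.0}),
]


def rerank(docs: List[Doc], profile: Dict[str, float],
           alpha: float = 0.5) -> List[str]:
    """Order titles by base relevance plus alpha times the user's topic affinity."""
    def score(doc: Doc) -> float:
        _, base, topics = doc
        # Dot product between the user's profile and the document's topics.
        affinity = sum(profile.get(topic, 0.0) * weight
                       for topic, weight in topics.items())
        return base + alpha * affinity

    return [doc[0] for doc in sorted(docs, key=score, reverse=True)]


traveler = rerank(DOCS, {"travel": 0.9})
shopper = rerank(DOCS, {"shopping": 0.8})
movie_buff = rerank(DOCS, {"movies": 0.9})
```

In this toy example each user sees a different document first, while a user with an empty profile simply gets the documents in order of base relevance.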
Personalization makes it possible to improve the answers for 75-80% of each user's queries. We measured the effect of personalization on search quality in detail. For example, people click on a personalized first result 37% more often than on a non-personalized one. To achieve this, we ran experiments with more than 10 different ranking formulas and tuning mechanisms, and more than 50 million users saw experimental results during that time.
Features
Of course, if you wish, you can turn personalization off in the search settings:

According to our estimates, personalization as a whole saves each Yandex user 14% of their time by getting them to the answer they came for faster.