Search Technology at Airbnb
Three weeks ago, we wrote about how users and homeowners can more effectively use the search on our site. Today we want to talk about the algorithms on which our search engine is based.
This post is based on a presentation by Maxim Charkov:
Search is about matching. Essentially, we are trying to match what our users ask for with what is available on the market.
First, a few words about myself and my colleagues. I work on the search team and joined the company two years ago. Before that I spent several years at Google, working on a bit of everything from search features to web browsers. Of course, nothing I am going to present here would have been possible without the people on our team. Search at Airbnb is an end-to-end team: our engineers work on search and the booking flow, including infrastructure, user interface and so on. The team also covers product development, design, user research, and data processing and analysis.
First I want to introduce the search problem at Airbnb and how we help our guests find the best listings. Then I will talk about the problem of booking conversion: you will see that Airbnb does not always have enough listings to satisfy every request, which makes it an interesting problem. I will also say a few words about evaluating changes. When working on new search products, it is very important to have evaluation tools and metrics that ensure every change has a positive effect on the user.
Airbnb is a global marketplace of listings. Today our database holds more than 600,000 listings in 34,000 cities across 192 countries. On our site you can find accommodation for every taste, from ordinary apartments to tree houses and even private islands.
Here you can see the density of our listings in the San Francisco area. Each dot is a listing.
Here is North America. Europe is also a dense market, and the Asia-Pacific region is rapidly gaining momentum. With this huge variety of locations and listing types, we need solutions that work on a global scale.
Let's look at the search behavior of people who have successfully booked on Airbnb. Travelers spend a lot of time searching. On average, at least three days pass between a guest's first search and the completed reservation, and during this time the guest looks at approximately 20 listings. Finding a place to stay is hard for many reasons. Choosing a place is a serious decision, and people often want to consult family or friends before making the final choice.
Unlike in web search, which I worked on before, you almost never see the simple pattern where the user enters a query, picks the first result in the list and makes a reservation. Even if the first result is objectively the best option, people still want to consider a few others before making a choice.
Now let's see how users interact with search. They usually start from the page where they enter the search parameters: location, travel dates and number of guests. Then they see the most suitable listings.
This is what the user sees, but behind the scenes a sophisticated ranking system combines thousands of signals to determine that the listings shown are exactly what our guest wants.
Our task as a search team is to help users achieve their goals, that is, to make finding a place to stay as simple as possible. Let's start with finding. We constantly measure how well the search results match what users ask for. The most effective measure is the share of users who ran a search and went on to complete a booking on Airbnb. The only way to increase this share is to improve the quality, relevance and personalization of search results for each user.
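The search-to-booking share described above can be sketched in a few lines. This is only an illustration: the function name and the idea of passing two collections of user IDs are assumptions, not a description of Airbnb's actual pipeline.

```python
def search_to_book_conversion(searchers, bookers):
    """Share of users who searched and then completed a booking.

    searchers: iterable of user IDs that ran at least one search.
    bookers:   iterable of user IDs that completed a booking.
    Both inputs are hypothetical; real logs would need deduplication
    and a shared time window.
    """
    searchers = set(searchers)
    if not searchers:
        return 0.0
    # Only bookers who also searched count towards the numerator.
    return len(searchers & set(bookers)) / len(searchers)
```

For example, with four searchers of whom two booked, the function returns 0.5; a booker who never searched is ignored.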
First we have to define quality: in essence, it is how attractive a listing is for a given query. We have a model that computes a quality score for each search result, and we use this score in ranking. Broadly, the model uses two groups of signals. One group is listing attributes: photos, number of reviews, ratings, attractiveness of the location, price and so on. The other is behavioral signals: how users interact with the listings on our site.
Behavioral signals are implicit user feedback on the search results, which makes them very useful to us. Let's take a look at this sample search page. We can assume that a typical user scans the results from top to bottom; the black arrows indicate the order in which the user considers them. If the user thinks a listing is worth attention, they click on it for more details; otherwise they move straight on to the next one. So we can already try to build a ranking factor by counting the clicks each result received, but a small correction is needed. Let's look at the results, specifically at number 7.
Can we say that the user also saw the results below it and found them less suitable? Not necessarily. The guest may already have found the best result among the first 7, or may have decided to abandon the search or change the query. To avoid unfairly penalizing results the user never looked at, we determine the position of the last click on each page. In this case, it is result number 7. When building the ranking factor, we take into account only the last clicked result and all the results above it.
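A minimal sketch of this correction, under stated assumptions: each result page is represented as a ranked list of listing IDs plus the set of IDs that were clicked, and pages without any click are skipped because they say nothing about which results were examined. The data layout and the function name are hypothetical.

```python
from collections import defaultdict


def click_through_rates(result_pages):
    """Per-listing click-through rate with a last-click cutoff.

    result_pages: iterable of (ranked_listing_ids, clicked_listing_ids).
    A listing counts as "seen" only if it is ranked at or above the
    lowest clicked result on that page.
    """
    impressions = defaultdict(int)
    clicks = defaultdict(int)
    for ranked, clicked in result_pages:
        clicked = set(clicked)
        clicked_ranks = [i for i, lid in enumerate(ranked) if lid in clicked]
        if not clicked_ranks:
            continue  # assumption: no click means no evidence of examination
        last = max(clicked_ranks)
        for lid in ranked[:last + 1]:
            impressions[lid] += 1
            if lid in clicked:
                clicks[lid] += 1
    return {lid: clicks[lid] / impressions[lid] for lid in impressions}
```

Results below the last click receive neither an impression nor a click, so they are not penalized for being ignored.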
Of course, clicks alone are not enough, and we do not limit behavioral signals to them. For example, we also try to measure how long it took the guest to take any action after first looking at the search results page.
A behavioral approach to quality can be very effective. There are some listings that received very low ratings, yet their score according to the behavioral signals described above was extremely high. First, here are some low-quality results, like this cave in Berlin, which is probably not a real listing at all. Sometimes the type of listing is simply unexpected: a typical Airbnb user does not expect to find a parking space. Here is another example: a garage. And the car!
Now let's look at the other end. You can see what typically sets these listings apart: attractive photos, a competitive price and a high rating. Of course, user behavior alone does not always give us a correct assessment, so we combine behavioral signals with more explicit factors, such as reviews, to compute the final quality score.
Oddly enough, we pay more attention to quality than to the search results themselves. For business travelers, location is crucial. We always ask our users to fill out a short survey once they have finished using the service, and one of the questions asks them to rate the location. We can use these answers to train an algorithm and then estimate the location quality of any listing by looking at the ratings of nearby listings. Here, on the left, is a location quality score for San Francisco based on k-nearest neighbors (kNN). On the right are the listing scores in San Francisco shown on another map.
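The kNN idea can be illustrated as follows. This is a sketch, not the production model: the unweighted average, the choice of k, the haversine distance and the (lat, lon) tuples are all assumptions made for the example.

```python
import math


def haversine_km(a, b):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def knn_location_score(point, rated_listings, k=3):
    """Average location rating of the k rated listings nearest to `point`.

    rated_listings: list of ((lat, lon), location_rating) pairs taken
    from guest surveys (hypothetical input format).
    """
    if not rated_listings:
        raise ValueError("no rated listings available")
    nearest = sorted(rated_listings,
                     key=lambda item: haversine_km(point, item[0]))[:k]
    return sum(rating for _, rating in nearest) / len(nearest)
```

A distance-weighted average would be a natural refinement, so that a rating from next door counts for more than one from several blocks away.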
When considering location, quality is not the only thing that matters: the location must also be relevant to the query. Here we see listings for “Santa Cruz, CA”.
Obviously, all the listings shown are in Santa Cruz, but what if the user wanted to stay somewhere in Santa Cruz County? The second listing here is from Airbnb and has 500 views. But it is located in Aptos, which is nearby but is still not Santa Cruz, and few people would think of searching for that particular place. Of course, we could hard-code a rule like “if the query is Santa Cruz, then Aptos is also suitable.” But that would not solve the problem, if only because we operate all over the world. Besides, we really do not want to make choices for our users. On the contrary, we want to learn from their actions what is relevant for each query. Queries can also be much more complicated than city names: the user can search for a neighborhood, village, suburb, postal address, country or state.
Since we want to learn about relevance from our users, we must understand what exactly people consider relevant for them. We start by grouping the user action logs, and here another problem arises: an individual user can come to the site several times with different goals. For example, a guest visits the site, searches for Pacifica and makes a reservation. The next day, the same guest returns, searches for accommodation in Los Angeles and books in Santa Monica. We try to separate such actions into distinct sessions.
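One simple way to separate such actions is to cut a user's log wherever there is a long pause. The sketch below assumes a 30-minute inactivity threshold and a list of (timestamp, action) pairs; both are illustrative choices, and the talk does not say how sessions are actually delimited.

```python
from datetime import datetime, timedelta


def split_sessions(events, max_gap=timedelta(minutes=30)):
    """Split one user's events into sessions at gaps longer than max_gap.

    events: list of (timestamp, action) pairs for a single user.
    """
    sessions = []
    for ts, action in sorted(events, key=lambda e: e[0]):
        # Continue the current session if the previous event is recent enough.
        if sessions and ts - sessions[-1][-1][0] <= max_gap:
            sessions[-1].append((ts, action))
        else:
            sessions.append([(ts, action)])
    return sessions


events = [
    (datetime(2014, 1, 1, 10, 0), "search: Pacifica"),
    (datetime(2014, 1, 1, 10, 10), "book"),
    (datetime(2014, 1, 2, 9, 0), "search: Los Angeles"),
    (datetime(2014, 1, 2, 9, 20), "book: Santa Monica"),
]
sessions = split_sessions(events)
```

With these made-up timestamps the Pacifica search and the Los Angeles search end up in two separate sessions, so the Santa Monica booking is attributed only to the Los Angeles query.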
Another important step is canonicalization. The data has to be cleaned, in particular to correct errors and spelling variations. For example, “SF” means San Francisco, but it has to be normalized to “San Francisco, CA, USA” in the query. We also fixed a misspelling of Germany and added the Japanese spelling of the query “Buenos Aires” to the database. Once we have session data, we start computing signals. One of them is the conditional probability of a booking, over a given period of time, given a particular query. In effect, it reflects how many of the people who searched for something ended up booking in a specific locality.
For example, if you analyze reservations in Aptos, it turns out that people found these listings by searching for “Santa Cruz”. This indicates a strong connection between the two places. Other signals include the number of listings in the location and the distance between the booked place and the place named in the query.
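The canonicalization step and the conditional booking probability described above might look roughly like this. The alias table, the function names and the (query, booked city) input format are hypothetical, and the counts in the usage note are made up for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical alias table; a real one would be far larger and multilingual.
ALIASES = {"sf": "San Francisco, CA, USA"}


def canonicalize(query):
    """Normalize whitespace and case, then map known aliases to one form."""
    cleaned = " ".join(query.split())
    return ALIASES.get(cleaned.lower(), cleaned)


def booking_probabilities(sessions):
    """Estimate P(booked city | query) from (query, booked_city) pairs."""
    counts = defaultdict(Counter)
    for query, booked_city in sessions:
        counts[canonicalize(query)][booked_city] += 1
    return {
        query: {city: n / sum(cities.values()) for city, n in cities.items()}
        for query, cities in counts.items()
    }
```

If three sessions that searched for “Santa Cruz” booked in Santa Cruz and one booked in Aptos, the estimate for Aptos given “Santa Cruz” is 0.25, which is the kind of link between the two places that the text describes.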
This is real data for Pacifica, California. You can see some interesting features. For example, most people who search for accommodation in Pacifica end up booking in San Francisco. This may seem strange, but Airbnb is still not very popular in Pacifica: a search there returns only about 20 listings, while San Francisco offers a huge selection. Even if the user wants to stay in Pacifica, in the end they will often book in San Francisco. We can correct for this by taking the size of the cities into account.
Look at the second table. A large share of the users who ended up staying in El Granada had originally searched for Pacifica. The last chart shows the combined score. Not only do we find that Pacifica is relevant to Pacifica, we also get plenty of alternative places such as El Granada. You can see them all on the map: it shows the cities near Pacifica that match the query well, places that people searching for accommodation here might also consider.
In addition, we try to personalize the results for each guest. For this we use the social graph. The fact that a friend of yours looked at the search results and then chose a specific listing makes that listing more relevant to you. We also try to surface results where the guest and the host are most likely to be connected in real life. For example, you may have a mutual friend with the host, or the host may have gone to the same university as you.
Our task is to make searching for and booking accommodation as easy as possible for our users. We have discussed how we help them find the best place. But how can we be sure that they will actually complete the booking? Airbnb is similar to other travel sites.
Here is an illustration of the booking flow. First you find an attractive listing. Next, you contact the host. If the host agrees to accept you, you book. There are shortcuts in this flow: for example, there is an instant booking option that the host can enable. In most cases, however, you still need confirmation from the host, who may decline your request or simply ignore it. It is really unpleasant to spend three days looking for a suitable listing and then simply be ignored.
We track what happens from the moment a guest contacts the host until the host confirms the guest. Essentially, Contact to Accept is the ratio of guests whose stay was confirmed by the host to guests who tried to contact a host.
We have come quite far in recent years.
We did some research to find out why guests get rejected. Again, we have two sets of signals: one logistical, the other emotional. The most common logistical reason is availability. For example, the host did not update the calendar in time or simply did not mark availability for certain days. Or the host cannot accommodate the guest's request: the length of stay, or the time required before or after check-in. Some hosts set a minimum stay of three nights, for example, or allow check-in only on weekends. And of course there is always an emotional component: each host has preferences that determine whether to accept you or not.
We do a lot to improve this metric. For example, we try to connect each host with guests whom the host is most likely to accept. A lot of effort also goes into optimization: balancing attractiveness against bookability, that is, how often reservations are actually made, since a less attractive listing can actually be the better result. And, of course, the user interface should make it easy to interact with the site.
This is our attempt to deeply personalize the product for our hosts. Several points matter here. First, the fact that a host accepts 50% of requests on average does not mean that every request is equally likely to be accepted or rejected. Hosts have their own preferences, and a guest may be a poor fit for personal reasons and be rejected without explanation. Different hosts have different preferences. One example is a host of ours in Miami, who prefers long stays and rejects all requests for just a few days. He will not accept a 2-night stay if it prevents him from accepting other guests for a longer period.
Another important factor in the decision is whether a specific request fits into the host's calendar. Suppose this is my calendar. I have a reservation from the 21st to the 24th, and another one later. Between them I have three free days, from February 25 to February 27. Suppose I receive a request for a one-day stay on February 26. If I accept it, I will be left with gaps in the schedule and will have to find two more guests for the two remaining free days. For this reason alone it would be entirely reasonable to reject the request, although another host might prefer to have free time between bookings and accept it.
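The calendar example can be turned into a small signal: how many free nights would be stranded on each side of a request if the host accepted it. The function below is a sketch with assumed conventions (half-open date intervals, a hypothetical function name), and the year is arbitrary.

```python
from datetime import date


def leftover_gaps(free_start, free_end, checkin, checkout):
    """Free nights left before and after a request inside a free window.

    The free window covers the nights [free_start, free_end) and the
    request covers the nights [checkin, checkout); all four are dates.
    """
    if not (free_start <= checkin < checkout <= free_end):
        raise ValueError("request does not fit into the free window")
    return (checkin - free_start).days, (free_end - checkout).days


# Free nights: February 25, 26 and 27. Request: one night on February 26.
gaps = leftover_gaps(date(2014, 2, 25), date(2014, 2, 28),
                     date(2014, 2, 26), date(2014, 2, 27))
```

Here `gaps` is `(1, 1)`: one stranded night on each side, which matches the two isolated free days in the example. A request covering all three nights would give `(0, 0)`.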
At Airbnb, we use information about past host behavior, both rejected and accepted requests, to model how each host responds. We learn the preferences of each host and then apply this model at search time to compute the probability that a request will be accepted, taking into account everything we know about the trip, the guest and the current status of the host. Some hosts do not have clear preferences.
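As a toy illustration of a per-host acceptance model, the sketch below estimates the acceptance probability from the host's past decisions on requests of similar length, smoothed towards a prior. This is a deliberately simplified stand-in: the single feature (short or long stay), the 3-night boundary and the smoothing constants are assumptions, whereas the model described in the talk uses much more information about the trip, the guest and the host.

```python
def stay_bucket(nights):
    """Coarse feature: short stays versus long stays (assumed boundary)."""
    return "short" if nights <= 3 else "long"


def acceptance_probability(history, nights, prior=0.5, prior_weight=2.0):
    """Smoothed acceptance rate of one host for requests of similar length.

    history: list of (nights_requested, accepted) pairs for this host.
    With no similar requests in the history the result equals `prior`,
    which covers hosts without clear preferences or without data.
    """
    similar = [accepted for n, accepted in history
               if stay_bucket(n) == stay_bucket(nights)]
    return (sum(similar) + prior * prior_weight) / (len(similar) + prior_weight)


# A host like the Miami example: rejects short stays, accepts long ones.
history = [(2, False), (1, False), (14, True), (10, True), (30, True)]
```

For this history a 2-night request scores 0.25 and a 14-night request scores 0.8, so the same host can rank lower or higher depending on the trip.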
Here is another example of how a change in the UI can have a rather big impact on this metric. Several years ago we had a feature that let users sort search results by certain attributes: distance, price and so on. The option was quite popular; in fact, about 10% of searches were sorted by price. The problem was that price was then the only attribute taken into account. The cheapest listing could look very attractive to the guest, who did not realize that the host would most likely reject the request. So we removed this feature and, in a controlled experiment, saw conversion increase.
At Airbnb, we test and evaluate every little change we make to the search interface, the booking flow or the algorithms. It is important to have a set of tools that quickly tell us how experimental features behave and perform. We have therefore built a number of tools, for example for offline testing, which we use before launching a new algorithm. We also have tools for online evaluation, and sometimes we run experiments at the market level.
S x S
Side by side is one of our most important tools. It lets engineers get quick answers to questions like “What percentage of queries will this affect?”, “How will my ranking change shift the distribution of prices on the page?” or “Give me some examples of queries that return different results in this ranking experiment.” An engineer can also get answers about traffic, for example “How do my changes affect top queries compared with less popular ones?” Every night we process such requests from developers against samples of real queries. The result is a list of queries with different probability distributions. After that you can make comparisons: you simply specify what you want to compare, such as ranking experiments, ranking options and sample queries.
We compare result sets using a function based on Kendall's tau rank correlation. It is a very simple method that counts the number of pairs of results whose relative order changes. We made some changes to the algorithm. For example, we modified it to focus on the top results, since the user sees only the top of the list on the site. We already have some statistics: for example, a change in ranking affects more than 30% of queries.
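The pair-counting idea behind Kendall's tau can be sketched as follows. The `top_k` truncation is only one possible reading of "focus on the top results", and restricting the comparison to items present in both truncated lists is a simplification made for this example; the actual modification used at Airbnb is not described in the talk.

```python
from itertools import combinations


def swapped_pairs(ranking_a, ranking_b, top_k=None):
    """Count pairs whose relative order differs between two rankings.

    Only items that appear in the first top_k results of both rankings
    are compared. Returns (number_of_swapped_pairs, number_of_pairs).
    """
    a = ranking_a[:top_k]
    pos_b = {item: i for i, item in enumerate(ranking_b[:top_k])}
    common = [item for item in a if item in pos_b]  # kept in ranking_a order
    swaps = sum(1 for x, y in combinations(common, 2) if pos_b[x] > pos_b[y])
    pairs = len(common) * (len(common) - 1) // 2
    return swaps, pairs


def kendall_tau(ranking_a, ranking_b, top_k=None):
    """Kendall's tau: 1.0 for identical order, -1.0 for reversed order."""
    swaps, pairs = swapped_pairs(ranking_a, ranking_b, top_k)
    if pairs == 0:
        return 1.0  # assumption: nothing to compare counts as "unchanged"
    return 1.0 - 2.0 * swaps / pairs
```

Comparing `[1, 2, 3, 4]` with `[1, 3, 2, 4]` gives one swapped pair out of six, while a fully reversed list gives a tau of -1.0. A query can then be flagged as "affected" by a ranking change whenever its tau falls below 1.0.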
An engineer can also dig deeper into the data. If you want to know why the results are ranked a certain way in your control group, you can view the values of all signals, all the scores produced during ranking, and so on. Not all data can be displayed, since it may contain confidential user information, but this information is very important for iterating quickly.
The basic rule here is: always conduct experiments on real users. Thus, we can implement changes in the user interface or modify the ranking algorithm, and then compare the conversion in different groups of users.
I also want to talk about one problem that makes experimentation on the site very difficult.
Let's take an example with two listings in one city. Both have approximately the same price and many reviews. The first listing ranks higher because its host accepts a higher percentage of guests: 90%, against only 50% for the second. Now suppose we run an experiment on host preferences. It turns out that the second host prefers couples and simply rejects any request that is not from a couple. We design the following experiment: the control group sees the usual ranking, but in the experimental group, if the number of guests in the request is two, the second listing moves to the top, since its expected acceptance rate for such guests is 100%. The listing starts receiving bookings, the host's calendar starts to fill up, and the listing is shown less often to the control group.
There are also examples that work in the opposite direction. Running an experiment can weaken your control over the situation, and some effects may be underestimated. The problem is that you can isolate guests from each other in the experiment, but they still interact in the marketplace and, more importantly, affect each other's options. For example, if one guest books a listing for certain dates, another user can no longer book it.
This is a big problem for us. It is not a problem for a product like Google Search, because web results do not disappear when people click on them. Everything is much more complicated when you deal with physical places. One solution is to separate markets from each other as much as possible: if you run an experiment in Boston, for example, it should not affect the market in Chicago. We can then form experimental and control groups at the level of these markets, so that the experimental units are no longer individual users within a local market but entire geographic areas. We run the experiment for all users in one area and then compare the results with the data for a control region.
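Assigning whole markets rather than individual users to an experiment arm can be sketched with a deterministic hash. The function name, the experiment label and the 50/50 split are assumptions for illustration; the talk does not describe how markets are actually assigned.

```python
import hashlib


def market_arm(market, experiment, treatment_share=0.5):
    """Deterministically assign an entire market to one experiment arm.

    Every user searching in the same market gets the same arm, so
    bookings made under the experiment cannot remove inventory from a
    control group inside that market.
    """
    digest = hashlib.sha256(f"{experiment}:{market}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2 ** 64  # uniform in [0, 1)
    return "treatment" if bucket < treatment_share else "control"
```

Including the experiment label in the hash means that a market which lands in the treatment arm of one experiment is not automatically in the treatment arm of the next one.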
The problem here is that it is very hard for Airbnb to find markets that can be compared with each other. First, travel is highly seasonal, and high season falls at different times of the year in different markets. The performance of each region therefore varies with factors that are difficult to account for. In addition, Airbnb continues to grow very quickly, and some markets have not yet stabilized; as they grow, they change the picture completely.
I have talked about how we help our guests find attractive listings, how we deal with the availability of listings, and how we evaluate the changes we make. To finish, I will say a little about some of the problems we have been working on recently. If you go to our website, you will see a fairly clear message: “Give us the dates and the place of your trip, and you will get accommodation.” When you come to Airbnb, you are expected to know exactly where you want to go. But it should not be that way. We want to learn to return good results for many more kinds of queries. For example: “What if I don't even know where I want to go, but I know exactly what I want to do? Can you put together an amazing trip for me based on what you know about me?” To help users with such requests, we need to learn to “think” more broadly and look “deeper”.
For example, on the left we have a very specific request: the location is Venice, with a view of the canal. Or: “I live in San Francisco and want to spend no more than two hours on the road. Find me the coolest place for the weekend where I can have the whole house to myself.”
You will discover a host of amazing travel opportunities, even if you are not trying to find them. Just open the app and see tips and personalized offers.
On Airbnb, each listing is unique. Helping our guests find their way around the site is one of our top priorities. So we do not want to overwhelm you with the thousands of listings available. Instead, we want to show only a few that suit you perfectly. And, of course, we try to make the service easy to use, so that booking takes no effort.
Thank you for your attention!