Big data in hotel management: to use or to ignore


    The long-awaited season of summer vacations is approaching, and many have already chosen the coveted destination whose promise gave them the strength to wade through months of overtime. The cherished “dream journey”, which will be so pleasant to remember on autumn and winter evenings, is already very close.

    When choosing a place to stay, many will probably use an online booking service. This article suggests looking “from the other side of the interface”, through the eyes of those who manage a hotel and set accommodation prices. More specifically, it considers the Analytics tools and how their data can be used to manage sales in a hotel. The examples are cases from a mini-hotel in Cambodia, which I have had the honor and pleasure to manage.

    Why Cambodia

    As a short introduction, I will try to explain why the hotel of my dreams ended up in Cambodia. The main reason is the business-friendly environment in this country. Today it is the only country in Asia in which a foreigner can legally register a business in his own name and, at the same time, stay in the country for an unlimited time on a business visa. The cost of ALL permits for a mini-hotel, for example, is about 400 US dollars per year (including a license from the Ministry of Commerce, a license from the local city hall, a tax patent, and an individual work permit). Preparing these documents presents no particular difficulties, and when state bodies inspect the business, having the permits is a necessary and sufficient condition for avoiding any kind of extortion.

    The additional perks are all the delights of living in Asia: affordable prices, a non-aggressive population, the sea and beautiful nature, a warm climate with a fairly mild monsoon period, year-round fresh fruit, vegetables and seafood, and a “simple life” that does not require heating or spending on winter or brand-name clothes and shoes, apartment renovation, expensive cars and other attributes of a “successful life”.

    There are also disadvantages: “fools and roads”, expensive electricity ($0.20 per kilowatt-hour), the almost complete absence of medical care, other infrastructure issues (problems with the police, fire services, the educational system, public utilities, etc.), and problems with garbage (although this is typical of many Asian countries and, during “garbage wars”, of European ones too).

    There are several articles on Habr that, in my opinion, objectively reflect the state of affairs and living conditions in Cambodia, so I will not develop this topic further.
    So: Cambodia, the resort town of Kep, the hotel Chateau Puss in Boots, 2019.

    Introductory remarks and limitations

    For sales, we currently use only two booking services, one of which is Airbnb. One can say a lot about the advantages and disadvantages of these and other services, but what matters here is that clients have come and keep coming to us from these services and not from others. Before Kep, my wife and I had a hotel in Sihanoukville and, even earlier, in Morjim, Goa, and there we saw the same picture across sales channels. At Airbnb, analytics is still in its infancy, so only the Analytics of the other service is considered here. And there we have only one main lever of sales management: the price of a room per night.

    Of course, other factors influence sales, for example, the rating based on guest reviews. Rating statistics are available in Analytics, and we will consider them below.

    Much depends on the tourist appeal of the place. Kep, for example, is a small village with a moderately developed resort infrastructure. For many, it is just a transit point on the border between Cambodia and Vietnam. However, the atmosphere of the colonial French riviera, the sea and islands, mountains and caves, pagodas and national parks do their job, and the steady flow of tourists in high season fills the local hotels quite reliably.

    Another important factor affecting sales is the hotel’s concept and distinctive features, which help clients make the right choice and intuitively “recognize” a place where they would be comfortable. This question is related to the goal-setting, mission and worldview of the business owner and is beyond the scope of this article.

    In addition, several important assumptions must be made to understand the limitations of the study:

    • the subject is a private mini-hotel (the definitions I found indicate that a mini-hotel can have up to 15 rooms) with no corporate procedures, where everything is simplified to the limit in order to reduce overhead costs; all operational activities are therefore concentrated in the hands of the owners, without any structural units; for example, we employ only a cleaner, and my wife and I do the rest ourselves, outsourcing only complex repair and construction work; if we need to leave for 1-2 days, we have an arrangement with a visiting administrator;
    • the structure of the room price, expenses and additional earning opportunities (bar and restaurant, rental of bicycles and motorbikes, sale of tickets and excursions, etc.) are not considered;
    • the general approach to hotel management is not considered; however, it is an interesting framework, about which I have written in a slightly different format; if the topic is of interest, I will also write a post on Habr about management in a mini-hotel.

    Analytics functionality

    Analytics was launched in 2016 as a tool to help hotel managers analyze booking and sales statistics. The system offers a Russian-language interface but, in my opinion, it is important to refer to the original English names so as not to distort the basic terminology. Analytics includes the following sections:

    • Analytics Dashboard aggregates data for a review of achieved indicators, including the number of nights booked by room category, room income (the total amount paid by guests) and the average daily rate (ADR), which is the room income divided by the number of nights paid; Analytics Dashboard also contains links to the main reports briefly discussed below;
    • Pace Report allows you to compare your sales volume on the platform with the same periods of the previous year, as well as with aggregated data on your competitors;
    • Sales Statistics provides a cut of sales for any period of the last year;
    • Booker Insights provides detailed information about hotel guests, including the country, the device used for booking, and the purpose of the trip;
    • Bookwindow Information shows how early customers book their rooms;
    • Cancelation Characteristics contains information on the percentage of canceled bookings;
    • Guest Review Score contains data on guest reviews and the ratings guests give the hotel on a 10-point scale;
    • Manage Competitive Set allows you to select up to ten hotels in your region in order to compare your own Key Performance Indicators (KPIs) with those of your closest competitors;
    • Genius Report shows the percentage of bookings made in accordance with the Genius program (frequent travelers discounts);
    • Ranking Dashboard demonstrates how effective hotel sales are when guests are looking for accommodation in a given region.

    For data analysis, date ranges of 7, 14, 30, 60, 90, or 365 days can be selected. There are also features for analyzing data by comparing:

    • own results with last year’s indicators;
    • own results with the indicators of a competitor group of up to ten hotels of your choice;
    • own results with the indicators of the market, which includes all the properties in the hotel’s location.

    Analytics big data examples

    This section does not claim to offer any generalization, especially since the picture may change from month to month. These are just examples of using the built-in Analytics tools.

    For example, in Booker Insights you can see statistics on the countries from which tourists book hotel rooms. The national characteristics of tourists are a separate topic that could be discussed at great length, which is why the country statistics are quite fascinating. Each country has its own preferences, and this affects the distribution of the hotel’s target audience, although unexpected statistical outliers do occur. For example, at the height of the tourist season we got the following picture. Our hotel is shown in the brighter color and the market in the paler one.

    Analytics Data: Distribution of Hotel Guests by Country

    Tourists from Cambodia and France together represent about 50% of the tourism market in Kep; at our hotel, however, they made up only 15% and 14%, respectively. This can be explained by the conservatism of Cambodian tourists, who like to stay in hotels managed by Cambodian owners. The low share of French tourists has a similar explanation: many of them speak little or no English. Russian tourists also like it when the hotel staff speak Russian, which explains why they make up more than 10% of our guests against 1.4% of the market. As for tourists from New Zealand (10% of bookings at our hotel against 0.6% of the market) and Switzerland (8.7% against 2.4%), the higher share can be explained by a good price-quality ratio, as tourists from these countries tend to avoid unnecessary costs.

    The detailed Booker Insights report also breaks down the average daily room rate, average length of stay and cancellation frequency by country. These data are important for predicting the behavior of tourists from each country. For example, guests from Cambodia cancel their reservations most often.
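    The comparison between the hotel’s guest mix and the market can be expressed as a simple index: a country’s share of hotel bookings divided by its share of the market. Values above 1 mean the hotel attracts that nationality more strongly than the market as a whole. The sketch below uses only the three pairs of percentages quoted in the text; Russia’s “more than 10%” is taken as exactly 10%, which is an assumption, and the name `share_index` is purely illustrative.

```python
def share_index(hotel_share_pct: float, market_share_pct: float) -> float:
    """Ratio of a country's share of hotel bookings to its share of the market."""
    if market_share_pct <= 0:
        raise ValueError("market share must be positive")
    return hotel_share_pct / market_share_pct


# (hotel %, market %) pairs quoted in the text.
# Assumption: Russia's "more than 10%" is treated as exactly 10%.
shares = {
    "Russia": (10.0, 1.4),
    "New Zealand": (10.0, 0.6),
    "Switzerland": (8.7, 2.4),
}

indices = {country: share_index(h, m) for country, (h, m) in shares.items()}

# Print countries from the most over-represented to the least.
for country, idx in sorted(indices.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{country}: {idx:.1f}x the market share")
```

    The same index cannot be computed for Cambodia and France individually, because the text gives only their combined market share.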

    The following diagram from the Bookwindow Information section shows the distribution of the booking window, i.e. how many days before arrival guests book their rooms.

    Analytics Data: Distribution of the Reservation Window

    A long booking window provides more options for setting the daily room rate. In addition, room rates should take local and global holidays into account so that holiday prices can be set in advance. The statistics show that few guests book a room more than 30 days ahead. Moreover, about 70% of all reservations were made immediately before arrival. This is not very good, as it increases the risk of rooms remaining unsold and requires more careful adjustment of the daily room rate for each actual date.

    An important indicator for any hotel business is the percentage of cancelled reservations, for which data are available in the Cancelation Characteristics section (see the diagram below). Here too, the upper bar in each band, in the brighter color, is our hotel, and the paler one is the market.

    Analytics Data: Distribution of Cancellation Frequency

    Last-minute cancellations usually cause stress, as they sharply reduce the booking window and increase the risk that the cancelled room will not be resold. Unfortunately, in the analyzed example 34% of bookings were cancelled, while the cancellation rate for the market in question is 28%. Most cancellations come from reservations made more than a month in advance. It is difficult to develop an effective strategy for reducing the number of cancellations. People often change their plans, or they may find another hotel’s offer more attractive. We try to get in touch with the guest as soon as we receive the reservation, but this strategy is not always successful either.

    The hotel business is highly dependent on reputation, which the platform determines from guest reviews. Ratings range from 2.5 to 10 for the following hotel features: cleanliness, comfort, location, amenities, staff and value for money. The Guest Review Score section contains the details of each review and also provides aggregated rating values. The histogram shows the number of reviews received in each month, and the line graph shows the resulting rating for each month. The results of our hotel (the brighter graph and histogram) are compared with the average results of the ten closest competitors.

    Analytics Data: Hotel Rating Based on Guest Reviews

    The platform supports the Genius loyalty program. Registered Genius users receive discounts of 10% or more on reservations. To attract Genius travelers, the hotel must support the program. The problem for the hotel is that the discount comes entirely out of its own income: the price for guests with Genius status is only 90% (sometimes even 85%) of the listed daily room price. On the other hand, many users participate in the Genius program, and they appreciate it when a hotel supports it. Thus, participating in the program can increase the hotel’s overall income even though the daily room rate is reduced. It is important to remember that the daily room rate should allow for the 10% or 15% reduction for Genius guests. Genius guests make up more than 50% of all our customers, which demonstrates the effectiveness of participating in the program. This information is available in the Genius Report section.

    Analytics Data: Ratio of Genius Bookings

    Integrated data on the hotel’s performance are available in the Ranking Dashboard section, which presents a number of indicators that, according to the platform, affect the hotel’s income.
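    The Genius arithmetic can be sketched as follows. If a Genius guest pays only 90% (or 85%) of the listed price, the listed price has to be raised so that the discounted price still reaches the amount the hotel wants to receive. The target price of 20 and both function names are hypothetical; only the discount levels come from the text.

```python
def genius_price(list_price: float, discount: float) -> float:
    """Price actually paid by a Genius guest for a given listed price."""
    return list_price * (1.0 - discount)


def list_price_for_target(target_price: float, discount: float) -> float:
    """Listed price needed so that the discounted price equals target_price."""
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must be in [0, 1)")
    return target_price / (1.0 - discount)


target = 20.0  # hypothetical amount the hotel wants to receive per night
for discount in (0.10, 0.15):
    listed = list_price_for_target(target, discount)
    paid = genius_price(listed, discount)
    print(f"discount {discount:.0%}: list {listed:.2f}, Genius guest pays {paid:.2f}")
```

    Guests without Genius status would then pay the raised listed price, so the result still has to be checked against competitors’ prices.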

    Data are given in comparison between our hotel and average market results:

    • Conversion is the percentage of hotel page views converted into reservations (the ratio of the number of bookings to the number of views of the hotel page on the platform);
    • Average Daily Rate (average price per night) is the combined income from the rooms sold divided by the number of rooms sold;
    • Cancelations show the percentage of all reservations that have been canceled;
    • Review Score (guest rating) is calculated using the ratings given by the hotel guests;
    • Property Page Score (hotel page rating) shows how full the hotel page is in terms of information and photos;
    • The Reply Score takes into account how quickly the hotel responds to guests.
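    As a minimal sketch, the three ratio-type indicators in this list (conversion, average daily rate and cancellations) can be computed from raw counts as shown below. The class, its field names and the sample figures are hypothetical; they are not taken from an Analytics export.

```python
from dataclasses import dataclass


@dataclass
class PeriodStats:
    page_views: int        # views of the hotel page
    bookings: int          # reservations made, including those later cancelled
    cancellations: int     # reservations that were cancelled
    room_revenue: float    # total amount paid by guests for rooms
    room_nights_sold: int  # room-nights actually paid for

    def conversion(self) -> float:
        """Share of page views that turned into a reservation."""
        return self.bookings / self.page_views

    def adr(self) -> float:
        """Average daily rate: room revenue per room-night sold."""
        return self.room_revenue / self.room_nights_sold

    def cancellation_rate(self) -> float:
        """Share of reservations that were cancelled."""
        return self.cancellations / self.bookings


# Hypothetical figures for one month.
stats = PeriodStats(page_views=2000, bookings=50, cancellations=17,
                    room_revenue=1320.0, room_nights_sold=80)

print(f"conversion {stats.conversion():.2%}, ADR {stats.adr():.2f}, "
      f"cancellations {stats.cancellation_rate():.0%}")
```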

    Given these six factors that may affect the hotel’s income, it makes sense to consider the corresponding dependencies. However, some indicators (cancellations, guest rating, hotel page rating, reply rating) can affect income only indirectly, so a direct relationship between the hotel’s income and these indirect factors cannot be established here. From the point of view of big data analysis, the promising indicators are the conversion percentage and the daily room price. In the next section, we will consider hypotheses about how income depends on conversion and daily price.

    Hypotheses for managing room price based on big data

    So, using Analytics, we have access to big data reflecting the state of sales at the hotel. I would like to understand how the use of these data can help in determining the optimal price per room.

    Economic theory suggests that there are supply and demand curves and, therefore, an optimal price that extracts maximum profit from the sale of a product or service. Errors of the first kind (setting the price above the optimum) lead customers to refuse to buy, while errors of the second kind (setting it below the optimum) lead to lower profits, with no guarantee that the number of sales will increase.

    Thus, we put forward Hypothesis 1 (G1): there is a relationship between the room sales volume S and the price of a room per night C.

    Formally, for each calendar day and each room, this can be described by the following minimax criterion:

    $$S = \max(C) \land f = 1, \quad C \in \{C_{\min}, \dots, C_{\max}\}, \quad f \in \{0, 1\}$$

    where S is the turnover from the sale of the room, numerically equal to the room price C (the price belongs to a certain range), and f is a binary indicator of the sale: f = 0 if the room is not sold and f = 1 if it is sold.

    If there are several rooms of the same type, not all of them may be sold on a given day; in addition, the price $C_i$ for the same room may change during the sales window, so the criterion becomes:

    $$S = \max\left(\sum_i C_i\right) \land F = \sum_i f_i, \quad f_i \in \{0, 1\}, \quad F \in \{0, \dots, N\}$$

    where $C_i$ is the price of one room (the price of rooms in the same category can vary), $f_i$ is the binary indicator of the sale of room i, and F is the number of rooms sold in a category containing N rooms in total.

    If the hotel has several categories of rooms, the above criterion is applied to each of them, and the total turnover of the hotel is the sum of the sales over all categories; alternatively, everything can be reduced to a general formula by adding one more index.
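    For illustration, the quantities in this criterion can be computed for one day as below. The sketch reads the turnover of a category as the sum of the prices of the rooms actually sold, $\sum_i C_i f_i$; that reading is an assumption, and the categories, prices and sale flags are made up.

```python
def category_turnover(prices, sold_flags):
    """Return (S, F): turnover and number of rooms sold for one category on one day."""
    if len(prices) != len(sold_flags):
        raise ValueError("one sale flag per room is required")
    # Assumed reading of the criterion: only sold rooms (f_i = 1) contribute.
    turnover = sum(c * f for c, f in zip(prices, sold_flags))
    rooms_sold = sum(sold_flags)
    return turnover, rooms_sold


def hotel_turnover(categories):
    """Total turnover of the hotel: the sum over all room categories."""
    return sum(category_turnover(p, f)[0] for p, f in categories.values())


# Hypothetical day: prices C_i and sale indicators f_i per category.
categories = {
    "standard": ([15.0, 15.0, 18.0, 20.0], [1, 0, 1, 1]),
    "family": ([25.0, 28.0], [0, 1]),
}

S, F = category_turnover(*categories["standard"])
total = hotel_turnover(categories)
print(f"standard: S = {S}, F = {F}; hotel total = {total}")
```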

    Let us analyze the mutual dependence of sales and room rates (hypothesis G1). I will not give detailed economic data and will show only the general result. To analyze the relationship between the two data series, we use the Pearson correlation coefficient, calculated as the ratio of the covariance to the product of the standard deviations:

    $$r(S, C) = \frac{\operatorname{cov}(S, C)}{\sigma_S \, \sigma_C}$$

    The calculation is done in MS Excel, in which the hotel’s monthly accounting is kept, so the correlation coefficient is conveniently calculated month by month. It is recommended that the number of observations be at least 10 times the number of factors, and the number of days (observations) in a month satisfies this recommendation. We launched the hotel just before the New Year of 2019, so as of June 2019 we have accumulated statistics for only 5 months (150 days of observation). The results vary from month to month, and the values of the correlation coefficient differ significantly, from 0.51 in March to 0.93 in February. So in some months hypothesis G1 is not confirmed: no strong relationship between the room price and sales is evident. Nevertheless, for those months in which r > 0.75 we can speak of a dependence of one random variable on the other, i.e. hypothesis G1 is confirmed. It is best to analyze the entire data set: when the number of observations exceeds the number of factors by a factor of a hundred or more, we approach the law of large numbers. Over the five months, hypothesis G1 was also confirmed (r = 0.80). Below are the values of the correlation coefficient for each of the past months of the current year, as well as the overall value for 5 months. Let me remind you that we are investigating the dependence of the daily sales volume on the average room price for a given day.

    Values of the correlation coefficient r(S, C)

    Obviously, the sales volume depends on the number of rooms sold. However, no strong correlation was found between the number of rooms sold per day and the average daily room price (r = 0.51 for the entire data sample).
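    The correlation calculations described above were done in MS Excel; an equivalent calculation, by month and for the whole sample, can be sketched in plain Python. The daily records below are invented purely to make the example runnable and do not reproduce the hotel’s data; `pearson_r` is an illustrative helper name.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    # The 1/n factors of covariance and variances cancel out.
    return cov / sqrt(var_x * var_y)


# Hypothetical daily records: (month, average room price C, daily sales S).
records = [
    ("2019-01", 12.0, 24.0), ("2019-01", 14.0, 42.0),
    ("2019-01", 16.0, 32.0), ("2019-01", 18.0, 54.0),
    ("2019-02", 18.0, 72.0), ("2019-02", 20.0, 100.0),
    ("2019-02", 22.0, 110.0), ("2019-02", 25.0, 150.0),
]

# Group prices and sales by month.
by_month = {}
for month, price, sales in records:
    prices, totals = by_month.setdefault(month, ([], []))
    prices.append(price)
    totals.append(sales)

monthly_r = {m: pearson_r(p, s) for m, (p, s) in by_month.items()}
overall_r = pearson_r([r[1] for r in records], [r[2] for r in records])

for month, r in monthly_r.items():
    print(f"{month}: r = {r:.2f}")
print(f"all data: r = {overall_r:.2f}")
```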

    MS Excel can also build a scatter chart, add a trend line with its linear regression equation, and report the coefficient of determination R² for the regression. A regression can be considered reliable if R² > 0.8. For the complete sample, a reliable regression could not be obtained, since R² = 0.64. It is possible, however, for those months in which r > 0.9. For example, for February we obtained R² = 0.86. February has the highest sales volume of the year thanks to the Chinese New Year, which lasts more than a week and ensures full occupancy of the hotel at high holiday prices.
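    The Excel trend line can be reproduced with ordinary least squares. For a regression with a single factor, R² equals the square of the Pearson coefficient, which is consistent with the figures quoted above (0.80² = 0.64 and 0.93² ≈ 0.86). The prices and sales in this sketch are made up, and `linear_fit` is an illustrative name.

```python
def linear_fit(xs, ys):
    """Ordinary least squares for y = a*x + b; returns (a, b, r_squared)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("regression is undefined for a constant series")
    a = sxy / sxx
    b = mean_y - a * mean_x
    # R^2 = 1 - (residual sum of squares) / (total sum of squares)
    ss_res = sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))
    return a, b, 1.0 - ss_res / syy


# Hypothetical data: average daily price and daily sales.
prices = [12.0, 14.0, 16.0, 18.0, 20.0]
sales = [20.0, 45.0, 40.0, 70.0, 80.0]

slope, intercept, r2 = linear_fit(prices, sales)
print(f"S = {slope:.2f} * C + ({intercept:.2f}), R^2 = {r2:.3f}")
```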

    Linear regression does not make sense in terms of optimization, since it says that the higher the price, the higher the profit. However, the price should be within a reasonable range comparable to the price of the nearest competitors.

    From the point of view of sales management, the most critical region is the one in which daily sales were below 30 c.u. (conventional units), and especially the days when sales were 0 c.u. However, our statistics do not answer the question of which daily price is optimal: at prices in the range of 12 to 20 c.u., sales ranged from 0 to 6 rooms per day, and this did not depend on other calendar factors (for example, the day of the week or approaching holidays).

    Another assumption is that the more tourists search for accommodation in your area and the more tourists view your hotel page, the more reservations you will receive. Analytics provides such data. For example, the chart below shows the number of searches for the city of Kep (Cambodia) by day. The conversion is 132 / 79,377 ≈ 0.17%, that is, for every 10,000 tourists looking for accommodation we get about 17 bookings.

    Analytics Data: Number of Search Requests by Region
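    The conversion figure can be checked with a short calculation that uses only the two numbers from the chart (132 bookings and 79,377 searches); the ratio works out to roughly 0.17%.

```python
bookings = 132
searches = 79_377

conversion = bookings / searches   # share of searches that ended in a booking
per_10k = conversion * 10_000      # bookings per 10,000 searches

print(f"conversion = {conversion:.2%}, about {per_10k:.1f} bookings per 10,000 searches")
```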

    We formulate Hypothesis 2 (H2): there is a relationship between the room sales volume S and the number of search queries per day R.

    However, the correlation coefficients obtained, both for the full 5-month sample and month by month, did not exceed 0.5, which indicates that there is no strong relationship between the two random variables. This applies both to the number of search queries in the region and to the number of views of the hotel page on the platform.


    This article discussed the capabilities of the Analytics tools designed to analyze big data related to hotel room sales. Based on the available information, two hypotheses were put forward.

    Hypothesis 1 (G1) was confirmed: there is a relationship between the room sales volume S and the price of a room per night C.

    To test the hypothesis, the Pearson correlation coefficient r(S, C) was determined for the data for the first five months of 2019 (r = 0.80) and month by month (maximum value r = 0.93 in February), which indicates a relationship between the two data series. The regression is linear (the higher the price, the higher the profit), which makes it impossible to optimize the daily room rate. Nevertheless, the price per day should be in a reasonable range comparable to the prices of the nearest competitors. It was also not possible to determine the optimal room price numerically from the scatter chart.

    Hypothesis 2 (H2) was not confirmed: there is a relationship between the room sales volume S and the number of searches per day R.

    Despite the availability of big data, at the moment it is not possible to formulate a sales management strategy based solely on statistical indicators. Perhaps these patterns depend on forces that statistics do not capture. There is a well-known wave theory of business and, from my point of view, it makes sense. If we plot a simple dependence of sales on the calendar date, we will clearly see alternating peaks and dips. Thus, it is necessary to “catch the wave”, relying, among other things, on experience and intuition.

    This article does not claim to be the ultimate truth; it is just my experience, which I wanted to share.

    All that remains is to wish readers maximum enjoyment from using booking services, and unforgettable trips!
