Data analysis in sports: the interaction of scientists, clubs and federations. A lecture at Yandex

    We hold events on more than just the topics we work on ourselves. In February, we brought together specialists in applying machine learning to sports. It is striking how many processes connect these two fields, data analysis and sports, and how many unsolved problems arise where they meet. Below is a talk by Dmitry Dagaev, Deputy Vice-Rector of the Higher School of Economics.

    - Today I will try to briefly talk about the problems that are already being solved by analyzing data in sports. We will see that the interaction of agents is the key factor that allows us to solve these problems.

    In general, data circulation in the sports industry is organized as follows. First, there are sports federations and leagues that set the rules of the game, under which sports competitions and the sports economy operate.

    The problem is that these leagues, competitions and this economy form a very complex structure. Federations would of course like every decision they make to be optimal, but that requires accurate analysis, which is not always possible. Often, to get an accurate answer that maximizes a particular quantity, you need to run a large computer simulation or some kind of regression analysis. Of course, a federation cannot always do this on its own.

    Once the rules are defined, clubs and individual athletes have to live by them. The next arrow also stands for a certain range of problems. Say a match or a chess game has been played. In theory, a huge amount of data can be extracted from it. Until recently, extracting that data ran into very serious problems: how do you actually do it? The development of modern computing has made it possible, for example, to beat the strongest players in strategic games such as chess or Go, and to find an essentially exact solution for games such as the simplest versions of checkers or poker.

    Extracting data in tasks that require image analysis is rather expensive. That is why, over the past 15–20 years, companies have appeared on the market that extract data from video footage of football and hockey matches and then successfully sell it to clubs, federations or scientists.

    Another problem is the need to analyze the data. The academic community comes to the rescue here. Very often, research departments exist within sports clubs themselves, within the companies that extract data, or at universities as separate research centers. They give recommendations based on the analysis of the data. This is how the system is arranged in general.

    Why analyze the data?

    There are two broad goals. The first is to increase a team's or an athlete's probability of winning a particular match. On this slide is Jens Lehmann. Perhaps the most famous example dates from 2006, when the cheat sheet he pulled out of his sock helped him save two penalties against the Argentine national team and advance to the next round of the World Cup.

    The second broad goal is to improve financial results. This covers, for example, increasing revenue from the sale of broadcast rights for matches, which requires an accurate analysis of consumer preferences and pricing. Here the data allows you to find the best answers.

    In general, clubs may be interested in analyzing their opponent’s strategy in order to find the optimal response.

    In the early 2000s, Ignacio Palacios-Huerta's famous article “Professionals Play Minimax” appeared. It showed that players who take penalties behave very much as the theory predicts. This allows clubs to analyze data and find optimal responses to a player's expected strategy. And if a player deviates from the equilibrium strategy and tries something else, a club can successfully exploit that too.
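
    To illustrate what an equilibrium strategy means here, below is a minimal Python sketch of a two-by-two penalty game: the kicker and the goalkeeper each choose left or right, and each mixes so that the other side is indifferent between its two options. The function name and the scoring probabilities are hypothetical, chosen only for illustration; they are not taken from the article.

```python
def solve_penalty_game(p_ll, p_lr, p_rl, p_rr):
    """Mixed-strategy equilibrium of a 2x2 zero-sum penalty game.

    p_xy is the probability of a goal when the kicker shoots to side x
    and the goalkeeper dives to side y (l = left, r = right).
    Returns (kicker_left, keeper_left, scoring_probability).
    """
    denom = p_ll - p_lr - p_rl + p_rr
    if denom == 0:
        raise ValueError("degenerate game: no unique mixed equilibrium")
    # Kicker mixes so that the goalkeeper is indifferent between diving left and right.
    kicker_left = (p_rr - p_rl) / denom
    # Goalkeeper mixes so that the kicker is indifferent between shooting left and right.
    keeper_left = (p_rr - p_lr) / denom
    if not (0 <= kicker_left <= 1 and 0 <= keeper_left <= 1):
        raise ValueError("no interior mixed equilibrium for these probabilities")
    scoring_probability = keeper_left * p_ll + (1 - keeper_left) * p_lr
    return kicker_left, keeper_left, scoring_probability


# Hypothetical scoring probabilities, for illustration only.
kicker_left, keeper_left, value = solve_penalty_game(0.60, 0.95, 0.90, 0.70)
print(f"kicker shoots left with probability {kicker_left:.3f}")
print(f"keeper dives left with probability {keeper_left:.3f}")
print(f"equilibrium scoring probability {value:.3f}")
```

    With these made-up numbers the kicker should aim left a little over a third of the time. A player whose observed frequencies differ noticeably from such a mix becomes predictable, and that is the kind of deviation an analyst would look for in the data.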

    Transfers and the search for undervalued athletes are also a very important task, one in which many breakthroughs are expected. Each club can formulate its own request for the kind of player it needs. There are a large number of athletes on the market, each has some transfer value, and for each you can collect a large amount of statistics on recent performances. The problem is that looking at aggregate statistics is not enough: you need to match the data with the needs of a particular club. So each club has to look for the football or hockey players who are best suited to its own task.

    Another task is assessing an athlete's effectiveness: how to structure the contract, how to find the optimal share of bonuses to pay, and how to make sure the athlete puts maximum effort into every match. This problem can be solved by evaluating the effectiveness of each individual set of actions the athlete performs during a match. This is also an important task that clubs are now interested in.

    From the point of view of federations or leagues, the tasks are slightly different. An important one is competition design: how to make the league as competitive as possible. The more competitive a league is, the more attractive it is to the audience.

    How should the league be regulated, and how should optimal rules be set? What will limits on foreign players lead to? It is not enough to simply say that a limit will help our team play successfully. I would like to try to simulate the situation and give a more or less accurate answer based on predictions and simulations of this complex system, which can easily rival a country's economy in complexity.

    Another problem that has been actively addressed lately is detecting match-fixing. The illegal betting market is growing at a frantic pace. If the betting market as a whole is already valued at hundreds of billions of dollars, then the illegal betting market, even if it accounts for only some share of it, is still very large.

    Then there is the fight against doping, which is especially relevant today. How can you predict that an athlete is doping without being able to detect a particular substance? Data analysis helps here too.

    Recently, a large number of investors have come to this market. The cost of each decision is rising.

    Here are the size of the entire football market and the fees for the most expensive athletes in history. You can see that roughly every 20 years the most expensive transfer grew about tenfold. Even if you adjust for inflation and for changes in the real value of the pound, these growth rates are still significantly higher than those of the world economy and some other industries. There is an expectation that this trend will not stop. Rumor has it that some players' contracts already contain release clauses approaching a billion dollars. Let's see what happens next.
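
    To put the "tenfold every 20 years" figure on a yearly scale, here is a small Python sketch that converts it into a constant compound growth rate. The function name is hypothetical, and the result is a nominal rate before any inflation adjustment; it comes out at roughly 12% per year.

```python
def annual_growth_rate(multiple, years):
    """Constant yearly growth rate that compounds to `multiple` over `years`."""
    return multiple ** (1 / years) - 1


# A tenfold increase over 20 years, as read off the slide.
rate = annual_growth_rate(10, 20)
print(f"Implied nominal growth: {rate:.1%} per year")
```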

    The demand for accurate solutions that help avoid mistakes, including mistakes measured in real money, will grow. So we have work to do.

    But it is not that simple. There are several significant problems. The main one is that the importance of making accurate decisions, based on calculation rather than intuition, is underestimated.

    According to some experts, we are about 10 years behind Europe and the USA in adopting sports analytics.

    Any Western club now has a staff of analysts who not only collect data but also analyze it in detail.

    This underestimation creates a demand problem, which impedes the development of this market in Russia.

    This is not the only problem. There is also a supply problem: a lack of qualified personnel. Unfortunately, there are no specialized programs that teach sports analytics or data analysis in sports. There are general data analysis programs, and lately many new, interesting and even free ones have appeared, but the market is not saturated yet, and this limits its growth. I hope the best is still ahead.

    The third problem, which is now being solved most successfully, is the high cost of data collection. To get complete, high-quality data, you have to attend every league match and film it with professional video equipment. Over the last 10–15 years there have been significant shifts and breakthroughs here.

    I’ll tell you about specific tasks.

    The first example is the illegal betting market.

    It all started around 2006, when Italian police discovered through wiretapping that a substantial share of matches in the Italian championship were in fact fixed. Five clubs were heavily fined for taking part in this disgrace, several were relegated to lower leagues, and some were docked points in the national championship. Juventus lost the title. It then became clear that some response was needed. It is not always possible to catch the conspirators red-handed, so suspicious matches have to be found from indirect data.

    In 2009, UEFA and Sportradar launched the UEFA Betting Fraud Detection System. It monitors all bets placed with all the major bookmakers in the world on matches held under the auspices of UEFA. For each match, whenever a bet is detected whose size differs from what is expected for such a match, or whenever observations deviate from the regular trend, the system generates a signal. These signals are collected over a long period, and if it turns out that such signals come in very often for matches of the same team, reasonable suspicions arise.
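
    The two-step logic described here, flagging unusual matches and then accumulating flags per team, can be sketched in Python as follows. This is only an illustration under simple assumptions and not the actual UEFA or Sportradar system: the field names, the z-score rule, the threshold and the sample data are all hypothetical.

```python
from collections import Counter


def flag_matches(matches, threshold=3.0):
    """Return matches whose betting volume deviates strongly from the expected level.

    Each match is a dict with hypothetical keys: home, away, observed,
    expected and std (std is assumed to be positive).
    """
    flagged = []
    for match in matches:
        z_score = (match["observed"] - match["expected"]) / match["std"]
        if abs(z_score) > threshold:
            flagged.append(match)
    return flagged


def signals_per_team(flagged):
    """Count how many flagged matches each team took part in."""
    counts = Counter()
    for match in flagged:
        counts[match["home"]] += 1
        counts[match["away"]] += 1
    return counts


# Made-up data: teams A, B, C and betting volumes in arbitrary units.
sample_matches = [
    {"home": "A", "away": "B", "observed": 100, "expected": 100, "std": 10},
    {"home": "A", "away": "C", "observed": 150, "expected": 100, "std": 10},
    {"home": "C", "away": "B", "observed": 95, "expected": 100, "std": 10},
    {"home": "B", "away": "A", "observed": 140, "expected": 100, "std": 10},
]
counts = signals_per_team(flag_matches(sample_matches))
print(counts.most_common())
```

    A single flag means little on its own. What draws attention is a team that keeps reappearing at the top of such a count over a long period.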

    Here is the most high-profile case of recent years. Since 2010, the Albanian champion, the football club Skënderbeu, has been flagged with more than 50 such strange signals. After the relevant UEFA body analyzed these signals, Skënderbeu was first banned from European competitions for one season, and now a ban of more than 10 years is being discussed as the investigation reveals new data. Yesterday there was news that UEFA employees had begun to receive threats in connection with the investigation. Apparently something very suspicious really is going on. The UEFA president had to publicly defend the employees who analyze the data and find these anomalies.

    This is not about bribery having been proven in the legal sense. These are only signals showing that something is very likely wrong.

    What else might sports federations be interested in? The Royal Belgian Football Association had not changed the format of its championship for a very long time, almost 30 years. Eventually the clubs felt they wanted something different. Some wanted a stronger league; some wanted the league to expand. So a request arose to change the tournament format.

    The officials wanted to minimize the number of unimportant matches. In their view, this was exactly what held back the league's growth potential and the sale of rights. They commissioned researchers to compare several tournament formats in order to minimize the number of the least interesting matches, those where both teams' fates are already decided or where neither side has an incentive to play to win.

    Goossens, Beliën and Spieksma analyzed several formats proposed by the federation, and the format adopted in practice was the one they predicted would minimize the number of uninteresting matches. This is another example of interaction between the academic community and a sports federation.
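
    To give a feel for how formats can be compared by simulation, here is a deliberately simplified Python sketch. It is not the method used in the study: it looks only at the title race, treats all teams as equally strong, plays fixtures in random order, and calls a match "dead" when neither side could even match the current leader's points by winning all of its remaining games. All names and parameters are hypothetical.

```python
import random
from itertools import permutations


def count_dead_matches(n_teams, cycles, rng):
    """Simulate one season and count matches in which neither side can still catch the leader."""
    # Every ordered pair (home, away) once per cycle, i.e. a double round-robin per cycle.
    fixtures = list(permutations(range(n_teams), 2)) * cycles
    rng.shuffle(fixtures)  # assumption: random fixture order, not a real calendar
    points = [0] * n_teams
    remaining = [2 * (n_teams - 1) * cycles] * n_teams
    dead = 0
    for home, away in fixtures:
        leader_points = max(points)
        # A team is "alive" if winning all its remaining games would at least
        # match the current leader's points.
        alive = [points[t] + 3 * remaining[t] >= leader_points for t in (home, away)]
        if not any(alive):
            dead += 1
        # Assumption: equally strong teams, three equally likely outcomes.
        outcome = rng.choice(("home", "draw", "away"))
        if outcome == "home":
            points[home] += 3
        elif outcome == "away":
            points[away] += 3
        else:
            points[home] += 1
            points[away] += 1
        remaining[home] -= 1
        remaining[away] -= 1
    return dead


def dead_match_share(n_teams, cycles, seasons=200, seed=0):
    """Average share of dead matches over many simulated seasons."""
    rng = random.Random(seed)
    matches_per_season = n_teams * (n_teams - 1) * cycles
    dead = sum(count_dead_matches(n_teams, cycles, rng) for _ in range(seasons))
    return dead / (matches_per_season * seasons)


# Compare two hypothetical formats: one or two double round-robin cycles.
for cycles in (1, 2):
    print(cycles, round(dead_match_share(n_teams=16, cycles=cycles), 3))
```

    A realistic comparison would also account for relegation, European places, playoff stages and differences in team strength, but the overall scheme is the same: simulate each candidate format many times and compare the share of matches with nothing at stake.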

    Another task is how WADA fights doping. Of course, chemical analyses of samples are carried out, but in 2009 WADA began to introduce the athlete biological passport. Every time doping control officers come to an athlete and take a urine or blood sample, the results are recorded in a special biological passport. Since 2009, blood values have been recorded (hematological monitoring); since 2014, urine values as well (steroid monitoring). The time series in this passport are analyzed on the basis that the dynamics differ between athletes who dope and those who do not. Without chemically detecting traces of doping, it is possible to predict who is suspicious. These athletes then receive closer attention and face additional requirements.
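
    The idea of judging an athlete against his or her own history rather than against a population norm can be sketched in Python as below. This is not WADA's actual model: the sketch swaps in a plain z-score against the athlete's earlier readings, and the function name, thresholds and values are hypothetical.

```python
from statistics import mean, stdev


def flag_readings(history, min_baseline=4, threshold=3.0):
    """Return indices of readings that deviate strongly from the athlete's own earlier values."""
    flagged = []
    for i in range(min_baseline, len(history)):
        baseline = history[:i]  # all readings taken before reading i
        spread = stdev(baseline)
        if spread == 0:
            continue  # no variation in the baseline, z-score undefined
        z_score = (history[i] - mean(baseline)) / spread
        if abs(z_score) > threshold:
            flagged.append(i)
    return flagged


# Made-up series of one blood marker for one athlete.
readings = [14.8, 15.1, 14.9, 15.0, 15.2, 17.4, 15.0]
print(flag_readings(readings))
```

    A flag here is not proof of doping. As in the talk, it only marks an athlete whose profile deserves closer attention and additional testing.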

    Let's go down to the club level. The most high-profile story is the football club Midtjylland. The club’s sports director said, “We have to go from making decisions with our hearts to making decisions with our brains.”

    The club effectively has a whole department of analysts based in London, and every strategic or managerial decision the club makes is based on data analysis. Without a substantial budget, the club nevertheless finds the players who are most productive and best suited to its strategic tasks.

    In the 2014–2015 season the club became champion of Denmark for the first time in its history, in the current season it is again in first place, and the entire sports analytics world is closely watching which decision-making innovations the club introduces to improve its results.

    Now about the academic side of this market. Since I represent a university, this is the most interesting part for me.

    A specialized journal, the Journal of Sports Analytics, has now appeared. The volume of problems that are interesting to the academic community in their own right, even in isolation from the market, is very large.

    www.journals.elsevier.com/sport-management-review

    There are also about seven journals that publish research on data analysis in sports: journals in computer science, economics and management. Recently, the rankings of these journals have started to rise significantly.

    Scientific conferences. About 15 years ago, large conferences began to appear, sponsored by major corporations and by universities with a worldwide reputation. I want to advertise the conference on the economics of football organized by HSE and NES, which will be held in Moscow from July 9 to 11 as part of the World Cup. We will be glad to see you in the audience, and if you have interesting work that could be presented there, write to me and we will try to include it in the conference program.

    I hope that our communication will help to identify common points of contact. We hope that the analysis of data in sports will develop and correspond to the growth rate of the market that we have already seen.
