Russian AI Cup 2014: winner strategy

    Continuing the good tradition of “revealing the secrets” of the winners of the annual Russian AI Cup contest from Mail.Ru, I present to everyone who is interested in this article. I will not describe the mechanics of the game world and other rules, if suddenly there are those who are interested in this article but do not know the rules, they will be able to find them on the official page of the championship.





    This year, unlike the past, there was a task “with physics”, and not a step-by-step task, which determined my increased interest in the event. In general, I want to immediately say right away that by physics I do not mean using phys2d or some other physical engine, but rather a certain set of properties of the game world. The first important property is continuity in time and space: small displacements of objects or a delay in the control action also slightly change the result. The second property is the sufficient complexity of the laws of motion of objects, so that direct modeling of the world with absolute accuracy is impossible (at least in a reasonable amount of time). The impossibility of accurate modeling forces us to do what is being done in real physical science - to determine which phenomena are the most important in the case under consideration and build an approximate model of the system. Regarding the task of the competition, an important thing in modeling was a fairly accurate prediction of the free movement of the puck and the hockey player, as well as the rebound of the puck from the sides (not in the goal area). It would also be useful to have a prediction of a rebound from the sides of the hockey player and the puck from the goalkeeper. I want to immediately draw attention to the words "fairly accurate" - to achieve an absolute, bitwise coincidence of the results of the prediction with the fact that the game engine produces no sense. And those people who did this were wasting their time. For example, in the final, I did not take into account the change in acceleration and turning speed during a fall in endurance during the motion prediction, only its value at the current moment was taken into account. important in modeling was a fairly accurate prediction of the free movement of the puck and the hockey player, as well as the rebound of the puck from the sides (not in the goal area). It would also be useful to have a prediction of a rebound from the sides of the hockey player and the puck from the goalkeeper. I want to immediately draw attention to the words "fairly accurate" - to achieve an absolute, bitwise coincidence of the results of the prediction with the fact that the game engine produces no sense. And those people who did this were wasting their time. For example, in the final, I did not take into account the change in acceleration and turning speed during a fall in endurance during the motion prediction, only its value at the current moment was taken into account. important in modeling was a fairly accurate prediction of the free movement of the puck and the hockey player, as well as the rebound of the puck from the sides (not in the goal area). It would also be useful to have a prediction of a rebound from the sides of the hockey player and the puck from the goalkeeper. I want to immediately draw attention to the words "fairly accurate" - to achieve an absolute, bitwise coincidence of the results of the prediction with the fact that the game engine produces no sense. And those people who did this were wasting their time. For example, in the final, I did not take into account the change in acceleration and turning speed during a fall in endurance during the motion prediction, only its value at the current moment was taken into account. It would also be useful to have a prediction of a rebound from the sides of the hockey player and the puck from the goalkeeper. I want to immediately draw attention to the words "fairly accurate" - to achieve an absolute, bitwise coincidence of the results of the prediction with the fact that the game engine produces no sense. And those people who did this were wasting their time. For example, in the final, I did not take into account the change in acceleration and turning speed during a fall in endurance during the motion prediction, only its value at the current moment was taken into account. It would also be useful to have a prediction of a rebound from the sides of the hockey player and the puck from the goalkeeper. I want to immediately draw attention to the words "fairly accurate" - to achieve an absolute, bitwise coincidence of the results of the prediction with the fact that the game engine produces no sense. And those people who did this were wasting their time. For example, in the final, I did not take into account the change in acceleration and turning speed during a fall in endurance during the motion prediction, only its value at the current moment was taken into account.

    I also want to say about the accidents, which this year was noticeably more than in the past. Randomness adds complexity to accurate prediction (making it fundamentally impossible), but this is a bad, cheating way. Real “deep” physics allows us to build more and more accurate models (at the cost of computational complexity or more advanced algorithms), while randomness does not give this depth. And it affects the continuity of processes in a negative way. From a practical point of view, the additional scatter of match results only complicates testing. This year's championship will be the first in which, in general, I did not select the optimal coefficients, because even long half-hour matches did not give a clear understanding of the strength of the strategies. But enough with digressions, I pass to the chronological description of the course of the competition.

    Beta test


    I spent most of the beta week on game physics reverse. Moreover, for me, a physicist by education, it is much easier to get the required formulas using experiments, rather than decompiling local-runner. All single contacts (shots), except for the collision of two hockey players (which I, in principle, did not deal with) and the frontal puck of the puck against the goalkeeper (in which someone participated in hidden rotation), I learned to predict with precision floats. There were enough strange and tricky moments in the device of the world, for example, the device of the bar (section 25 units long with a tenfold reduced rebound), or the push-out wall for hockey players in the goal (and without zeroing speed), or the position of the puck held by the hockey player (with speed compensation without taking into account friction ),

    The second important thing that I managed to do during the beta test is calculating the range of angles at which the goal hits. I made this prediction from purely geometric considerations, without any simulation of the flight of the puck. As the first angle, I took the direction to the far corner of the goal, after which I counted the flight time of the puck in this direction. Then I found the goalkeeper position for this time and counted the angle at which the puck will fly by touching the goalkeeper in this position. Then he recounted the flight time in a new direction and repeated the process several times. Here it is worth saying that an annoying bug crept into this calculation, which led to an overestimated angle estimate. And although I, watching games, repeatedly noticed strange blows and suspected a bug, before the finale I did not find it. However,

    Key algorithm


    After finishing the beta test, I completed experiments with physics and took up, in fact, strategy. Unlike tanks, where I, first of all, made the defense, this year I started from an attack. As a result, I didn’t make a normal defense, she remained a “set of crutches”, although she worked quite well. The key to the attacking part of the strategy (and then to all other parts of it) was the algorithm for reaching the target at a certain angle, which I will dwell on in more detail.

    We first consider the method of fastest reaching a certain point (excluding walls and other hockey players) with zero initial speed. A nonzero initial velocity leads only to a fixed shift of the region of reachable points by each time step, without changing the configuration of the region itself. Quite cunning geometric considerations (I will not give the considerations themselves due to their complexity and lack of mathematical rigor) lead to the following set of “optimal” trajectories.
    • Turn left / right with acceleration forward / backward by an angle less than 90 °, then continue driving without turning without changing the acceleration.
    • Turn left / right with acceleration backward / forward by an angle less than 90 °, then reverse acceleration with continued rotation for the next 90 °, then continue to move with acceleration, but without rotation.

    Due to the lack of rigor, the real optimal trajectories may slightly differ from these, but this is not important (see the discussion above on sufficient accuracy). Thus, the trajectories are parameterized with a flag such as acceleration in the main part of the trajectory (forward / backward) and the direction of rotation (left / right), as well as the end time of the turn. If this time is longer than the turn time by 90 °, then at the beginning of the motion path there is a section with the reverse direction of acceleration lasting [turn end time] - [turn time 90 °].

    Similar geometric considerations show that if you need not only to reach a point, but also to exit at a certain angle, then the "optimal" trajectory at a sufficiently large distance to the target looks like two "optimal" trajectories connected by their straight sections for the case of a simple point. That is, a turn with a possible reverse at the beginning, then movement without a turn, then a final turn with a possible reverse at the end. The trajectory parameters will be three flags - two directions of rotation and the type of acceleration in the middle, as well as two durations of rotation (not counting the total duration of the path) - at the beginning and end. According to these parameters, optimization is also necessary.

    Since there are at least two parameters, then a simple enumeration (as was sufficient in the same tanks) will be too slow (quadratic complexity) and more efficient multivariate optimization algorithms must be involved. I, as a “connoisseur” of genetic algorithms, used them, although this class of algorithms is not particularly effective and if there was time, then it would be worth experimenting with others, for example, simulating annealing. More specifically, the algorithm was as follows. To begin with, we search with a certain step among the trajectories of the fastest point achievement (without a second turn) and on these trajectories we calculate points (also with a certain step) for scoring a goal with a pass and hit with the maximum swing. Points per goal were considered as the probability of falling into the desired range of angles, taking into account the specified normal distribution at impact, times the exponential decay coefficient over time. Next, we select several trajectories with the maximum number of points, add “mutated” versions of these trajectories (a nonzero final reversal appears), re-evaluate the probability of clogging, and not with any step, but next to the best strike moment found earlier, and again we select some of the best. We repeat all this several times with the ever decreasing power of "mutations."

    The probability of hitting the goal without a final turnaround is almost always close to 0, however, the normal distribution has sufficiently large “tails”, so at the first stage of the search there were impact points, and in the process of genetic optimization a method of reaching these points was chosen. Moreover, a significant role was played by the bug with overestimating the sector, mentioned above, and at the same time as editing it, we had to add artificial “tails” to the function of estimating the probability of a hit before the final.

    Another important characteristic of the algorithm was the optimization, which was extended over time, which I only thought about in tanks, but this year I implemented it from the very beginning. In addition to the trajectories from the enumeration, the optimal trajectory from the previous step is added to the general pool with the correspondingly adjusted times. This adds the following property to the algorithm: it’s not scary that the current step has not found an optimal path; in the next step, the optimization will continue from where it ended. As a result, this algorithm became the cornerstone of the whole strategy, the framework on which everything else was built.

    First version


    The first version of the strategy, which was ready a week before the first round, in addition to the main algorithm, included the following elements (in chronological order).
    • Prediction of possible positions of opponents, based on the code for the initial enumeration of trajectories. The possible positions of the opponents at every step, as well as the approximate center of the club sector, were recorded on a square grid. Samples were made on this map, with fairly complicated filtering (greatly simplified later), and when optimizing their paths, fines were given for being close to enemies.
    • The initial search was done not only for the puck owner, but also for his other hockey players. The achieved positions were evaluated taking into account the time the enemy reached this and its neighboring (on the map) positions. Several positions with the highest rating became possible goals of the pass. When calculating the hit points for the owner of the puck, not only shots towards the goal, but also towards the found goals were moved. If, as a result of optimization, a pass option was chosen, then the recipient began to move towards the target position.
    • In case the puck is in free flight, we accurately calculate its position at each moment in time. We carry out full optimization for all our hockey players, only in the function of the score we consider the selection of the puck, or knocking it out if there is a threat to the goal. This part of the code also gives the correct selection of passes.
    • In addition to the selection and knocking out for a free puck, we also consider a blow to someone else's goal, exactly the same as the case of owning the puck. Now it becomes possible to score goals with one hit, although only with a rare set of circumstances.
    • Following the point in front of the goalkeeper was used as a crutch for defense if the puck was not at the enemy’s side and to the cleverly calculated point between the goal and the puck owner otherwise. Following was implemented in the same cycle of the initial search, and without taking into account the speed, so the hockey player constantly flew past and, as a result, dangled back and forth near the goal. This sequence algorithm was assigned to hockey players unoccupied in other algorithms, therefore there was usually one active striker with a puck or going to intercept a free-flying and one defender at the goal. If the puck is on the opponent, then both hockey players tried to move to a secret point.


    First round


    Testing in a system of competitions with real opponents showed that, firstly, the strategy was too slow and I had to quickly correct the parameters of the algorithms, and secondly, the algorithm for intercepting enemies with the puck was no good. As a result, for the second version of the strategy, I made it so that only one hockey player goes to the secret point, and the second hangs around the point near the goal as usual. In the 3rd version, instead of following to the secret point, I used a simple algorithm of the QuickStartGuy type, albeit with a lead, and also added a check of the danger of intercepting the puck during a pass or shot on goal in the shot assessment function. However, a serious bug crept into the new code and had to roll back. In the 5th version, in addition to fixing bugs, I made a serious processing of the code in order to get rid of heavy trigonometry inside the loops.

    The 6th version (and its corrected version is the 7th, with the stickiness removed on the backswing) is a new attempt to rethink the methods of protection. In conditions when the existing algorithm is no good, and the universal one is not invented, I implemented a new crutch, although of better quality. Now, my standard punch optimization algorithm was chasing the enemy with the puck, however, without taking into account the influence of opponents, and the puck trajectory found was processed by a slightly modified algorithm for intercepting a free-flying puck during optimization for its hockey players. A nice bonus was that the strategy learned to knock the puck right into the enemy's net, the scoring algorithm from the summer proved itself here.

    The 8th version, playing in the first part of the first round, differed minimally from the 7th, but one of the changes turned out to be decisive. By removing the time damping when calculating the probability of the puck intercepting by the enemy, I made sure that my hockey players stopped “blunting” in front of the opponent’s goal, expecting something unclear. In the second part of the first round, I added an assessment of passes with a rebound from the horizontal sides. For this, the target point was reflected from the desired side, taking into account the rebound coefficient. The 8th and 9th versions of the strategy gave me 6th place in the first round and, in general, showed that I was on the right track.

    Second round


    In the 10th version, I was engaged in adapting the strategy to changing the rules in the second round and, at the same time, in the final (though without taking into account changes in stamina during the prediction). But in the 11th version was added what ultimately brought me victory in the competition.

    The first fundamental addition is one-touch scoring. Moreover, the implementation of all this is quite simple, there are no multidimensional optimizations for several hockey players at once. I have already mentioned that for my own hockey players who did not own the puck, I drove off the first stage of the trajectory optimization algorithm - the initial enumeration. Now I’ve gone further: for “non-core” hockey players I start full optimization, as if they are the puck owners, and for the optimal trajectory I put the puck’s position at the moment of hitting the goal list. Passing to such a goal position is valued just like a simple goal (and even more in future versions), so one-touch goals from a rare set of circumstances have become mainstream.

    The second important improvement of the 11th version is the algorithm of following to the target point. Now, the calculation of the assessment of the trajectory does not include the current position, but the limiting position of the hockey player after a complete stop, plus I added orientation accounting and my defenders stopped meeting the puck backwards. Since there were three in the second round of hockey players, in addition to the “main” and the defender, an additional player was formed, to whom I prescribed following to a point just above or below the middle of the field (depending on the position of the enemies), from which it’s convenient to exit to the enemy goal.

    The final


    In the 12th version, I quickly implemented the substitution logic for the final: QuickStartGuy-shaped follow-up to the “out of game” substitution area and a slightly modified algorithm for following to the substitution point during the game. And then I sat down to implement various small ideas that were formed in my process of the championship. I will list them in chronological order.
    • He added a fine for transferring the puck to a place from which the enemy could knock it right at my goal.
    • He added the logic of the game in overtime without goalkeepers: he corrected the assessment of the hit sector and slightly pushed the defender deep into the goal. When updating the code for calculating angles, I noticed the aforementioned bug with an overestimated hit sector and fixed it.
    • I implemented normal processing of the swing state: earlier I just performed a swing for a memorized number of ticks, now I am exhaustively selecting the optimal moment to stop the swing.
    • Implemented accounting for the probability of interception by each individual hockey player of the opponent, and not all at once, as before.
    • In the process of reaching the target position, he began to strive to keep his orientation not to the final direction, but to a specific point. This solved the problem with the hockey player sliding for a long time to the target point at which there were holes in the defense, and also removed the sticks when the defender could not decide which way he should turn.
    • He made a more accurate than a rough reachability map, an assessment of the sectors of the interception of the puck in the first couple of dozen steps.
    • Added one-touch protection from goals - in the case of a free-flying puck, running the optimization algorithm behind the enemy and updating the predicted position of the puck. This algorithm added a negative side effect - it would seem that a completely safe puck trajectory was considered dangerous and my hockey players began to try to punch the puck, instead of just picking it up.

    The change in the strength of the game from each of these improvements separately was not noticeable, however, together they gave a double gap in goals of the 13th version from the 12th in long matches. All these improvements in the strategy of the 13th version played in the first part of the final, however I (and not only me) noticed a fairly large number of goals in my own net because of a side effect. And since my strategy was on par with my main competitor, I decided that even such a small number of goals can determine the winner. Therefore, in the 14th version, which played in the second part of the final, I added a penalty for trying to knock the puck into my own net. Either this change, or changes in the competitor’s strategy (I am more inclined towards the second version), allowed my strategy to go ahead, and I became the two-time winner of the Russian AI Cup championship.

    Afterword


    The separation of my strategy from rivals this year was much more serious than it was in tanks. I confidently held the first place in the sandbox until it closed, which in 2012 I could not do. Then, as a result of random fluctuations, I did not even get into the winners of the sandbox. True, I still did not get the extra prize, but not because of the weakness of the strategy, but thanks to the new rules.

    You can watch fights in my profile .
    The source code for the strategy is available here .
    My old article for 2012 (tanks).

    Also popular now: