History of participation (and victory) in the Russian AI Cup 2018 - CodeBall

It was Wednesday, there was a usual boring meeting at work. The designer scratched his ear, and the tester buried his phone. A car started up outside the window, and I received a letter on the phone - the Russian AI Cup 2018 started . No one suspected anything around, and at that moment I knew exactly what I would do in the next month and a half.

Hello everyone, my name is Andrey Tokarev and I would like to share the experience of participating in the Russian AI Cup 2018 .



What is it?


Russian AI Cup - an annual competition in artificial intelligence, held since 2012. Here you need to write an algorithm that controls someone or something, and these someone or something compete with each other. This year it was necessary to control the robots playing football.

I already had some experience playing in such competitions. In particular, I participated in the Russian AI Cup 2016 (without a prize) and Mini AI Cup 2018 (2nd place).

Go


First of all, I created my own game world classes, objects, I took a 2-D and 3-D vector from past competitions and filled it all into the Git repository. Of course, you can use objects from the language pack, but there are no vectors there, they cannot be modified, and generally it is more convenient to use your classes. That is why I did not begin to rewrite the arena class, it suited me as it is, and there was no need to change it.

Simulation


Here we have the game world, and what can we do about it?

But it turns out that nothing, until we can simulate the game world, does not even determine whether the ball flies into an empty net. So we write off the simulation. The simulation code is not part of the language pack, it is provided in a language I don't know. But his C syntax is similar, so copy-paste + definition of necessary functions, and sim is 90% ready. Where it was necessary to rule with my hands, I tried to do it carefully, since then mistakes can be costly here and it will be difficult to catch them. I managed to make a couple of mistakes, but everything was done with a little blood.

It immediately became clear that if you use honest simulation (100 micro ticks), then this is not enough for anything, it is much more profitable to calculate 100 variants with one microtic than one variant with 100 microtics. I still left 2 mikrotik so that the discrepancy was not too large.

Strategy Basics


And so we have the game world and we can simulate its change. And what's next?
There are different approaches. When there are few possible moves and the depth is not very large, you can take brute force: to sort through all the moves, even the opponent's response moves, to take your moves again ... i.e. minimax When there are many moves you can artificially limit them, for example, you can take directions multiple to 15 degrees, jump on every 10th tick, and use the same minimax. But this approach seems to me to be suitable when the result is not too sensitive to small changes in the moves, but here a deviation of one degree of the direction of the robot will lead to large deviations after the collision.

At the other extreme, when we make moves without iteration, some heuristics. This approach can be viable, but to create a strong player purely heuristics is very difficult.

But the combination of two methods looks promising: you can first move in a random direction, and then finish the game with a heuristic that can run up to the ball and jump at the right moment. The same heuristics can also be used in combination to predict the opponent's moves. Earlier, I used something similar in the competition and this method has more or less proved itself badly.

And so we write heuristics (in RAIK-ovsky jargon, smart guy or just smart)!

Since as soon as possible I wanted to see the result, the smartgun was written in a hurry and turned out to be quite stupid (even the code is embarrassing). I simply calculated the time after which the robot could catch the ball based on the current speed of the ball and the maximum speed of the robot, and ran to the point where the ball would be at that time (collisions were not taken into account). He did not know how to jump and did not even take into account the height of the ball, could easily run under the ball, and if the ball moved away too quickly, then he generally ran in the opposite direction. Since the smart did not know how to jump, at first I forced him to jump at a fixed distance from the ball, and a little later the choice of the moment of the jump lay on the shoulders of the great random house! The lack of a rational choice of a jump turned out to be a big disadvantage, but more on that later. But in general, pure smart did not look bad, and sometimes even scored goals,

Next, we need an evaluation function (PF).

The initial PF looked something like this: (1000 - tick) * goal + ball.z + some function of the relative position of the robot and the ball so that it runs up to the ball, even if it cannot reach it. Here, the goal can be -1.0.1, depending on whether there is a goal and to whom, and tick is a tick from the start of the simulation at which the goal occurred. The latter is necessary so that the robot does not constantly postpone the goal.

Now we force the robot to run in a random direction with a random number of ticks, then we transfer control to the smart that leads it to the ball, at the random moment it jumps and hits the ball over the top. Moreover, it is better not to run to the center of the ball, but with a random shift of not a great distance so that the robot can beat at an angle.

My simulation lasted for a fixed time - 2 seconds, and at the end the OF was called. This scenario was repeated several times for each robot and the best one was selected based on the assessment.

So far, this strategy has a serious drawback: it does not have memory - on every tick everything is calculated from scratch, this means the strategy can see a goal on one tick, and not find it on the next one. This is not the case, you need to fix it - save the best found option for each robot and reuse it in the next tick. Also now the robots are in the know of each other, for example, if the first one is going to hit the ball, then the second one does not run for the ball, but tries to accept the pass.

Goalkeeper


We need a goalkeeper. My goalkeeper was basically different from the attacker in that he activated when the ball approached some distance to the goal, otherwise it simply returned to its base point.

Total


Now we have not a bad basic strategy, which knows all the most necessary and on which it can be built in the future.

Perhaps everything described above seems simple and logical, but to be honest at the beginning of the competition there was not such a clear picture of the strategy that appeared towards the end, and it took two weeks to implement this, and this is already 1/3 of the competition.

Something about testing


Over time, the Grails, which double the strength of the game, will occur less and less, and we will have to choose changes that give + 10-20% increase in goals. And it turns out that such small increments are not so easy to identify. For a thorough result, you need hundreds of goals scored, and at a frequency of goals once a minute, this means many, many hours of playing time. For this reason, I almost did not test the strategies on the server, but drove long local tests against previous versions. But even locally testing any change by half a day would not be very convenient. According to this, I applied a little “trick” - I tested the trimmed strategies. If, for example, on the server, I went through 50 options for the robot, then locally only 10, it allowed us to drive tests for a tolerable time.

Improvements


Next, I will describe the main improvements, and their assessment in the following form: by how many percent does the new version score more goals than it receives from the old one. Those. if for example a new one beats the old one with a score of 120: 100, this is + 20%, if it scores 2 times more, then it is + 100%.

- Your goalkeeper must beat goals. If it doesn’t work, we’ll give it more time, increase the number of options up to x10. + 15%

- Sometimes, when the goalkeeper bounces off the ball, he goes to free flight and while he has time to return to the place, he already scores a goal. Immediately after hitting the ball, we are trying to get him back into place and add the tick to which he returned to the assessment with a small negative coefficient. + 20%

- An additional kick on the ball in front of the goal of others increases the chance of a goal, we will give for this a bonus in OF. + 60%

- Experiment with nitro! There were still a few days left before the first round, but I decided to test the nitro in advance. Intuitively, it seemed to me that nitro would greatly affect the gameplay, since on one balloon you can jump to the ceiling or fly across the entire field. For a start, I taught the use of a nitro attacker only, and then only in the air, but I haven’t collected any packages yet. Nitro was used during the flight in the now random 3D direction. The result was, to put it mildly, not very good, more than + 20% I could not squeeze, and the use of nitro on the ground did not bring results at all. So the issue with nitro was put aside for a while, although from that moment I tried to carry out tests with nitro on.

- Too much randomness! Random is nice, sometimes it gives out tricks that you don’t think of yourself, but on the other hand, when there are too many of them, the likelihood that everything will match is very small. And I had it too much. I decided to try to transfer the moment of the jump to an analytical basis. Since there was no horizontal acceleration in the air (let's forget about nitro), it was easy to calculate the time of the meeting between the robot and the ball (t1), and the height of the ball at that moment (h1). Now we calculate the time (t2) after which the robot will be at the height of h1, jump now. Here we get a quadratic equation, if it does not have a solution or t2 <t1, then jump early, otherwise we jump.

The result was a bit shocking to me, screwing the right jump to both the attacker and the goalkeeper, tests showed + 200%, i.e. The new version beat the old 3 times in goals, the real Grail! It was January 17th, after uploading the strategy to the server, it won a 200+ rating with a winning streak from 20 games and for some time headed the sandbox.

- Teach the goalkeeper to play. Until my goalkeeper activated he stood as a pillar. Easy walk to the side: x = ball.x / 4 gave a small increase.

- The rival must not be predicted, but assumed! Looking through the games, I noticed that I often get a goal after the goalkeeper bounces the ball right on the opponent and he scores a goal for me on the fly. To beat the ball around the opponent, you must first determine where he can be on the nth tick. Of course, we will not take Nitro into account. I could still determine the speed of the robot analytically, it seems to be the intersection of two circles. But with the available territory failed. Well, to hell with it, we are programmers (not by education), let the machine count for us. We divide the plane into n directions, move the robot in each of the directions, the end points will be the vertices of the polygon, which determines the reach.

How to use it? I added a tick, on which the opponent will be able to touch the ball for the first time, with a good odds on the OB. Since now the opponent was considered not a specific action scenario, but a certain reach cloud, I deleted the enemy robots immediately as they came into contact with the ground.

Result + 40%. In addition, this approach has two fat advantages: removing enemy robots firstly speeds up the simulation, and this in turn makes it possible to sort out more options, and secondly we don’t have to bother with the opponent’s management. Conclusion: profit!

“Stupid mistakes, but without them.” In the official simulator there are two lines:

if abs(ball.position.z) > arena.depth / 2 + ball.radius:
	goal_scored()

I do not know how anyone, but I left the function abs () as it is, but in vain. B-shnaya abs () (not to be confused with std :: abs ()) takes integer values, which means decimals are truncated. In practice, this means that I fixed the goal only when the ball was already a full meter behind the goal line. Replace abs () with fabs ()! Not the first time this abs () fails me.

- Nitro again. The second round was approaching and there was no place to put off the normal use of nitro. Podoptimiziroval his use, he took into account in the definition of the jump, allowed to use the goalkeeper, and also began to deliberately collect packages goalkeeper.

- Let's go back a little to the simulation. I have already said that 100 variants with one microtic are more profitable than one variant with 100 microtics. This is true, but this does not mean that everything is good with one mikrotik, without any additional measures, the discrepancies were quite serious, and this made it difficult to play football at a professional level. To improve the accuracy of the simulation, I applied the following methods:

  • During the jump, two additional microtices.
  • When we combine a lot of microtics, it is more correct to use the average speed rather than the final one in moving.
  • With a binary search we find a tick in which a collision occurs, and we perform two subticles: one before the collision, the other after. (I did not take into account the collision of the ball with a flat surface, there seemed to me a slight discrepancy.)
  • Running on a curved surface still gave great differences and sometimes the goalkeeper missed because of this. According to this, when my goalkeeper was on a curved surface, I used 10 microtics.

On average, it turned out about 1.4 mikrotik for one tick, and the accuracy is close to ideal. Of course, I have fastened these optimizations not at once, but gradually. I do not know how much they influenced the strength of the game, but I think that is very significant.

What we have


Due to the large number of significant improvements, the strategy grew steadily in the ranking, and before the second round, it almost reached the historical milestone - 5000 Elo rating, firmly entrenched in the first place.


Sometimes you can not even believe what combinations gives out random. Thanks to the kind anonymous from the community for the find.

the last week


In connection with such a good margin, I allowed myself not to chase after minor improvements, but to look for something more global, especially with regard to 3x3 games. However, experiments aimed at a more team game, or instill a special role for the third player, or teach the goalkeeper to go out during the game, ended in failure. A whole week was spent almost to no avail. Despite this, a couple of days before the final, I was still leading in the rankings.
And here came the last day before the final and I start losing to one, then to another and even to the third competitor. If you do not add, you can stay without a prize. All week nothing happened, what can I do on the last day? The mood was - at least give up.

Resurrection


After seeing a couple of lost games, I noticed that everyone is jumping like goatson nitro, pass the ball in the air, but mine can not get it. I tried the simplest thing that could lead to this behavior - I added the height of the ball at the moment of impact with a solid coefficient in the OB. Result + 50%, wow! Inspired by the result, I intensively began to twist the parameters (which I neglected during the whole competition), added new variants of the search, at the end I added time control, which allowed maximum use of time without risking my life. By the end of the day, it turned out + 150-200%, i.e. The new version almost three times in goals scored the previous one! Yes, yes, the very one that had hung in the first place for almost a week before the final and reached the 5000+ rating unprecedented in the history of the championship. The strategy was successfully tested on the server a couple of hours before the final. After that, I prepared it for the final launch: turned off assertions,

Finale, part one


The first part of the final was held at night. I followed the results until I fell asleep, I got up early in the morning. 3 waves were played (in the same wavelength everyone plays with each), I lost only one game against TonyK , and since Anton still sometimes lost to other players, I was in the lead with a margin of 7 points (for a victory they give 2 points). Quite a serious gap for 3 waves, but not enough to relax.

On the last day, of course, I didn’t intend to do anything serious, basically I twisted the parameters. Several changes were made, but since everything was tested in a hurry, I was not even completely sure that the effect was positive. In general, the increase was 0-20%. But Anton noticeably added and at least began to play no worse than me.

There is nothing to do, I had to send what I had and hope for luck and a supply of points.

Finale, part two


Fortunately, the games were sorted in such a way that at first the leaders played each other, this didn’t take long to worry, the first game was against TonyK. Good luck was on my side, I won the first fight, as well as all the other games in this wave, while TonyK lost another point. The gap of 10 points in two waves to the end is almost impossible to play, now you can relax.

The end result: 352 wins and 2 losses (both from Anton), 1st place with a margin of 12 points.

Thanks and other garbage


In general, I really liked the task of this year, it was “under me” (the simulation is mine, and the three-dimensionality does not frighten me), and the matches were spectacular, I think it was interesting to watch not only the participants.

I would like to thank the organizers for a wonderful competition.

I also want to thank all the participants, especially Anton Kozlovsky ( TonyK ) for competing in the final, Ivan Tyamgin ( tyamgin ) for competing in the sandbox and Alexey Dichkovsky ( Commandos ) for refraining from participation and thereby increasing my chances of winning.

Good luck to everyone in the next RAIK!

Also popular now: