AI Challenge: Ants, or bringing the ants to life

    In this article I will describe how to write a reasonably good bot for the Google AI Challenge. Notably, no sophisticated AI techniques are required, and the basic implementation fits in about a thousand lines of C++. The methods themselves can be seen as a generic framework on top of which you can build a bot that accounts for strategic features and will probably play even better. In any case, it is a good "quick start" for those who have not managed to get going so far.

    1. Some experience from participating in the AI Challenge

    Judging by the comments on other AI Challenge topics, many people have questions about the "local" rating system. Here are a few observations.

    1. A bot's rating is evaluated by its success in games against other bots, i.e. the Skill number is a relative value, and after each upload of a new version to the server it is reset to zero (since it is not known whether the new version will be better than the previous one).
    2. Each finished game adds a certain amount to Skill depending on the place taken (remember that the goal of the game is not to collect "food" and grow a huge "colony", but to capture enemy anthills) and on the skill of the other players; the higher the game number, the smaller this increment, so accidentally losing to a weak player early on will slow the growth of Skill for a long time.
    3. Approximate stabilization of the rating in the current situation takes about 5-7 days (matches are held rather rarely).

    It follows that it is too late to try to "iteratively" fit the algorithm to other players, so it is highly advisable to test the bot locally, and not to waste time merely polishing the Starter Package.

    The execution time limit per move is 500 ms, and it may vary with the turntime parameter. However, it is highly desirable to finish at least 50 ms faster than this threshold, otherwise a timeout is very likely due to I/O buffering (all game data is transmitted via pipes), and sometimes quite amusing things happen.

    The memory limit is 2 GB (a 32-bit platform limitation); about 1.5 GB of memory is guaranteed (read "guaranteed" rather as "promised") not to be pushed out into the swap file. That is quite a lot for maps of 200 by 200 cells, which directly encourages caching intermediate results, and that is exactly what we will do to get a good algorithm for finding paths to objects.

    2. The algorithm for finding the path

    It is hard to discuss how good one strategy or another would be as long as the ants cannot "walk" through the maze. In my experience, a high-quality path-finding implementation always beats a good strategy if you have to choose only one of the two ("exclusive OR"); better, of course, to have both.

    Suppose we already know where to go (more on that later) and which ant should do it. It remains to calculate the initial direction of movement (N, E, S, W) and issue the order. For that we need to compute the shortest path to the target object (in theory we could, of course, cover the entire map with random moves, but in practice the enemy would have destroyed us long before that).

    If the map contained no impassable "water" sections (the dark blue cells), the task would be trivial; consider the red line connecting the "ant" and its target. Since an ant can move only one cell per turn in one of four directions, the distance to the object (ignoring obstacles) coincides with the Manhattan distance. The path to the object will always occupy the same number of cells as long as we step with increments in the right direction. Subtract the ant's coordinates (x0, y0) from the target's coordinates (x1, y1): we get |x1 - x0| steps with increment sign(x1 - x0) in X and |y1 - y0| steps with increment sign(y1 - y0) in Y, leading to the goal.
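    As a minimal sketch of this increment computation (the function names are mine, not from the starter kit):

```cpp
#include <cstdlib>

// Number of steps of an obstacle-free 4-directional walk from (x0, y0)
// to (x1, y1): the Manhattan distance.
int manhattanDistance(int x0, int y0, int x1, int y1) {
    return std::abs(x1 - x0) + std::abs(y1 - y0);
}

// Per-axis increment (-1, 0 or +1) that leads toward the target.
int sign(int d) { return (d > 0) - (d < 0); }
```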

    Here is a clear illustration of the fact that the path length is the same in any case (the number of path cells is equal for all three lines). So let us file this heuristic under "useful" and, first of all, not be too proud to check whether the target object can be reached simply by moving with the correct increments (call such a function getSimplePath(), for example). Why? Because it sometimes saves time, and quite substantially. Of course, it will not cope with situations where water cells lie on the path, let alone cases where, due to obstacles, the initial direction of movement does not match the final one (the case of the greenish line in the first picture).

    What to do about obstacles? Here the wave algorithm helps: take a matrix the size of the playing field, fix the target cell (writing 0 there) and recursively fill the matrix with values as in the image on the right, visiting only those cells in which no "water" is present (or has not yet been discovered). The meaning of these values is the number of moves needed to reach the target cell. As soon as the fill reaches the cell our ant occupies, we can recover the path to the goal by moving along cells with decreasing step counts. The algorithm runs in O(cols * rows) time.
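    The wave fill can be sketched as a breadth-first search (the queue replaces the recursion mentioned above but produces the same step counts; the names and sentinels are illustrative, and the toroidal wrap-around of the real game map is omitted for brevity):

```cpp
#include <queue>
#include <utility>
#include <vector>

const int UNSET = -1; // cell not yet reached (or water)

// Fill a step matrix for target cell (tx, ty): each reachable cell
// receives the number of moves needed to get to the target.
std::vector<std::vector<int>> waveFill(const std::vector<std::vector<bool>>& water,
                                       int tx, int ty) {
    int rows = (int)water.size(), cols = (int)water[0].size();
    std::vector<std::vector<int>> steps(rows, std::vector<int>(cols, UNSET));
    std::queue<std::pair<int, int>> q;
    steps[tx][ty] = 0;
    q.push({tx, ty});
    const int dx[] = {1, -1, 0, 0}, dy[] = {0, 0, 1, -1};
    while (!q.empty()) {
        auto [x, y] = q.front(); q.pop();
        for (int d = 0; d < 4; ++d) {
            int nx = x + dx[d], ny = y + dy[d];
            if (nx < 0 || ny < 0 || nx >= rows || ny >= cols) continue;
            if (water[nx][ny] || steps[nx][ny] != UNSET) continue;
            steps[nx][ny] = steps[x][y] + 1;
            q.push({nx, ny});
        }
    }
    return steps;
}
```

    The path itself is recovered by stepping from the ant's cell to any neighbor whose step count is one less, until 0 is reached.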

    Is that a lot or a little? In our case |cols|, |rows| <= 200, directions are needed for up to 1000 ants (real games never have more), and each ant can have up to a hundred potential goals. Altogether 200^2 * 1000 * 100 = 4 billion iterations, most of them writes to memory. Common sense tells us this does not fit into 500 ms.

    Why choose such an algorithm at all, then? To begin with, note that in theory, with no intermediate information available, you cannot get away with fewer operations: it is easy to construct an example that forces visiting O(cols * rows) cells (a "spiral" over the whole field). So our task is to build a method that works well once intermediate information is taken into account (the state of the field changes only slightly from one move to the next). Besides, there is no point worrying that it is slow on the first moves, since at that point we hardly have any ants anyway.

    What will we remember? The entire step matrix of the wave algorithm for each target cell:
    char* WaveMaps[MAX_X][MAX_Y]; // pointers to the wave-algorithm maps
    WaveMaps[x][y] = new char[MAX_X*MAX_Y]; // initialization of one specific map

    Let me remind you that WaveMaps[x][y] is computed for the target object with coordinates (x, y); in effect, it holds all possible routes to that object. Clearly, water cells will never be targets (there is no point in them, and you cannot walk there anyway), and most cells not occupied by anything (food, ants, anthills) are of no interest to us either; those maps will simply not be initialized.

    Thus, within one move we can compute the paths from every ant to a given point without extra overhead and determine which ant will get there faster (the first request computes the map; afterwards we just look in the cache).

    What happens on the next move? During the game the ants wander around the field, "discovering" new cells: initially we know only a small part of the maze, but as we move we find food, water, enemy ants and anthills. What interests us here is what to do with the maps when a new "water" cell is discovered.

    First, how do we detect "water"? It is enough to keep a local copy of the field state and, on each new move, check for differences between the received game state and the saved one. Suppose we found water at position (x1, y1). Then we must recompute every map in which the value at position (x1, y1) has been filled with a step number (rather than the default value).
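    A sketch of that invalidation step (the cache layout here is illustrative; the article stores raw pointers, while a std::map keyed by target cell is used below for brevity):

```cpp
#include <map>
#include <utility>
#include <vector>

const int UNSET_STEP = -1; // default value: this cell was never filled
using WaveMap = std::vector<std::vector<int>>;

// Drop every cached wave map whose value at the newly discovered water
// cell (x1, y1) was actually computed, i.e. differs from the default.
void invalidateOnWater(std::map<std::pair<int, int>, WaveMap>& cache,
                       int x1, int y1) {
    for (auto it = cache.begin(); it != cache.end(); ) {
        if (it->second[x1][y1] != UNSET_STEP)
            it = cache.erase(it); // will be recomputed on the next request
        else
            ++it;
    }
}
```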

    Given that all the maps are available in the tools package, we can estimate what percentage of the wave-algorithm matrices we should expect to reinitialize (recompute). The result: 100%. Not great at all, and all because we built maps covering the entire playing field. There is no need for that, because all routes can be divided into "long" and "short": short ones require the route cells to be determined precisely, while long ones are composed of blocks of short ones.

    Thus, a reasonable idea would be to keep a hierarchical map of the reachability of objects. In the case of my implementation, only two levels were enough (100ms for planning all routes for all ants in the worst reasonable case).

    The algorithm above is then modified as follows:
    1. When filling each wave-algorithm map, we check whether the distance from the initial point to the current one exceeds MAP_RANGE (if it does, we stop);
    2. We build a full wave map for a low-resolution field (MAX_X / SCALE, MAX_Y / SCALE). Such a map helps find the path between points farther apart than MAP_RANGE.


    When a path is requested for sufficiently distant points (x1, y1) and (x0, y0), we first look at the low-resolution map, obtain an intermediate point (x2, y2) and combine the paths (x1, y1) -> (x2, y2) and (x2, y2) -> (x0, y0); recursion, yes.

    The MAP_RANGE parameter determines, on the one hand, how often the wave-algorithm maps are fully recomputed and, on the other, the final accuracy of very long paths. With MAP_RANGE > 2 * viewRadius there are practically no errors, and the time (routes for all ants) hardly exceeds 100 ms.

    Let me remind you that we needed this trick with MAP_RANGE and the low-resolution map in order to make caching of results pay off, updating the cache as rarely as possible (once everything within MAP_RANGE of a cell has been explored, updates of its map stop entirely).

    As for memory: in the worst case the usage will be 200^2 * (200/2)^2 bytes (full maps) + (200 / SCALE)^4 bytes (low resolution), which stays below the 500 MB threshold for any reasonable SCALE (greater than 5).

    3. Objectives

    A good way to win is to make no meaningless moves and to miss no meaningful opportunities. Let us analyze what exactly this means.

    The goals can be:
    1. Enemy anthills
    2. Food (only if you have your own anthills)
    3. Own anthills (to protect)
    4. Own ants (to protect or improve the attack) - support
    5. Enemy ants
    6. Unexplored cells

    There are no other objects on the field, and accordingly there are no other goals.

    I have listed the goals in order of priority (let us call this list programList), from the most important to the least important (but still necessary). Capturing anthills is the point of the game, so, other things being equal, it is better to try to capture an anthill (with some reservations that will follow) than to chase food. I would not want to lose my anthills, nor my ants. If it is possible to simply "kill" enemy ants without losses on our side, it is better to do so. And if there is absolutely nothing else to do, we can wander around the field.

    To find the target objects, we walk over the entire received field, checking whether each cell satisfies the criteria of a goal:

    foreach int program in ProgramList
     foreach Location loc in map
      if (MatchingProgram(loc, program))
       ProgramList[program].push_back(loc);

    where MatchingProgram determines whether the given cell fits the program:
    1. if program == enemy anthills, the cell must contain an enemy anthill;
    2. food: the cell contains food;
    3. own anthills: the cell contains our anthill and there is an indirect threat to it (an enemy ant is closer than one of ours);
    4. own ants: the cell contains our ant and there is an indirect threat to it (enemy ants within viewRadius);
    5. enemy ants: the cell contains an enemy ant, and its enemies number no fewer than its "allies";
    6. unexplored cells: cells we have not visited (more on this below).
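    These criteria can be sketched as a single predicate over an illustrative cell state (the real starter-kit Square looks different, and the threat flags would be computed elsewhere):

```cpp
// Illustrative cell state; the threat flags are assumed precomputed.
struct SquareInfo {
    bool enemyHill = false, food = false, myHill = false;
    bool myAnt = false, enemyAnt = false;
    bool threatened = false;       // enemy ant closer than ours / within viewRadius
    bool enemiesOutnumber = false; // enemies of this ant no fewer than its allies
    bool explored = true;
};

// Programs listed in priority order, as in the article.
enum Program { ENEMY_HILLS, FOOD, MY_HILLS, MY_ANTS, ENEMY_ANTS, UNEXPLORED };

bool matchingProgram(const SquareInfo& s, int program) {
    switch (program) {
        case ENEMY_HILLS: return s.enemyHill;
        case FOOD:        return s.food;
        case MY_HILLS:    return s.myHill && s.threatened;
        case MY_ANTS:     return s.myAnt && s.threatened;
        case ENEMY_ANTS:  return s.enemyAnt && s.enemiesOutnumber;
        case UNEXPLORED:  return !s.explored;
    }
    return false;
}
```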

    Next we must assign to each goal an ant that will carry it out. For each goal we scan the list of ants and select the closest one whose previous goal (set last turn) is worse than the current one (i.e. comes later in the list), or is no worse but lies farther away (with a margin of one step). On the one hand this eliminates the "jerking" of an ant back and forth between identical goals (a common mistake among participants); on the other, it lets us use updated information (while going for food we stumbled upon an enemy anthill).
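    That selection rule might be sketched like this (Ant and the distance callback are illustrative stand-ins; lower program numbers mean higher priority, as in the list above):

```cpp
#include <climits>
#include <functional>
#include <vector>

struct Ant { int prevProgram; int prevDist; };

// Pick the nearest ant that is better employed by `program` than by its
// previous one: either the old goal is lower priority, or it is the same
// priority but farther away (with a one-step margin against jitter).
int pickAnt(const std::vector<Ant>& ants, int program,
            const std::function<int(const Ant&)>& dist) {
    int best = -1, bestDist = INT_MAX;
    for (int i = 0; i < (int)ants.size(); ++i) {
        int d = dist(ants[i]);
        bool better = ants[i].prevProgram > program
                   || (ants[i].prevProgram == program && ants[i].prevDist > d + 1);
        if (better && d < bestDist) { bestDist = d; best = i; }
    }
    return best; // index of the chosen ant, or -1 if none qualifies
}
```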

    4. Blockages

    As you know, two or more ants that step onto the same cell die. We want to prevent senseless losses of personnel.

    The trivial approach is to check whether the destination cell is occupied by another ant; this is how most of the simplest bots work. But it does not rule out gridlock, where ants prevent each other from reaching their goals, pointlessly crowd around the base (as often happens), and situations arise where some ants have no legal move at all.

    Therefore we should distinguish the ants that have already moved this turn from those that have not yet moved. When an ant tries to step onto an already occupied cell, there are two options:
    1. Choose a different direction for the current ant;
    2. Move the one in the way (the blocker) to another cell.

    programAnt(MyAnt *ant):
     if (ant->alreadyMoved) return;
     repeat:
     direction = getDirectionByProgramList(program)
     if (moveResult(ant, direction) == mrBlockerNotMoved)
     {
       programAnt(blocker);
       program++;
       goto repeat;
     }

    It is important that the "interfering" ants do not move arbitrarily, but in accordance with the best program available to them. Thus one call to the ant-"programming" function may end up shifting any number of neighboring ants; you only need to avoid mutual recursion with the same parameters.

    Another cause of useless ant deaths is wandering into an area with an excess of opponents. To detect such situations you will have to roll up your sleeves and write a battleResolution.

    This is not a problem in itself; however, if you use the C++ starter package, you will have to rely on state.grid rather than enemyAnts, because ants in that list are not distinguishable by owner (unless this has been fixed already, of course). Anyway...

    5. Why the starter package is not so good

    1. The ant objects received in myAnts do not store their parameters. I have already mentioned that it is worth keeping each ant's last program and path, but you cannot do this with State.h: the myAnts vector is recreated every turn. The correct way to store ant parameters is therefore to create a map of the form MyAnt* antMap[MAX_X][MAX_Y], which you update manually (this is not hard to do in the makeMove function for a specific ant).

    2. The updateVisibility function is unacceptably slow. For a thousand ants it can take even longer than laying the paths, because for every ant a great many cells are checked against the inequality that defines the distance. It is much faster (which is what I did) to store, for each cell, a list of cells it directly sees (viewRadius) and attacks (attackRadius), initialized when the cell is first examined. Then updateVisibility can mark the cells visible to each ant using the prepared list rather than by brute force.
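    A sketch of that precomputation; the game states radii as squared values (viewradius2), so the helper takes the squared radius (the names are mine):

```cpp
#include <utility>
#include <vector>

// Compute once the set of (dx, dy) offsets within the squared view radius;
// updateVisibility can then add these offsets to each ant's position
// instead of scanning the whole grid per ant.
std::vector<std::pair<int, int>> visibleOffsets(int viewRadius2) {
    std::vector<std::pair<int, int>> offs;
    int r = 0;
    while (r * r <= viewRadius2) ++r; // half-width of the bounding box
    for (int dx = -r; dx <= r; ++dx)
        for (int dy = -r; dy <= r; ++dy)
            if (dx * dx + dy * dy <= viewRadius2)
                offs.push_back({dx, dy});
    return offs;
}
```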

    3. std::vector<std::vector<Square> > grid. Here we lose speed and in some cases even increase memory usage compared with a plain Square grid[MAX_X][MAX_Y]. The speed loss is no joke, since such an object caches very poorly, being scattered across different memory addresses.

    6. Research and control of territories

    Now that we know how to reach a specific object, how to implicitly prevent the loss of ants and implicitly organize attacks on enemy ants (besides the explicit attempts to collect food and capture anthills), it remains to learn how to master the territory.

    Research means discovering static objects (anthills, water) in places where we have not been.

    Control - checking whether something useful (food) or harmful (enemy ants) has appeared.

    Both come down to the same thing: cells where we have not been for a long time. And this notion is quite easy to formalize.

    After we have assigned goals (programs one through five) to part of the ants, a certain number of ants remain that do not yet know where to go, and they need to be sent for a walk. They should head for cells that nobody currently sees (visibility == 0). Among those "invisible" cells, priority goes to the ones that have never been visible (lastSeen == NEVER): research. Of the rest, to those last visible on the earliest moves: control. Among cells equal in lastSeen, the nearest should be chosen.
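    The choice might be sketched as follows (Cell, NEVER and the distance callback are illustrative; lastSeen is taken here as the turn the cell was last visible, so "seen longest ago" means the smallest value):

```cpp
#include <functional>
#include <vector>

const int NEVER = -1;
struct Cell { int x, y, visibility, lastSeen; };

// Among cells nobody currently sees, prefer never-seen ones (research),
// then the ones seen longest ago (control), then the nearest.
int pickExploreTarget(const std::vector<Cell>& cells,
                      const std::function<int(const Cell&)>& dist) {
    int best = -1;
    for (int i = 0; i < (int)cells.size(); ++i) {
        if (cells[i].visibility != 0) continue; // someone already sees it
        if (best == -1) { best = i; continue; }
        const Cell &a = cells[i], &b = cells[best];
        bool aNever = a.lastSeen == NEVER, bNever = b.lastSeen == NEVER;
        if (aNever != bNever) { if (aNever) best = i; continue; }
        if (a.lastSeen != b.lastSeen) { if (a.lastSeen < b.lastSeen) best = i; continue; }
        if (dist(a) < dist(b)) best = i;
    }
    return best; // index of the chosen cell, or -1
}
```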

    Such an algorithm quickly covers the entire field and, at the same time, does not let territory where food may appear stay out of sight for long.

    7. Food for thought

    Our ants are now quite viable and know what to do; however, the algorithm above does not guarantee victory.

    Firstly, because in this game an always-winning algorithm cannot exist. A simple example of a possible situation: on the first moves all opponents attack you from different sides, arranged so as not to attack each other. No arrangement of even a large number of ants around the base will protect it.

    Secondly, the order of goals that I defined is sensible, but it does not take into account the strategic peculiarities of the opponents. In some cases it is enough to rush to the enemy anthill and capture it (since it is unguarded); at other times that is a sure way to commit suicide. The priority of the goals should depend on the opponent's level: against a weak adversary it is even useful to "trade" ants, while against strong ones it is sometimes better to "run" and wait for the moment when the other players have almost completely "exchanged" their forces. On the first moves we know nothing about the opponents at all, and surviving to the middle of the game against equals is a matter of luck (or the absence of "bad luck"). Thus the priority of goals itself implicitly contains the notion of strategy (for example, I said nothing about a group attack, yet it emerges thanks to goal 4, support).

    I hope the above will help someone solve a few problems, or at least show that the subject of the competition is, on the whole, not that difficult. Good luck!
