Assignment task


    The problem of the best distribution of a certain number of works between the same number of performers. When solving it, they seek the optimal destination from the condition of maximum total productivity, which is equal to the sum of the performance of the performers. The most effective method for solving it is the Hungarian method. The assignment problem has many interpretations: the distribution of work between the mechanisms, the distribution of targets between fire weapons to maximize the mathematical expectation of the number of targets hit or average damage, etc.




    1. The task (formulation).


    We will be easier (in terms). =)

    Imagine the problem: you need to effectively implement some relatively large project. For the project, a team of first-class developers, versed in various fields, was selected. And so it was lucky that there is information (accumulated statistics) about the degree (effectiveness) of technology ownership for everyone. But, unfortunately, there are not so many developers in the team, or rather, as many as the technologies will be used in the project.

    Necessary: ​​in the most effective way to use the greatest number of technologies, assigning each task its own developer. those. optimally distribute among them the areas of responsibility for technology, taking into account their personal abilities.

    2. Approach to the solution


    In the end, all business operations can be reduced to three words: people, product, and profits © . If we try to depict the task set for us on a piece of paper, we will surely get a similar illustration as in the picture (with such pictures ).

    The result is a count . Tops on the left are developers, tops on the right are technologies (tasks). The ribs that connect them - mean how much the developer understands it. These numbers, i.e. the degree of developer ownership of this technology is very important, but we will turn to them a little later. In the meantime, we have already correctly outlined the direction in which this problem is effectively solved:




    3. Counts


    The very basics of graphs were presented in the article ( max cost ), so I’ll immediately turn to the terminology related to this problem:

    A bipartite graph is a graph in which there is a partition of the set of vertices into two parts (parts) such that the ends of each edge belong to different parts. Our task also has a clear division: some peaks are developers, others are tasks, and there are only communications (ownership) between developers and tasks.
    A matching of an undirected graph G is a subset M of its edges E, chosen so that no two edges from M are adjacent, i.e. don't have a common top. In terms of our task, “assignment” will be a synonym for this .so that each involved developer takes on a separate task. And it didn’t happen that two developers are working on one problem, or one “poor thing” was responsible for two tasks.

    In graph theory, our problem, oddly enough, is called the Assignment Task (AP) . =) It is a special case of the problem of finding the maximum matching. In fact, we are striving to maximize the use of resources so that the maximum number of technologies can be worked out at the same time, therefore, in terms of graphs, we are trying to find the “maximum matching”, to make the maximum number of developer-task pairs.

    4. Maximum matching


    To simplify our lives, we do not pay attention to the abilities of people. We just want everyone to find a job. The first few developers who came to hand to offer to work with the technology familiar to them would not be a problem. Continuing in the same vein, we will distribute several more tasks, but the matching constructed in this way is unlikely to be maximum. A situation like the one shown in the figure is possible:



    How to increase matchings (appointments)?


    We select an unused developer who has not yet been assigned a task. Let's see what he could handle, i.e. familiar technology. If you found among them free - excellent, this is what we were looking for. Assign. And if the task is already "occupied" by another developer? Let's try a busy developer to find another free technology, because in this case we would assign this to our unused ward. In general, from the top of an unused developer or developer to whom we are trying to reassign a task, we look at all the technologies he is familiar with for free:
    • if you find a free one - the search is completed
    • if the task has already been assigned to someone, then after passing along this edge of matching, we will try to “reassign” the technology to the developer participating in this assignment

    During this round of the graph, we try to get into the “free task” from the “unused developer”. Thus, the search “unfolds” in the following tree:



    Add a little more terminology from graph theory, in simple words: An

    exposed vertex is a vertex that is not involved in the current matching. Those. either an "unused developer" or a "free task."
    An alternating chain is a chain whose ribs alternately lie or do not lie in the matching. (... - possession of technology - assigned task - possession of technology - assigned task - ...)
    Alternating tree - a tree consisting of alternating chains An
    augmented chain is such an alternating chain, the initial and final vertices of which are exposed. This is what we are looking for! =)
    Augment tree- accordingly a tree in which at least one of the branches is an augmented chain.

    So they found a way to increase matching, trying to get the maximum. We need to build an altening tree. When it becomes augmented, look for augmented chains from the "unused developer" to the "free task" and then "reassign" tasks along them. This is beneficial because increases the number of "tasks in processing" by 1:



    Now we can already use the largest number of technologies in the project. It's time to take into account another important condition for the problem posed to us: the effectiveness of technology ownership. We want to optimally assign tasks to everyone.

    5. The Hungarian method.


    Find a solution with maximum total efficiency (cost). It sounds, in a way, similar to the task of optimal backpack packing. Makes me think. Now, if we had the opportunity to act on the principle of "greedy algorithms."

    For starters, we would assign tasks to all developers to the limit in accordance with their maximum abilities. If all the developers were able to immediately distribute the tasks - excellent. But this does not happen often. Suddenly two people, optimally own the same technology, who will get it and what to do in this situation? One of the developers will need to find another free task, also the most appropriate to his abilities. If under the current (maximum requirements) conditions there is no free task, then you will need to try to find among the tasks, having previously underestimated our requirements a little. How to artificially lower the ability of developers in the graph. If under such conditions we find a free task, we get an augmented tree. “Change” according to the matching chain, after which it will be +1.

    The strategy is clear.


    We have almost guessed the principle of the Hungarian algorithm. But how is it possible to construct a solution on the basis of the “greedy algorithms”: assign it to the maximum abilities to the stop, then slightly underestimate the abilities and add new tasks to the discussion, assign them all the way, understate ... etc.? How to evaluate the abilities and optimality of the current appointment?

    The whole “trick” of this algorithm is as follows. We are given only one parameter in the graph - the effectiveness of solving a specific problem by the developer, which is indicated on the edges. This value is assigned to the developer-task pairs. We will “divide” (separate from pairs) these quantities into two. Artificially add two additional parameters. Some values ​​will be assigned to the tops of the tasks, others to the tops of the developers.

    I will try to give the following interpretation:
    • among developers, we will indicate their “abilities” , let’s say in units of “forces”, simply indicating how effectively we can use them or engage them.
    • for tasks, we indicate their “knowledge” , or, so to speak, “oversupply”. We will also measure this parameter in “power”. An excess of attention to the problem arises in the following situation. If we “underloaded” some developer, i.e. he is able to solve the problem by 5, and he got only by 3. Then he has 2 more “forces” that he, in principle, can devote to some of the tasks he knows. Run between classrooms, advise on the phone, give advice to those involved in his favorite technology.




    Thus, we “divide” the values ​​indicated on the edges into 2 values ​​assigned to the peaks: the efficiency of solving the problem = the ability of the developer + knowledge of the problem. In principle, it is logical. The more capable the developer or the more known the technology, the better this part will be implemented in the project. More effective.

    In the end, after we find a solution, we will of course only take into account the values ​​on the edges, but now this “trick” will help us find a solution. =)

    6. Description of the algorithm


    We initialize the graph. Being " stubborn optimists ", for each developer we calculate his maximum "ability" according to technologies familiar to him, and assign him this number. Everyone enjoys doing the kind of work for which he is best suited © . Nothing is known about the tasks, so we initialize their “knowledge” with zeros.

    When searching for a “free task” for a “unused developer”, we now restrict ourselves to (we call them) the optimal edges of the graph, i.e. those for whom the equality holds: the effectiveness of solving the problem (rib) = the ability of the developer (top) + the knowledge of the problem (top).



    Next, we do the same as in the search for the maximum matching. We take turns of unused developers and, looking for them free tasks, we build an alternating tree (consisting of alternating chains), but only along the optimal edges. Further, 2 situations are possible:
    • I managed to find a free task. The tree has become augmented. “Reassign” tasks, increase matching. We begin to build an alternating tree anew, because you never know how the graph has changed
    • We did not find (did not reach) the free problem on optimal edges. And she is, because After all, we started with a free developer, and in the graph we have the same number of tasks and developers. The resulting alternating tree becomes the so-called Hungarian (the whole method is also called). In this case, we will need to slightly lower our requirements for developers and start the search again. Failure is simply the opportunity to begin again, this time more intelligently (c) .




    So we came to the last moment of the Hungarian method for which all these additional parameters and “abilities” were conceived. Suppose that by building up an alternating tree, we eventually got a Hungarian tree. Consider which vertices fall into this tree:
    • Unused developers, as it is with them that we begin to cost a tree
    • Developers and tasks that can be reached on the optimal edges of unused developers. Because it is through their “reassignment” that we are trying to employ the latter.

    Outside this tree, in the remaining column will be present:
    • Developers and tasks that are in matching, but inaccessible from free peaks (developers). They found work for them - there is nothing to touch them yet.
    • Tasks unattainable by optimal ribs - we will need to get to them. Therefore, when building a tree, we will mark the peaks that we managed to get into. And these tasks, accordingly, will remain unmarked.

    Further in the cycle, we will run along the “border” of the tree: along those edges that connect unused developers or developers reachable from them (maybe they can be “reassigned”) with related tasks (along non-optimal edges). Using these edges, we will calculate the minimum “inconsistency” of the developer’s abilities so that he can start this task: delta = min (developer’s ability (vertex) + task knowledge (vertex) - problem solving efficiency (edge)).



    Then inside the Hungarian tree we:
    • We will lower the ability of developers on delta to “attach” in the least painless way at least one edge to the alternating tree, along which the next time we will continue to search for a free task
    • We will increase the “knowledge” of the problems on delta so that inside the already constructed augmented graph the edges remain optimal. Those. to maintain equality: the effectiveness of solving the problem (rib) = the ability of the developer (top) + the knowledge of the problem (top)

    Mini-interpretation: we lower the ability of developers to subsequently “attach” at least one of them. We will attach him, but he will not work in accordance with his qualifications. He could have done more. Therefore, he frees up a certain amount of time to consult colleagues on a task in which he is most competent. She is becoming more studied as a team. She, in turn, was probably engaged in by another developer, who now will also be able to substitute if something happens. It is possible to lower his competence on the study of the problem. And so on, “along the chain” in the team, the “knowledge” of tasks increases and the ability of developers to find assignments decreases slightly.



    All. All steps of this method are considered. Continuing in the same vein ... Success is not final, failure is not fatal: it is the courage to continue that counts .



    7. Algorithm in words, very briefly


    Now let's collect everything to the heap:
    • Initialization. For developers - max abilities. Tasks - not studied.
    • So far, not all developers have found tasks.
      • So far, it has been possible to construct an augmented tree (to find free problems) using optimal edges
        • "Reassign" tasks, increasing matching
      • Did not reach the free task. Hungarian tree.
        • Reduce the ability of developers by min value


    8. Listing


    The code, of course, will be shorter than my entire description. =)

    I took it here . In my opinion, a very good implementation. The only difference is that the author has the code for the method of minimizing appointments (for example, if there is a salary on the edges), and in the article we distributed tasks in order to obtain maximum efficiency. Therefore, having slightly modified the code, I will give the implementation of the maximum method below:

    int n;
    vector < vector > a;      // Матрица эффективности a[разраб][задача]
    vector xy, yx;             // Паросочетания: xy[разраб], yx[задача]
    vector vx, vy;            // Альтернирующее дерево vx[разраб], vy[задача]
    vector maxrow, mincol;     // Способности, изученность

    bool dotry (int i) {
        if (vx[i]) return false;
        vx[i] = true;
        for (int j=0; j         if (a[i][j]-maxrow[i]-mincol[j] == 0)
                vy[j] = true;
        for (int j=0; j         if (a[i][j]-maxrow[i]-mincol[j] == 0 && yx[j] == -1) {
                xy[i] = j;
                yx[j] = i;
                return true;
            }
        for (int j=0; j         if (a[i][j]-maxrow[i]-mincol[j] == 0 && dotry (yx[j])) {
                xy[i] = j;
                yx[j] = i;
                return true;
            }
        return false;
    }

    int main() {

        // ... чтение a ...
        
        mincol.assign (n, 0);
        minrow.assign (n, 0);
        for (int i=0; i         for (int j=0; j             maxrow[i] = max (maxrow[i], a[i][j]);

        xy.assign (n, -1);
        yx.assign (n, -1);
        for (int c=0; c         vx.assign (n, 0);
            vy.assign (n, 0);
            int k = 0;
            for (int i=0; i             if (xy[i] == -1 && dotry (i))
                    ++k;
            c += k;
            if (k == 0) {
                int z = INF;
                for (int i=0; i                 if (vx[i])
                        for (int j=0; j                         if (!vy[j])
                                z = min (z, maxrow[i]+mincol[j]-a[i][j]);
                for (int i=0; i                 if (vx[i]) maxrow[i] -= z;
                    if (vy[i]) mincol[i] += z;
                }
            }
        }

        int ans = 0;
        for (int i=0; i         ans += a[i][xy[i]];
        printf ("%d\n", ans);
        for (int i=0; i         printf ("%d ", xy[i]+1);
    }

    * This source code was highlighted with Source Code Highlighter.


    9. Total


    If someone sees Hungarian for the first time. And after reading the description, and after it the listing - you will get a confident impression “yes, here on the listing and without these descriptions everything is clear that it was crucified”. I still hope that at least in part the description has added understanding to the algorithm. I will be sincerely happy for you! and, in turn, it will make me feel a little that I wrote, probably not in vain. =)


    Also popular now: