Maximum flow of minimum cost

    The transport problem (classical) is the problem of the optimal plan for transporting goods from warehouses to points of consumption on vehicles.

    For the classical transport problem, two types of tasks are distinguished: a criterion of cost (achieving a minimum cost of transportation) or distances and a criterion of time (a minimum of time is spent on transportation).

    There are a lot of text under the cut . tells one of the options for solving this problem "in pictures" for those who are little familiar with graphs . Listing attached.


    The article somehow slipped on Habré where the question was raised, and whether articles about the basic algorithms are necessary. I decided to respond to requests and talk a little about algorithms on graphs and their practical application. I did not know what level of knowledge to rely on, so I chose one relatively complex and theoretically-practical algorithm so that the article was at least partially applied in nature. At the same time I will try to tell you very much even for those who are not particularly familiar with graphs.

    Task (fairy tale):You are the owner of a certain factory that produces “Goods”, and recently you were lucky to sign a contract with one large company located in another city for the supply of goods to their retail network. Since it is very far away (in Vladivostok), goods will have to be delivered by air. During the telephone conversations, the Partner inquired, “ But what volume of deliveries per day can we expect? ". You thought ... You have your own trucks (truckers) transporting. The airport is far away. After reviewing the accumulated statistics of transportation, you found that in your own area during transportation there are some restrictions: on the roads there are points of cargo inspection, weight control, some roads are completely repaired. Call it all “ bandwidth.»Expensive a day. Based on these conditions, you need to know: how many boxes of “Goods” per day can you bring to the airport? At the same time, you want to effectively conduct business and deliver goods by the shortest routes, because it is the wear of tires, mechanisms, and generally depreciation expenses.

    Total: how many boxes can you transport to the airport per day, taking into account the capacity of the roads, while keeping the total distance of the routes to a minimum?

    The task is the most on the graphs. The solution will be built gradually.
    There are no big problems, there are just a lot of little problems. (c)

    I will tell the algorithms, so to speak , because there are plenty of descriptions on the net.

    Basic concepts. Graph? Baron?


    The road map in our case is presented in the form of a graph. The vertices are intersections, and the edges of the graph are roads. The ribs (roads) are assigned their characteristics: distance (to the next intersection), as well as daily throughput.

    In the code, a graph is represented either in the form of adjacency lists or adjacency matrices. For simplicity, we will use the adjacency matrix. If in the adjacency matrix at the intersection of [u] and [v] the vertices are “1”, it means that these vertices (intersections) are connected by an edge (road). It is not necessary to designate exactly “1”, in the matrix it is very convenient to store other useful information attributed to the edge: for example, distance, and throughput (in a similar matrix).



    The figure shows a matrix symmetric with respect to the main diagonal, i.e. M [u] [v] = M [v] [u]. This means that we are given an undirected graph and you can go along the edge in any direction (round-trip). If in the matrix M [u] [v] = 1, and in the opposite direction M [u] [v] = 0, then the graph is directed and we can go along the edge only from the vertex [u] to [v].

    The road capacity of our roads will be written into the matrix C [..] [..], which, generally speaking, is a directed graph. After all, we need roads in order to drive from the "factory" in the direction of the "airport". A directed graph with given bandwidths (plant and airport) is called a network .

    When for a graph it is necessary to calculate a certain characteristic, but not massively “from all-to-all,” but let's say the distance from one vertex to the rest, it is much more convenient to use an array (less memory). Those. let's say in the [u] cell of the array dist [..] we will store the distance to the [u] top from the "factory". Similarly, we will use arrays when traversing the graph in order to mark already visited vertices (mark), record how many boxes we brought (push), and where we came from to the vertex (pred).

    OK. Excellent. We know how to convert our map to a graph. And how will we deliver the boxes to the airport? We need to be able to find a way from the "factory" to the "airport". For this we will use ...

    The breadth first search algorithm, BFS.


    So far we only consider: the adjacency (neighborhood) of the vertices of the graph, not considering bandwidths and distances.

    BFS is one of the most basic algorithms that form the basis of many others.
    A simple description (the picture will be below). We are now standing in some starting (plant) peak [s], from which only neighboring peaks are visible along the edges. And we really need to get to the top [t], which is located somewhere in this graph, as soon as possible. Next we do this. We look along the edges (namely, free roads) of our neighbors' peaks: are there [t] among them. If not, then we record all (first discovered) neighbors in the “need to go there” queue. When we looked through all the neighbors - we mark our peak - "have already been here." We get the first unvisited peak from the queue and go to it.

    We continue the search in the same way. At the same time, we ignore those peaks that we once visited (not a step back). If you meet [t] on the road - excellent, the goal is achieved!

    In order not to go into the same intersections several times, we will mark them in the mark [..] array. After examining the neighbors from the [u] peaks, we put mark [u] = 1, which means that we have already “visited” the [u] th intersection.

    In the picture: at the tops - serial numbers are written



    After completing the algorithm, we get the following picture:



    Note the main features:
    • each vertex we get exactly (no more than) once
    • put the vertices in the queue when they are first viewed
    • from our plant we radially (undulating) find the other vertices in the graph
    • "Inspection radius" is constantly increasing
    • when we find the "airport", the number of ribs (roads) between the "factory" and the "airport" will be minimal. T.O. we will find the "airport" by the quickest inspection of the count
    • We look only on the free roads on which you can transport boxes!
    • Ehhh ... until we take into account real distances (mileage).

    Now we know how to find the way by which our goods boxes can be transported to the airport. Okay ... We’ll take them along the road, and mark this on our map. This mark - “how many boxes, on which road (edge) and in which direction we are taking” we will call “ flow ”. We will note this in the matrix (flow) F [..] [..]. Those. on the way from [u] to [v] we carry F [u] [v] boxes.

    It's time to run into reality - you have to reckon with the "bandwidth", which is indicated by the matrix (capacity) C [..] [..]. Indeed, on the way from [u] to [v] we can transport no more than C [u] [v] boxes. That's a pity.

    We acted farsightedly. While we were looking for the “airport” with BFS, while we were marking the “visited peaks”, we also noted from which intersection we arrived — we recorded it in the pred [..] array. Those. we reached the vertex [v] from the vertex pred [v]. And just beforehand, we brought in another useful array: push [v], that is, how much we could “push” the boxes to the intersection [v] along some road [u] - [v].
    And they kept it up to date: push [v] = min (push [u], C [u] [v] -F [u] [v]);

    Due to this, until we have to once again “unwind” the trajectory from the “airport” to the “factory” in the reverse order to calculate how many boxes we can carry along this route.

    Push [v] = push ["airport"] = flow = here! how many boxes were taken to the airport on the way found. Once we unwind the route along all edges (roads), and add a “flow” flow to all edges of the path.

    But, even though the problem is about real quantities: the number of boxes, throughput and distances, you may still have to face a “ minus ” ...

    Increase Flow (or Ford-Fulkerson Algorithm)


    Now we take into account: the adjacency (neighborhood) of the vertices of the graph, the directional bandwidth of the edges, but we do not consider distances yet.

    When we increase the flow (of boxes) from the top [u] to the top [v], we naturally perform the operation: F [u] [v] + = flow, but in the opposite direction we decrease the flow F [v] [u] - = flow; That's why. This situation is possible:

    In the picture: on the ribs - signed (stream / throughput)



    For the first time, carrying a stream of 3 boxes to the top [i] and finding the edge [i] - [j]: We transported min (push [i], C [i] [j] - F [i] [j]) = min (3, 3-0) = 3 boxes, and marked this as F [i] [j] + = 3, and in the opposite direction we set: F [j] [i] - = 3.

    The second time, Once at the top [j], we try to push min (push [j], C [j] [i] -F [j] [i]) = min (6, 0 - (- 3)) = min (6, 3) = 3 to the vertex [i]. Against stream +3, we pushed -3 boxes and received compensation for the flow along this road. But in the direction of the “airport” in the next iteration, we additionally sent the remaining 3 boxes.

    Interpretation: from the warehouse [j] we called to the warehouse [i], and said: “Keep your 3 boxes for yourself - find another use for them, we brought our 3 instead.” Although the algorithm itself kindly found them application.

    We continue to search for the stream:


    We agreed to persistently continue to look for ways to the "airport", while we succeed, and to transport boxes on them. Roughly speaking, this is called the maximum flow search algorithm , or the Ford-Fulkerson algorithm . And since we use BFS to “open” new delivery routes, this is called the Edmonds-Karp algorithm .

    When we “saturate” the roads completely by transporting our boxes, we will answer to the Partner the question “How many boxes a day can we transport to the airport?”. But it's time to think about their own depreciation costs ... Tires, gasoline, wear ...

    It has already become clear that when BFS searches for a graph, we will have to deal with negative values, such as reverse flow (and it has consequences in “financial terms”), even if we are talking about distances. In general, it’s time to consider additionally the distances ...

    Distances Hit the road, Jack!


    It is time to completely finish this task: the adjacency (neighborhood) of the vertices of the graph, the directed throughput of the edges, the distance.

    We continue to run BFS until we load the roads with our boxes “to the stop”:



    Now let's look at what happened. We will check from the “airport” side: if some box reached us over a distance of 15 km, that means if we refused it, then we would save 15 km. travel (i.e. subtract 15), but you need to try to find (attach) him another way of movement.

    Let's try to walk along the ribs in the forward (on free roads) and reverse (pushing back and saving) directions from the “airport”:

    In the picture: on the ribs - signed (flow / throughput), and above - the distance



    In the picture above, we found a “negative cycle” -6, still striding along the available (free or pushing upstream) ribs. Making one revolution in it, we can reduce the distance for the vertices participating in it by -6. This means that you can save on the delivery of boxes transported in a cycle. Just letting the boxes go “in a loop.” In the picture above - we will save 6 km. the way.

    Now we know how to solve the problem, but in order to detect these cycles ... Consider:

    Bellman-Ford Algorithm


    It is used to find the shortest distance from the vertex [s] to the remaining vertices. But unlike BFS, the short path will not be in the sense of the number of edges of the graph along this path, but in the sense of the summed "distance" along the edges of the path.

    But we do not need it for this. One of its key features that distinguishes it from Dijkstra's algorithm is that it is able to work on graphs where the weight of the edges can be given by a negative number. The algorithm can detect a side effect of such graphs - cycles of negative magnitude. What we need!

    The algorithm is somewhat more complicated. This implementation does not correlate with Cormen’s bible, but it also works fine. In appearance, it somewhat resembles BFS, and therefore I will try to explain it, starting from it.

    Starting from a certain vertex, we look at neighboring vertices along the “accessible” edges and try to improve the distance to them in the dist [..] array and make it as small as possible. This process is called "relaxation." If we “found” (along the edges) such vertices, then we update the distances for them and put their vertices in the queue “try to improve the graph from them”. Very similar to BFS! But we don’t mark the peaks (“already visited”) and if you have to go to the same peak twice, then we will do it.

    But the question is, are we prepared for the fact that there will be “negative cycles” that can be rotated forever, reducing the distance to the peaks? The process will not end. Therefore, the "radius of inspection" of the vertices is limited by the number N (the number of vertices themselves). This will be guaranteed enough to calculate the minimum distance to any vertex, and most importantly, the algorithm will end in any case.

    To do this, put the first vertex in the queue, and after it the “stub”, thus indicating that the vertices in the queue are in the “inspection radius 0”. When, taking out the next peak from the queue, we suddenly get our “stub” - we set up a new one, designating the next “inspection radius”. Here, in general, is the whole logic of the algorithm. =)

    Improving the distance to the peaks is verified by the following inequality:
    dist [v]> dist [u] + edge_cost (u, v)

    In the picture: on the edges - the length, and in the tooltip - the shortest distance found at the moment



    Note the main features (unlike BFS):
    • we can get to the top several times if the distance to it has been improved. Having then entered this peak, we will try to improve the distances already from it
    • the peaks to which the distance has been improved are queued
    • from our plant we radially (undulating) along the distances (and not along the edges) we find the remaining peaks
    • We’ll only look at the free roads where you can transport (or push back) the boxes. Therefore, given distances and throughput

    Viewing the graph in “radius N (number of vertices)” ensures that for all the vertices we have found the minimum distance. And there is nothing more to reduce. And if some of the peaks are “drawn” into the “negative cycle”, then it can be easily detected by checking for violation of equality. Indeed, in a cycle, the distances infinitely decrease:
    dist [v]> dist [u] + edge_cost (u, v)

    Therefore, if this inequality holds for the vertex [v], it means that it participates in a negative cycle. What is needed! “Unwinding” from her the path along which we got into it, we will spin along the (her) cycle.

    Everything - the cycle is discovered! All that remains is to direct the flow of boxes along it "backwards", and thereby increase the efficiency of conducting business.

    Minimum Cost Max Flow Algorithm:


    • We start finding the Edmonds-Carp Max Flow:
      • While BFS finds its way from the "factory" to the "airport"
        • carry boxes on it
    • So far, the Bellman-Ford algorithm finds "negative cycles":
      • Turn the flow in the "negative cycles" back.
    • Got the maximum flow of minimum cost (distance)


    All. We call the Partner and inform you how many goods per day we can deliver to him. And we think how to apply the money saved. =)

    int C[MAX_N][MAX_N];    // Матрица "пропускных способностей"
    int F[MAX_N][MAX_N];    // Матрица "текущего потока в графе"
    int P[MAX_N][MAX_N];    // Матрица "стоимости (расстояний)"
    int push[MAX_N];        // Поток в вершину [v] из начальной точки
    int mark[MAX_N];        // Отметки на вершинах, в которых побывали
    int pred[MAX_N];        // Откуда пришли в вершину [v] (предок)
    int dist[MAX_N];        // Расстояние до вершины [v] из начальной точки
    int N, M, s ,t;         // Кол-во вершин, ребер, начальная и конечные точки
    int max_flow;
    int min_cost;

    void file_read()
    {
        int u, v, c, p;
        in >> N >> M >> s >> t; N++;
        
        for(int i = 0; i < M; i++)
        {
            in >> u >> v >> c >> p;
            C[u][v] = c;
            P[u][v] = p;
            P[v][u] = -p;
        }
    }

    int edge_cost(int u, int v)
    {
        if( C[u][v] - F[u][v] > 0 ) return P[u][v];
        else return MAX_VAL;
    }

    int check_cycles()
    {
        for(int u = 1; u < N; u++)
            for(int v = 1; v < N; v++)
                if( dist[v] > dist[u] + edge_cost(u, v) )
                    return u;

        return MAX_VAL;
    }

    void init()
    {
        for(int i = 1; i < N; i++)
        {
            mark[i] = 0;
            push[i] = 0;
            pred[i] = 0;
            dist[i] = MAX_VAL;
        }
    }

    // Алгоритм Поиска в ширину

    int bf(int s)
    {
        init();
        queue Q;
        pred[s] = s;
        dist[s] = 0;

        Q.push(s);
        Q.push(MAX_N);
        
        int u, series = 0;
        while( !Q.empty() )
        {
            while( Q.front() == MAX_N )
            {
                Q.pop();
                if( ++series > N ) return check_cycles();
                else Q.push(MAX_N);
            }

            u = Q.front(); Q.pop();
            for(int v = 1; v < N; v++)
                if( dist[v] > dist[u] + edge_cost(u, v) )
                {
                    dist[v] = dist[u] + edge_cost(u, v);
                    pred[v] = u;
                    Q.push(v);
                }
        }
    }

    // Алгоритм Беллмана-Форда

    int bfs(int s, int t)
    {
        init();
        queue Q;
        mark[s] = 1;
        pred[s] = s;
        push[s] = MAX_VAL;
        
        Q.push(s);
        while( !mark[t] && !Q.empty() )
        {
            int u = Q.front(); Q.pop();
            for(int v = 1; v < N; v++)
                if( !mark[v] && (C[u][v]-F[u][v] > 0) )
                {
                    push[v] = min(push[u], C[u][v]-F[u][v]);
                    mark[v] = 1;
                    pred[v] = u;
                    Q.push(v);
                }
        }

        return mark[t];
    }

    // Алгоритм Форда-Фалкерсона

    void max_flow_ff() 
    {
        int u, v, flow = 0;

        while( bfs(s, t) )
        {
            int add = push[t];

            v = t; u = pred[v];
            while( v != s )
            {
                F[u][v] += add;
                F[v][u] -= add;
                v = u; u = pred[v];
            }
            flow += add;
        }

        max_flow = flow;
    }

    // Алгоритм вычисления Максимального поток минимальной стоимости

    void min_cost_flow()    
    {
        max_flow_ff();
        
        int u, v, flow = 0;
        int add = MAX_VAL;
        int neg_cycle;

        neg_cycle = bf(t);
        while( neg_cycle != MAX_VAL )
        {
            v = neg_cycle; u = pred[v];
            do
            {
                add = min(add, C[u][v]-F[u][v]);
                v = u; u = pred[v];
            }
            while( v != neg_cycle );

            v = neg_cycle; u = pred[v];
            do
            {
                F[u][v] += add;
                F[v][u] -= add;
                v = u; u = pred[v];
            }
            while( v != neg_cycle );

            neg_cycle = bf(t);
        }

        for(int u = 1; u < N; u++)
            for(int v = 1; v < N; v++)
                if( F[u][v] > 0 )
                    min_cost += F[u][v] * P[u][v];
    }

    void file_write()
    {
        out << max_flow << endl;
        out << min_cost << endl;
    }

    void main()
    {
        file_read();
        min_cost_flow();
        file_write();
    }

    * This source code was highlighted with Source Code Highlighter.


    // And if we took such conditions: peaks - intersections of streets, edges - roads, throughput - the number of lanes (allowed speed, etc.) minus the current number of cars on these roads. We will find the maximum flow from “A” street to “B” street - what are the currently not free roads in the city? Of course, much more parameters need to be taken into account, but the basis is graphs. It is interesting. =)

    Total


    Don’t scold me very much, please.

    Starting a post about graphs, I didn’t know what level of competence to count on: I decided not to talk just about, say, Dijkstra for now, because its accessible description is very easy to find on the net . Suddenly, every second writes by heart. But I remember for sure that the Habr-comrades were interested in the practical side of these algorithms. Therefore, he took the "spherical" problem and in its terms tried to clearly tell about the graphs.

    I hope that someone will be interested to read about graphs and an example of their application. Moreover, I was prompted to write an article by a desire to remind students (schoolchildren), or graduate students who teach them programming, about one of the most famous programming olympiads ACM ICPC! For those who have not yet decided (did not dare) in the early years of the university, to assemble a team, stay up late in computer classes, discuss the asymptotic behavior of algorithms, come up with solutions and counter-examples for them. Algorithms are interesting and a good reason to get together, and the experience of team play is priceless. Join now!

    Also popular now: