Graphs of road networks and algorithms for working with them
In mathematics, road networks (automobile and not only) are represented by a weighted graph. Settlements (or intersections) are the vertices of the graph, edges are roads, and weights of edges are distances along these roads.
For weighted graphs, many algorithms are proposed. For example, the popular Dijkstra algorithm for finding the shortest path from one vertex to another. All these algorithms have a common fundamental (for mathematics) feature - they are universal, i.e. can be successfully applied to graphs of any design. In particular, for each algorithm its complexity is known - it approximately corresponds to an increase in the execution time of the algorithm depending on the number of vertices of the graph. All this can be read in detail, for example, on Wikipedia.
Let's get back to practical tasks. Roads are represented by a weighted graph, but roads are not any graph. In other words, you cannot build a road network from any graph. Unlike a virtual graph as a mathematical abstraction, roads are built by people from real materials and cost quite a lot of money. Therefore, they are laid not as horrible as possible, but according to certain economic and practical rules.
We do not know these rules, however, working with road networks, it is quite possible to use algorithms that are effective for road graphs, although they are not suitable for graphs in a universal or mathematical sense. We consider here two such algorithms.
1. We will use weighted undirected graphs with non-negative edge weights. In particular, the roads within the region (country) are just such a graph.
2. Shortest distance matrix (MKR) - its small and simple example can be found in many road atlases. This sign is usually called something like this: "the distance between the most important cities." It looks like a part of the matrix below or above the main diagonal (from the upper left to the lower right corner), because on the other side of the main diagonal there are exactly the same numbers, in other words, the element M (i, j) = M (j, i). This happens because the graph, as mathematicians say, is non-oriented. Rows and columns correspond to cities (vertices of the graph). In reality, such a table is much larger, since all villages and intersections enter the top of the graph, except for cities, but it is naturally impossible to print such a large table in the atlas.
First of all, we will continue (mentally) our table to the upper part, we will obtain a MKR symmetric with respect to the main diagonal, and hereinafter we will keep in mind precisely such a table. In this case, a column with a certain number is equal to a row with the same number and it does not matter which concept to use. We use both to cross them among themselves.
Our MKR can be: a) known in advance, because we calculated it by one of the methods of searching for MKR; b) we may not know MKP, but determine it line by line as necessary. Line by line - this means that for the required line, distances are calculated only from the vertex corresponding to it to the remaining vertices, for example, by the Dijkstra method.
3. A couple more concepts. The eccentricity of a given vertex is the distance from this vertex to the most distant from it. The radius of the graph is the smallest of the eccentricities of all the vertices. The center of the graph is a vertex whose eccentricity is equal to the radius.
How it looks in practice. The center of the road network is the city or intersection that is the least distant from all other points in the network. Radius is the maximum distance from this central node to the farthest.
4. Degree of vertex - the number of edges that are attached to the vertex.
Graphs of road networks, the average degree of all the peaks is in the region from 2 to 4. It is quite natural - it is difficult and expensive to build intersections with a large number of adjacent roads, then it is no less difficult to use such a road network. Graphs with a low average degree of peaks are called sparse, as we see, the graphs of road networks are just such.
Note that a graph may have several centers, but we want to find any of them.
How is the problem solved in the general case? Full view of the FIBC. The maximum element in the row is searched (eccentricity of each vertex), and then the minimum is found from these maximum elements.
This is far from the fastest way. What is needed faster if, it would seem, the radius and center of the graph can be found once? For example, there are tasks and algorithms on them, where in the course of enumeration, the vertices are constantly “re-united” into groups, and the criterion for each group is its radius. In this case, the radius is recounted many times, and the speed of its search becomes an important parameter. How to find the radius faster?
The secret is that for graphs of road networks all the elements are not required to be viewed. In practice, just look at a very small part of all the lines.
Let's see how it turns out. Consider the values in one row of the MKP matrix, in other words, consider the distances from one vertex to all the others. It is easy to prove that the radius of the graph cannot be greater than the maximum value in this row, and cannot be less than the minimum value in this row. Mathematically speaking, we found the upper and lower bounds on the number, and if they match, we will find the number.
Suppose we found the values in only two rows A and B. Moreover, the maximum value in row A is equal to the minimum value in row B (this value will be at the intersection of column A and row B). It is easy to prove that A is the center of the graph, and the value found is its radius. The problem is solved.
It's great, but such a situation on the graphs of road networks is unlikely and solving the problem this way will not work. We’ll be trickier.
Take a couple of lines B1 and B2. From them we form the vector M in this way: M (i) = max [B1 (i), B2 (i)]. It is easy to prove that if for all rows i the value min (M (i)) is equal to the maximum value in column A, then, again, A is the center, and the min (M (i)) found is the radius.
If a pair of lines is not enough, you can take several lines, for example three: B1, B2 and B3, then M (i) = max [B1 (i), B2 (i), B3 (i)]. The peculiarity of the graphs of road networks is that many lines are not needed (it will be possible to keep within a dozen). This can be easily verified by experimenting on existing network graphs, downloading them from the Internet: link .
In the general case, and from the point of view of mathematics, this, of course, is not so. It is quite possible to construct a theoretical graph in which you have to use a lot of lines B (almost everything except A). It's just that it is impossible to build a real road network of this kind - there will not be enough money.
The last task remains. How to quickly find these successful lines B1, B2, etc. For graphs of real road networks, this is very simple and fast. These will be the vertices most distant from each other, but not necessarily the most distant (mathematically speaking, we do not need to find the diameter of the graph). We take any vertex, find the farthest one for it, again the farthest one for the new one, and so on, until a pair of peaks is the farthest for each other.
We got a pair of vertices B1 and B2. We find for the pair the vector M, as described above. The row in which we found min (M (i)) is a contender for the center, we denote it by A. If in column A the value min (M (i)) is the maximum, then the center and radius have already been found. If not, then the maximum value in column A corresponds to the distance to another vertex (not B1 and not B2). So, we got a new vertex B3 in the list to search for vector M. Alternatively, you can also search for the most distant vertex for B3, and if it is not B1 or B2, add it as B4. Thus, we increase the list of vertices B until the center and radius are found.
More strictly, with the algorithm and the necessary evidence, this algorithm is described in , the results of its use on some graphs of the US road networks are also presented there, and in the linkand the link is described less academically, but more clearly.
The most popular MKR search algorithms (Floyd-Warshall, for example) are described here . All of them are universal, and one of them - the Dijkstra algorithm with a binary heap - takes into account such a concept as a sparse graph. However, he also does not use the features of road networks.
We will use them on a completely different algorithm and on existing graphs we will get tens of times acceleration in comparison with Dijkstra's algorithm. We note right away that the peculiarity of this algorithm is that it searches for MKP, and at once everything is accurate (i.e., not approximately, not heuristically).
Consider the basic idea of the algorithm. Its essence is to remove the vertices of the graph without changing the shortest distances for the remaining points. If we do this, remembering at which points and at what distances the remote vertex was attached, we can delete all the points except one, and then collect them back into the graph, but with the distances already calculated.
Let's start with a simple one, from the top with degree 1. It can be removed in any case. No shortest paths pass through it, except for paths to the top itself, and they go exactly through that vertex to which the deleted vertex was attached.
Let A be a vertex with degree 2 and attached to the vertices B1 and B2. If the route B1-A-B2 is longer or equal to the edge B1-B2, no routes pass through point A, except for the routes to point A itself (all the others pass through B1-B2). So point A can be deleted. Otherwise, i.e. if B1-A-B2 is shorter than B1-B2 or there are no edges B1-B2 at all, the vertex A can be removed by setting the weight of the edge B1-B2 equal to the sum of the weights: | B1-A | + | A-B2 |. The route from A to other points passes either through B1 or through B2, if the distances for B1 and B2 are known, the distances from A are also easy to calculate.
Using the same principle, you can remove a vertex with any degree, replacing, as necessary, Bi-A-Bj with Bi-Bj. True, you need to understand that the greater the degree of the vertex, the more possible edges must be checked. For a vertex of degree n, this number is n (n-1) / 2.
Theoretically, in this way you can delete all the vertices in any graph, however, in the general case, we are in trouble with the increase in the number of edges. When removing vertices with degree n, the degree of vertices adjacent to the deleted one can: decrease by -1, not change, increase to n-2. It follows that when removing vertices of degree 3 or higher, the degree of the remaining vertices generally increases, the graph becomes less and less sparse and, in the end, removing the vertices will turn into a rather time-consuming task. The algorithm, in the general case, is extremely laborious and practically useless, but this is precisely in the general case.
Graphs of road networks have a unique feature of this kind: many peaks can be removed not only without growth, but also with a decrease in the degree of adjacent peaks. Moreover, if some vertex cannot be “successfully” deleted now, it can be “successfully” deleted later, after deleting some vertices adjacent to it.
Accordingly, we just need at each step to choose the correct vertices for deletion, starting with those that are deleted more “successfully”.
The algorithm itself can be viewed in more detail here . It describes how to remove a vertex, preserving the distances and paths between the remaining ones. This process is called disassembly. It describes how to restore the graph back later, adding vertices in the reverse order one at a time, and how to recalculate the MKP. This process is called assembly.
Also given are the results of using the algorithm on the US road network graphs link .
If we consider road networks not as graphs in general, but as graphs with some special properties, we can create and successfully apply more efficient algorithms for many practical problems.
For weighted graphs, many algorithms are proposed. For example, the popular Dijkstra algorithm for finding the shortest path from one vertex to another. All these algorithms have a common fundamental (for mathematics) feature - they are universal, i.e. can be successfully applied to graphs of any design. In particular, for each algorithm its complexity is known - it approximately corresponds to an increase in the execution time of the algorithm depending on the number of vertices of the graph. All this can be read in detail, for example, on Wikipedia.
Let's get back to practical tasks. Roads are represented by a weighted graph, but roads are not any graph. In other words, you cannot build a road network from any graph. Unlike a virtual graph as a mathematical abstraction, roads are built by people from real materials and cost quite a lot of money. Therefore, they are laid not as horrible as possible, but according to certain economic and practical rules.
We do not know these rules, however, working with road networks, it is quite possible to use algorithms that are effective for road graphs, although they are not suitable for graphs in a universal or mathematical sense. We consider here two such algorithms.
A few important concepts and conventions
1. We will use weighted undirected graphs with non-negative edge weights. In particular, the roads within the region (country) are just such a graph.
2. Shortest distance matrix (MKR) - its small and simple example can be found in many road atlases. This sign is usually called something like this: "the distance between the most important cities." It looks like a part of the matrix below or above the main diagonal (from the upper left to the lower right corner), because on the other side of the main diagonal there are exactly the same numbers, in other words, the element M (i, j) = M (j, i). This happens because the graph, as mathematicians say, is non-oriented. Rows and columns correspond to cities (vertices of the graph). In reality, such a table is much larger, since all villages and intersections enter the top of the graph, except for cities, but it is naturally impossible to print such a large table in the atlas.
First of all, we will continue (mentally) our table to the upper part, we will obtain a MKR symmetric with respect to the main diagonal, and hereinafter we will keep in mind precisely such a table. In this case, a column with a certain number is equal to a row with the same number and it does not matter which concept to use. We use both to cross them among themselves.
Our MKR can be: a) known in advance, because we calculated it by one of the methods of searching for MKR; b) we may not know MKP, but determine it line by line as necessary. Line by line - this means that for the required line, distances are calculated only from the vertex corresponding to it to the remaining vertices, for example, by the Dijkstra method.
3. A couple more concepts. The eccentricity of a given vertex is the distance from this vertex to the most distant from it. The radius of the graph is the smallest of the eccentricities of all the vertices. The center of the graph is a vertex whose eccentricity is equal to the radius.
How it looks in practice. The center of the road network is the city or intersection that is the least distant from all other points in the network. Radius is the maximum distance from this central node to the farthest.
4. Degree of vertex - the number of edges that are attached to the vertex.
Graphs of road networks, the average degree of all the peaks is in the region from 2 to 4. It is quite natural - it is difficult and expensive to build intersections with a large number of adjacent roads, then it is no less difficult to use such a road network. Graphs with a low average degree of peaks are called sparse, as we see, the graphs of road networks are just such.
Task 1. Search for the radius and center of the graph in the matrix of shortest distances
Note that a graph may have several centers, but we want to find any of them.
How is the problem solved in the general case? Full view of the FIBC. The maximum element in the row is searched (eccentricity of each vertex), and then the minimum is found from these maximum elements.
This is far from the fastest way. What is needed faster if, it would seem, the radius and center of the graph can be found once? For example, there are tasks and algorithms on them, where in the course of enumeration, the vertices are constantly “re-united” into groups, and the criterion for each group is its radius. In this case, the radius is recounted many times, and the speed of its search becomes an important parameter. How to find the radius faster?
The secret is that for graphs of road networks all the elements are not required to be viewed. In practice, just look at a very small part of all the lines.
Let's see how it turns out. Consider the values in one row of the MKP matrix, in other words, consider the distances from one vertex to all the others. It is easy to prove that the radius of the graph cannot be greater than the maximum value in this row, and cannot be less than the minimum value in this row. Mathematically speaking, we found the upper and lower bounds on the number, and if they match, we will find the number.
Suppose we found the values in only two rows A and B. Moreover, the maximum value in row A is equal to the minimum value in row B (this value will be at the intersection of column A and row B). It is easy to prove that A is the center of the graph, and the value found is its radius. The problem is solved.
It's great, but such a situation on the graphs of road networks is unlikely and solving the problem this way will not work. We’ll be trickier.
Take a couple of lines B1 and B2. From them we form the vector M in this way: M (i) = max [B1 (i), B2 (i)]. It is easy to prove that if for all rows i the value min (M (i)) is equal to the maximum value in column A, then, again, A is the center, and the min (M (i)) found is the radius.
If a pair of lines is not enough, you can take several lines, for example three: B1, B2 and B3, then M (i) = max [B1 (i), B2 (i), B3 (i)]. The peculiarity of the graphs of road networks is that many lines are not needed (it will be possible to keep within a dozen). This can be easily verified by experimenting on existing network graphs, downloading them from the Internet: link .
In the general case, and from the point of view of mathematics, this, of course, is not so. It is quite possible to construct a theoretical graph in which you have to use a lot of lines B (almost everything except A). It's just that it is impossible to build a real road network of this kind - there will not be enough money.
The last task remains. How to quickly find these successful lines B1, B2, etc. For graphs of real road networks, this is very simple and fast. These will be the vertices most distant from each other, but not necessarily the most distant (mathematically speaking, we do not need to find the diameter of the graph). We take any vertex, find the farthest one for it, again the farthest one for the new one, and so on, until a pair of peaks is the farthest for each other.
We got a pair of vertices B1 and B2. We find for the pair the vector M, as described above. The row in which we found min (M (i)) is a contender for the center, we denote it by A. If in column A the value min (M (i)) is the maximum, then the center and radius have already been found. If not, then the maximum value in column A corresponds to the distance to another vertex (not B1 and not B2). So, we got a new vertex B3 in the list to search for vector M. Alternatively, you can also search for the most distant vertex for B3, and if it is not B1 or B2, add it as B4. Thus, we increase the list of vertices B until the center and radius are found.
More strictly, with the algorithm and the necessary evidence, this algorithm is described in , the results of its use on some graphs of the US road networks are also presented there, and in the linkand the link is described less academically, but more clearly.
Problem 2. Search for the matrix of shortest distances
The most popular MKR search algorithms (Floyd-Warshall, for example) are described here . All of them are universal, and one of them - the Dijkstra algorithm with a binary heap - takes into account such a concept as a sparse graph. However, he also does not use the features of road networks.
We will use them on a completely different algorithm and on existing graphs we will get tens of times acceleration in comparison with Dijkstra's algorithm. We note right away that the peculiarity of this algorithm is that it searches for MKP, and at once everything is accurate (i.e., not approximately, not heuristically).
Consider the basic idea of the algorithm. Its essence is to remove the vertices of the graph without changing the shortest distances for the remaining points. If we do this, remembering at which points and at what distances the remote vertex was attached, we can delete all the points except one, and then collect them back into the graph, but with the distances already calculated.
Let's start with a simple one, from the top with degree 1. It can be removed in any case. No shortest paths pass through it, except for paths to the top itself, and they go exactly through that vertex to which the deleted vertex was attached.
Let A be a vertex with degree 2 and attached to the vertices B1 and B2. If the route B1-A-B2 is longer or equal to the edge B1-B2, no routes pass through point A, except for the routes to point A itself (all the others pass through B1-B2). So point A can be deleted. Otherwise, i.e. if B1-A-B2 is shorter than B1-B2 or there are no edges B1-B2 at all, the vertex A can be removed by setting the weight of the edge B1-B2 equal to the sum of the weights: | B1-A | + | A-B2 |. The route from A to other points passes either through B1 or through B2, if the distances for B1 and B2 are known, the distances from A are also easy to calculate.
Using the same principle, you can remove a vertex with any degree, replacing, as necessary, Bi-A-Bj with Bi-Bj. True, you need to understand that the greater the degree of the vertex, the more possible edges must be checked. For a vertex of degree n, this number is n (n-1) / 2.
Theoretically, in this way you can delete all the vertices in any graph, however, in the general case, we are in trouble with the increase in the number of edges. When removing vertices with degree n, the degree of vertices adjacent to the deleted one can: decrease by -1, not change, increase to n-2. It follows that when removing vertices of degree 3 or higher, the degree of the remaining vertices generally increases, the graph becomes less and less sparse and, in the end, removing the vertices will turn into a rather time-consuming task. The algorithm, in the general case, is extremely laborious and practically useless, but this is precisely in the general case.
Graphs of road networks have a unique feature of this kind: many peaks can be removed not only without growth, but also with a decrease in the degree of adjacent peaks. Moreover, if some vertex cannot be “successfully” deleted now, it can be “successfully” deleted later, after deleting some vertices adjacent to it.
Accordingly, we just need at each step to choose the correct vertices for deletion, starting with those that are deleted more “successfully”.
The algorithm itself can be viewed in more detail here . It describes how to remove a vertex, preserving the distances and paths between the remaining ones. This process is called disassembly. It describes how to restore the graph back later, adding vertices in the reverse order one at a time, and how to recalculate the MKP. This process is called assembly.
Also given are the results of using the algorithm on the US road network graphs link .
Conclusion
If we consider road networks not as graphs in general, but as graphs with some special properties, we can create and successfully apply more efficient algorithms for many practical problems.