Pessimal Algorithms and Simplexity Analysis
(Translation)
“Simplexity is the process by which nature achieves simple results in complex ways.” - Bruce Schiff
1. Introduction
Imagine the following problem: we have a table of n integer keys A1, A2, ..., An and an integer value X. We need to find the index of X in this table, but we are in no hurry to do so. In fact, we would like the search to take as long as possible.
For this task we could settle on the trivial algorithm, namely going through all the Ai in order and comparing each of them with X. But it may happen that X = A1, and the algorithm will stop at the very first step. Thus the naive algorithm has a best-case time complexity of O(1). The question arises: can we improve (that is, worsen) this result?
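For concreteness, here is a minimal Python sketch of the naive scan (the article itself gives no listing for it); in the best case, when X happens to equal A1, it answers after a single comparison.

    def naive_search(a, x):
        # scan the keys in order and stop at the first match
        for i, key in enumerate(a):
            if key == x:
                return i          # best case: i == 0, one comparison
        return None               # x is not in the table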
Of course, we could slow this algorithm down enormously by inserting empty loops before the first comparison of X with A1. Unfortunately, this method will not do, because any fool will notice that the algorithm is simply wasting time. We therefore need an algorithm that keeps advancing towards its goal despite having no enthusiasm for it, nor indeed any desire ever to reach it.
We can build an algorithm that satisfies this requirement, and that is much better (that is, slower) than the naive one, if we keep table A sorted in ascending order. We can then use a procedure that performs a leisurely search, sketched below.
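The article gives the procedure as a listing that is not reproduced in this copy; the Python sketch below is only a reconstruction, written to match the properties claimed in the next paragraph: it halves the search range at every step, dutifully wanders through the half that cannot contain X before descending into the half that can, tests equality only on single-element ranges, never repeats a comparison, and stops as soon as X is found.

    def leisurely_search(a, x, lo, hi):
        # a is sorted; the top-level call is leisurely_search(a, x, 0, len(a) - 1)
        if lo == hi:
            return lo if a[lo] == x else None     # equality is tested only here
        mid = (lo + hi) // 2
        if x <= a[mid]:
            # x can only be in the left half, so first stroll through the right one
            leisurely_search(a, x, mid + 1, hi)   # cannot succeed, but we look anyway
            return leisurely_search(a, x, lo, mid)
        else:
            leisurely_search(a, x, lo, mid)       # equally doomed, equally dutiful
            return leisurely_search(a, x, mid + 1, hi)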
The number of comparisons made by this algorithm depends on neither X nor the Ai, and it is given by a recurrence whose solution is linear in n, as sketched below.
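The recurrence itself is missing from this copy; assuming the half-splitting sketch above (one order comparison per split and one equality test per single-element range), the comparison count T(n) would satisfy

    T(1) = 1
    T(n) = T(⌈n/2⌉) + T(⌊n/2⌋) + 1   for n > 1,

whose solution is exactly T(n) = 2n - 1, that is, Θ(n).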
Thereby we obtain roughly an n-fold deterioration compared to the naive algorithm. Note that the leisurely search algorithm's lack of enthusiasm is not at all apparent from its behaviour: it tests X against some Ai every O(1) operations, never compares the same pair of values twice, and stops as soon as the sought value is found. Few search algorithms, honest or otherwise, can match such performance.
2. Generalization
The leisurely search procedure is an example of a completely new area of Computer Science: the design and analysis of leisurely algorithms. Intuitively, a leisurely algorithm for a problem P is an algorithm that wastes time in such an intricate way that it manages to fool the average naive observer. We can state this in strict mathematical language by saying that A is a leisurely algorithm for problem P if and only if the condition below holds, where N is the set of naive observers, t is time, W(A, t, w, P) is the predicate "A wastes time t in way w while solving problem P", and F(w, o) is the predicate "the way w is intricate enough to fool observer o". We make no assumptions about the finiteness of the set N.
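The displayed condition is missing from this copy; given the predicates just defined, one plausible reconstruction of it is

    ∀t ∀o ∈ N ∃w : W(A, t, w, P) ∧ F(w, o),

that is, for every amount of time t and every naive observer o there is some way w in which A wastes time t while solving P and which is intricate enough to fool o.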
When studying leisurely algorithms, the performance of an algorithm A is best described by its inefficiency, or best-case running time: the minimum, as a function of n, of A's running time over all possible inputs of length n. The complexity of a problem P is the maximum inefficiency among the leisurely algorithms that solve P. An algorithm is called pessimal for problem P if its best-case inefficiency is asymptotically equal to the complexity of P.
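In symbols (ours, not the article's), with T_A(x) denoting the running time of A on input x:

    ineff_A(n) = min { T_A(x) : x is an input of length n }
    compl_P(n) = max { ineff_A(n) : A is a leisurely algorithm for P }
    A is pessimal for P  ⇔  ineff_A(n) and compl_P(n) are asymptotically equal.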
Leisurely algorithms have many important practical applications. For example, the leisurely search algorithm is particularly well suited to real keys (real not in the mathematical sense, but in the sense of physical keys that open doors and cabinets). Leisurely search is the only known algorithm that accurately emulates the behaviour of a bunch of such keys.
3. The problem of finding a path in interesting graphs
Table search can be seen as a special case of the following, more general problem. Suppose we have a "maze": an undirected graph G with n vertices and a vertex u marked as the "entrance". Our task is to find a path from vertex u to the vertex v marked as the "exit", moving through the maze along one edge per unit of time. Following the classical analysis of algorithms, we would most likely reach at once for one of the efficient shortest-path algorithms. Suppose, however, that the maze is quite interesting inside, and that in principle we would be glad to spend a few extra cycles of the algorithm searching for the vertex v. In fact, we hope... no, we positively wish the search to take as long as possible. Although our sense of duty does not allow us to abandon the search altogether, we cannot long ignore the primitive needs of our human nature. Besides, what is wrong with a relaxed approach to a problem if we do what is required of us anyway? Moreover, we were always told that if you hurry you only make people laugh, and in any case nobody needs to be perfect. And so on, and so forth...
Taking all this into account, we see that this problem fits squarely within the subject area of our theory.
In fact, this problem has been widely studied in Graph Theory, where it is known as the problem of finding the most careless path. An important branch of Operations Research, called Anemic Programming, is entirely devoted to ineffective methods for solving it. What do we know about its complexity? As Wagner showed earlier [Wagner], if nothing is known about the location of v, the best-case time can be as low as O(1): at any step, even the very first, we risk stumbling upon v and falling out of the maze, however much we wish to avoid it. However, as Homer showed [Homer], if the graph lies in the plane (or on a flat globe), and we have an oracle that, like a compass, points in the direction of the target, we can postpone the moment of arrival until we have visited most of the vertices of the graph. In fact, the time by which we can delay this moment is limited not so much by the complexity of the problem as by its monotony [1]. The inefficiency of Homer's algorithm is Ω(n), which matches the lower bound on the complexity of the most-careless-path problem. (A sketch of such a delaying walk is given after the footnote.)
—————————
[1] Also known as dullness.
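Homer's algorithm itself is not reproduced in this copy; the Python sketch below is only an illustration, under our own assumptions, of how a compass-like oracle lets a walker put off arrival: at every vertex the directions the oracle rates as closest to the exit are explored last, so on suitable mazes the walker backtracks through most of the graph before finally stepping onto v.

    def delaying_walk(graph, entrance, exit_v, distance_to_exit):
        # graph: dict mapping each vertex to the list of its neighbours
        # distance_to_exit(w): the oracle's estimate of how far w is from the exit
        visited = {entrance}
        path = [entrance]

        def walk(u):
            if u == exit_v:                   # stepping here ends the stroll
                return True
            # the most promising directions (closest to the exit) come last
            for w in sorted(graph[u], key=distance_to_exit, reverse=True):
                if w not in visited:
                    visited.add(w)
                    path.append(w)            # one edge per unit of time
                    if walk(w):
                        return True
                    path.append(u)            # backtrack along the same edge
            return False

        walk(entrance)
        return path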
Both the leisurely search method and Homer's careless-path algorithm rest on the same idea, known as the Method of Feeble Descent. We would also like to mention in passing one more important design paradigm for leisurely algorithms, described by Homer in the same work, to which the author gave the vivid name Penelope's Trick. It is based on a for loop whose step alternates between positive and negative values from one iteration to the next. Unfortunately, this technique (now known as backtracking) has become so famous that even the most naive observer recognizes it at once, so today it is of historical interest only.
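Purely as an illustration (ours, not Homer's), here is the shape of the trick in Python: the index weaves two steps forward by day and unravels one step back by night, so it oscillates while still creeping towards its bound and takes roughly twice as many iterations as an honest loop.

    def penelope_range(n):
        # yields 0, 2, 1, 3, 2, 4, 3, ... until the index reaches n
        i, weaving = 0, True
        while i < n:
            yield i
            i += 2 if weaving else -1     # the step alternates between +2 and -1
            weaving = not weaving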
4. Backward search
A rather similar task is the systematic enumeration of all n vertices of a connected graph G. This problem has been well studied in the classical Theory of Algorithms and is usually solved by the well-known depth-first search [Vern1] or breadth-first search [Vern2] algorithms, whose best-case time complexity belongs to Ω(n).
For a long time this was considered to be the upper limit on the complexity of the problem, until on October 4, 1984 at 14:17 the leisurely-algorithms community was shaken by the discovery of a traversal algorithm whose inefficiency belongs to Ω(n²) on an important class of graphs. The backward search method, as its inventor called it, works as follows. Like its predecessors, the algorithm assigns to the vertices v1, v2, ..., vn of the graph G the integers λ(v1), λ(v2), ..., λ(vn) from 1 to n. The algorithm is embodied in the recursive procedure bwfs. We assume that all the numbers λ(v) are initially zero. The recursion starts with the call bwfs(v1, 1).
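The article's listing of bwfs is not reproduced here, and the translator's note at the end suggests it did not survive intact. Purely as a hypothetical stand-in, the Python sketch below only illustrates the kind of effect claimed in the text: a depth-first search burdened with a second, pointless for loop, so that removing the inner loop leaves ordinary DFS, while keeping it drives the work up to Θ(n²) on graphs that are simple paths. It makes no attempt to reproduce the backward numbering discussed below.

    def bwfs(graph, v, i, lam):
        # graph: dict mapping each vertex to the list of its neighbours
        # lam:   labels, all initially 0; the traversal starts with bwfs(graph, v1, 1, lam)
        lam[v] = i
        nxt = i + 1
        for w in graph[v]:              # the usual depth-first loop over neighbours
            for u in graph:             # a second, pointless for loop: a roll-call of
                if lam[u] != 0:         # every vertex labelled so far; deleting this
                    pass                # loop turns the procedure back into plain DFS
            if lam[w] == 0:
                nxt = bwfs(graph, w, nxt, lam)
        return nxt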
We leave it to the reader, as an instructive exercise, to prove the correctness of the algorithm and to show that its inefficiency really does belong to Θ(n²) on graphs that are straight lines. An interesting feature, from the point of view of the pessimality of the memory consumed, is that the recursion depth of bwfs can reach Θ(n²), depending on the starting vertex. The inefficiency of this algorithm on general graphs remains an open problem, but it appears that the algorithm never runs faster than O(n√n).
It is interesting to note that it is enough to remove one of the for loops to turn bwfs into the depth-first search we all know. However, the order in which the vertices of the graph are visited for the first time is very strange and confusing, and it does not at all resemble the order produced by the depth-first algorithm.
By definition, the backward numbering of the vertices of a graph is the set of values λ(v) assigned to the vertices by this algorithm. As with depth-first and breadth-first numberings, this numbering has several interesting properties; owing to space restrictions we give only a couple of them. If the edges of the graph are oriented so that the graph is acyclic, then either λ(head(e)) ≥ λ(tail(e)) for all edges e, or λ(head(e)) ≤ λ(tail(e)) for all edges e. Moreover, if d is the maximum degree of the vertices of the graph, then for any two adjacent vertices u and v it holds that |λ(u) - λ(v)| ≤ d·log(min(λ(u), λ(v))). These and other properties of backward numbering make it very important from the point of view of combinatorics.
5. Slowsort
No other problem shows the power and elegance of leisurely algorithms as well as sorting n given numbers. This task has a long and rich history whose beginnings can be traced far back, certainly earlier than the emergence of leisurely algorithms as a recognized discipline in the second half of last Wednesday. Thanks to the efforts of many pioneers of the field, the inefficiency of sorting algorithms has grown steadily, from the modest Ω(n log n) of Merge Sort, to the Ω(n√n) of Shell Sort with suitable increments, to the Ω(n²) of Bubble Sort, and finally to the cunning Ω(n³) sorting algorithm recently described by Bentley [Bentley]. (Apparently it was first published by Steele, Woods, Finkel, Crispin, and Goodfellow [SWFCG].)
One of the most important results of modern complexity theory is the proof that the sorting problem can be solved in time Ω(n^(log n / (2 + ε))); this was the first problem found to have non-polynomial complexity. The elegant algorithm that achieves such inefficiency is called Slowsort and is described below.
The Slowsort algorithm is a perfect example of the Multiply and Surrender paradigm, which is perhaps the most important paradigm for the development of leisurely algorithms. The idea of the Multiply and Surrender strategy is to replace the problem at hand with two or more subproblems, each only slightly simpler than the original, and then to keep multiplying the subproblems recursively for as long as possible. At some point all the subproblems will be so simple that their solution can no longer be postponed, and at that moment we are forced to surrender. Experience shows that in most cases, by that time the total amount of wasted work will be much greater than if we had used a more direct approach.
To understand the Multiply and Surrender strategy better, let us follow step by step the development of the Slowsort algorithm. We can split the task of sorting the numbers A1, A2, ..., An in increasing order into two subproblems: (1) find the maximum of these numbers, (2) sort the rest. Subproblem (1) can be further divided into: (1.1) find the maximum of the first ⌈n/2⌉ elements, (1.2) find the maximum of the last ⌊n/2⌋ elements, (1.3) find the larger of these two values. Finally, subproblems (1.1) and (1.2) can be solved by sorting the corresponding elements and taking the last (read: the maximum) element of the sorted array. We have thus multiplied the original problem into three only slightly simpler subproblems (plus overhead): sort the first half, sort the second half, sort all the elements except one. We continue in this way recursively until all the multiplied arrays contain at most one element, at which point we are forced to surrender.
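A Python sketch of Slowsort following the decomposition just described (the indices and names are ours, the structure is the article's):

    def slowsort(a, i, j):
        # sorts a[i..j] in place by multiplying and surrendering
        if i >= j:
            return
        m = (i + j) // 2
        slowsort(a, i, m)            # (1.1) sort the first half; its maximum ends up at a[m]
        slowsort(a, m + 1, j)        # (1.2) sort the second half; its maximum ends up at a[j]
        if a[m] > a[j]:              # (1.3) the larger of the two is the overall maximum,
            a[m], a[j] = a[j], a[m]  #       so put it at the very end
        slowsort(a, i, j - 1)        # (2)   sort everything except the maximum

A whole table is sorted with slowsort(A, 0, len(A) - 1).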
The recurrence defining the running time of Slowsort will look familiar to those who have read Section 3. In general it comes out as T(n) = 2T(n/2) + T(n - 1). The Hamming distance between this formula and the well-known Merge Sort recurrence T(n) = 2T(n/2) + cn is only 5, but a simple finite-difference argument shows that this is enough for the first recurrence to have no polynomially bounded solution. In fact, it can be shown that the solution satisfies the following inequalities [1]:
C_a·n^(log n / (2 + ε)) ≤ T(n) ≤ C_b·n^(log n / 2)

for any fixed ε > 0 and some constants C_a and C_b. The idea of the proof (and we were told that our article would be printed only if it contained at least one proof) is to assume that T(n) = C1·n^(C2·ln n) for some constants, and to substitute this guess into the recurrence.
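The substitution step is not shown in this copy; a back-of-the-envelope reconstruction of it, under the guess just stated, runs roughly as follows. Writing T(n) = C1·n^(C2·ln n) = C1·exp(C2·(ln n)²), and using ln(n - 1) ≈ ln n - 1/n and ln(n/2) = ln n - ln 2, we get

    T(n - 1) ≈ T(n)·exp(-2·C2·ln n / n) = T(n)·(1 - o(1)),
    2·T(n/2) = 2·exp(C2·(ln 2)²)·T(n)·n^(-2·C2·ln 2).

Substituting into T(n) = 2T(n/2) + T(n - 1), the difference T(n) - T(n - 1), which is about T(n)·2·C2·ln n / n, must match 2T(n/2), which is about T(n)·n^(-2·C2·ln 2); taking logarithms forces 2·C2·ln 2 to be close to 1, up to lower-order terms.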
————————
[1] We use “log” for the base 2 logarithm and “ln” for the natural logarithms.
Taking C2 = 1/(2 ln 2) we get that T(n) ≤ C_b·n^(log n / 2), and taking C2 = 1/((2 + ε) ln 2) we get that T(n) ≥ C_a·n^(log n / (2 + ε)) for sufficiently large n. (The constants C_a and C_b are simply fudge factors needed to start the induction.) In the true spirit of leisurely algorithms, a more detailed proof will be supplied by the authors in due course, in the form of scanned images of punched cards holding the proof in EQN, recorded on 7-track EBCDIC tapes with odd parity.
As for practical applications, Slowsort is obviously the most suitable algorithm whenever your boss sends you to Paris to sort something. One of its nice properties is that the number of inversions in A never increases, so in a certain sense (the sense in which you are in Paris with all expenses covered) Slowsort never takes a wrong step.
6. Conclusions and unsolved problems
For a long time Computer Science studied only the worst-case and average-case behaviour of algorithms. With this publication we attempt, for the first time, to correct the obvious discrimination against the study of best-case behaviour, and we can only hope that others will follow.
The analysis of Slowsort led us to the following conjecture, which we call the Increasing Hypothesis (IH): if the complexity of a problem is O(g·f), where g and f are functions of the length of the input and f = o(g), then the complexity of that problem is Θ(g^f).
The Augmented Increasing Hypothesis (AIH) claims that if the complexity of a problem is O(g + f), then its complexity is Θ(g·f). Obviously, the AIH implies the IH.
Proving or refuting the IH is one of the main challenges of Complexity Theory. We are, however, forced to end with the sad observation that proving the IH is most likely impossible, owing to the well-known incompleteness of Peano arithmetic.
Acknowledgements
We would like to thank Ed Lazowska and Lyle Ramshaw for their leisurely help.
References
[Bentley] J. L. Bentley, Programming Pearls, CACM 27 (1984), 287-291
[Wagner] R. Wagner, Tannhäuser (libretto in French and German), L'avant-scène, Paris, 1984
[Vern1] J. Verne, Journey to the Center of the Earth (English translation), Heritage Press, 1966
[Vern2] J. Verne, Around the World in 80 Days (English translation), International Collectors Library, 1956
[Homer] Homer, Iliad and Odyssey, translated by Samuel Butler, Encyclopedia Britannica, Chicago, 1952
[SWFCG] G. Steele et al., The Hacker's Dictionary, Harper and Row, 1983
Translator's note
The article was published in ACM SIGACT NEWS, 16(3):49-53, 1984. It was later re-typeset from a poor photocopy by Gordon Lack in 2000, and at the end of 2013 it was translated into Russian by Valentin Simonov (valyard).
Unfortunately, on a careful reading the backward search algorithm appeared to me to be broken. Perhaps some details of the algorithm were lost in the re-typesetting, and we will never know exactly what the authors had in mind. Or perhaps this is a special test of attentiveness, which would be entirely in keeping with the general spirit of the publication.
In any case, this text made me laugh long and hard in places, and I hope the translation gives you as much pleasure.