Gray codes and enumeration tasks

    This article will show a mathematical approach to compiling algorithms using the following questions and tasks as an example:
    • Binary codes of Gray. Their existence. Enumeration of subsets of a given set in the order of minimal change.
    • Existence and implementation of enumeration of subsets of k elements in the order of minimal change.

    So let's get started.

    Gray Binary Code


    A gray binary code of order n is a sequence of all imagen- bit codes in which any two neighboring codes differ in exactly one bit.

    An example of Gray codes of order 2:
    • 00
    • 01
    • eleven
    • 10


    It is easy to notice that such a sequence is not the only one (it can at least be reversed). Therefore, let's look at the existence of Gray binary codes of other orders and at the same time select one kind of such sequences for further work.

    Existence


    The existence of Gray codes is easily proved by induction.
    Base - image- an empty sequence.
    Transition. Let be imagea sequence of Gray codes of order image. image- a sequence imagewritten in reverse order. Then the sequence is imageobtained by assigning zero to the left of each code in the sequence imageand units to the left of each code imageand their subsequent union, i.e. image. It is easy to verify that the property of Gray codes in the sequence imageconstructed in this way is not violated. image

    Gray codes obtained in this way are called reflective or symmetrically reflected.. In the future, we will consider the gray reflective binary codes.

    As an exercise write out symmetrically reflected Gray code order 3 of the proof given in formula image.

    Now let's move on to the application of Gray codes to solve some problems of combinatorics.

    Enumeration of subsets of a given set in the order of minimal change


    Consider the following problem: “Given a set of n elements. It is required to deduce all of its subsets in such a way that each subsequent subset is obtained from the previous one by deleting or adding exactly one element (that is, in the order of minimal change). ”

    We assume that the elements of our set are numbered from 1 to n . Then, any set of elements of the set can be associated with an n- bit binary code: if the i- th element of the set is included in the set, then the i- th bit of the code is one, otherwise zero. Thus, enumeration of all subsets is reduced to enumeration of binary codes of order n, and enumerating the subsets in the order of minimal change - to enumerating the binary Gray codes.

    The idea of ​​a recursive enumeration algorithm for Gray codes certainly follows from the proof of their existence, namely, the recurrence relation given in it image.

    imageActually, as stated in the proof, the Gray code of order n is obtained from the Gray code of order n-1 by assigning zero to the left (this corresponds to calling the function and assigning it in front of it) and the reverse Gray code of order n-1 by assigning one to the left (this corresponds to calling the function and assignment in front of him).

    We proceed to assess the complexity of this algorithm. Obviously, the function call treeimageis a complete binary tree of height n . Therefore, the total number of function calls during the generation of Gray codes of order n is none other than image, which is equal to the number of all possible n- bit codes. Therefore, the algorithm is optimal.

    As an exercise, the reader is invited to analyze the following, shorter recursive search algorithm.
    Example


    I would also like to say about the existence of an iterative search algorithm and other useful properties of Gray codes that are not considered in this article, and which can be learned from other sources, for example Wikipedia.

    Now let's move on to the last and most interesting task.


    Enumeration of subsets of k elements in the order of minimal change.


    Let's start with the statement of the problem: “A set of n elements is given. It is required to deduce all its subsets consisting of exactly k elements in such a way that each subsequent subset is obtained from the previous one by replacing exactly one element (that is, in the order of minimal change). ”

    Here the idea of ​​the algorithm from the previous problem immediately comes to mind, but we will only print those Gray codes in which exactly k units. In this case, the question arises of the correctness of this algorithm: no one guarantees us that the neighboring codes in a sequence imagein which only binary codes with the number of units equal to k are left, are arranged in the order of minimal change - with a difference of exactly two digits - obtained from each other by replacing one element. Therefore, we proceed to a further study of the properties of symmetrically-reflected Gray codes.

    We introduce some notation:
    • imageIs the string obtained by concatenating the character x ( m times) and concatenating the character y ( k times). Example image.
    • image- the sequence of all n- bit codes with k units, in which the codes are in the order in which they are located in the sequence image.
    • image- the sequence of all n- bit codes with k units, in which the codes are in the order in which they are located in the sequence image.


    Lemma

    The first binary code with imageunits in the sequence imagehas the form image, and the last - imageor image, if image.
    Evidence
    We prove by induction on imageand image.
    Base - image- true (it is easy to verify the above formulas for the two codes 0 and 1).
    Transition. Let the formulas be true for imageanyone image. Let us recall the formula of the proof of the existence of the Gray codes: image. From this we get that the first code with imageunits in the sequence imagewith the zero assigned to the left is the first code with imageunits in the sequence image. When imagewe get the only code with imageunit, for which the formula is also true. With the imagelast code with imageunits in the sequence, imagethere is the first code with the imageunit and the unit assigned to the left in the sequence image. Atimage the formula is also true. image

    Theorem

    The adjacent binary codes in the sequence imagediffer exactly in two digits.
    Evidence
    We will also prove by induction on imageand image.
    Base - image- obviously (there is only one possible code in the sequence).
    Transition. Let the theorem hold for imageand image. With imageor imagewe have only one code in the sequence image. When imageneighboring words are completely in imageor in imageby the induction hypothesis, they differ exactly in two categories. It remains to check one pair of neighboring codes in the imagelast code with imageunits in imageand the first code with imageunits in image(the last with imageunit in imagewith unit assigned to the left). By the lemma, for image, they have the form imageand, imagerespectively, and forimage- imageand image. It is easy to see that they differ in exactly two categories. image

    Now we can state the remarkable consequences of the theorem.
    • The naive algorithm proposed earlier is correct, that is, if imageonly binary codes in which exactly k units are left in the sequence , then they will be arranged in the order of minimal change (each following is obtained from the previous one by replacing one element).
    • The sequence imageis given by the following recurrence relation image. This can be verified by induction.
    • The sequence imageis given by the following recurrence relation image. Also verified by induction.


    These consequences can and should be used to write a program for recursively enumerating subsets of k elements in the order of minimal change.
    I would like to draw attention to the fact that binary code output also means filling in the remaining bits with ones or zeros, depending on the values ​​of u and v .

    We proceed to estimate the complexity of the presented algorithm. Note that when deepening into recursion, the value of u always decreases. That is, we will make no more than n recesses to achieve some binary code. There are a total of such codes . We obtain the resulting asymptotic behavior . This is a rather rough estimate, but it is in every way better image, especially whenk tending to n : the algorithm is almost linear.

    It is easy to notice that the call tree will be binary, but not complete. Let us draw a parallel of complexity assessment in terms of trees: each leaf corresponds to receiving some binary code, and the distance to the leaf in the tree does not exceed n . The estimate tells us that we simply summed up the upper estimate of the lengths - n - of all the paths to the sheets. But after all, many of these paths have a length less than n , and even more paths intersect with each other. This should lead us to the idea that the asymptotic behavior of the presented algorithm is actually much better.

    An exact estimate of the asymptotics is far from a trivial fact, which can be found in the book by E. Reingold, J. Nievergelt and N. Deo “Combinatorial Algorithms. Theory and practice ”, or try to prove it yourself.

    On this I would like to end the article. Thank you and see you soon.

    Also popular now: