kvl November 6, 2012 at 18:10

Learning to think and write in Erlang (using two combinatorial problems as an example)

From the sandbox

- ... Then I give him a muzzle ... No, you can not beat!
“The fact of the matter is that you cannot beat,” Panikovsky sighed hypocritically. - Bender does not allow.
I. Ilf, E. Petrov. Golden calf .

The callous Braga lived in a transparent vessel and was so
strong that even horror. She wasn’t just from the stomach - she
rushed straight from the mouth to the head and began to throw from side to side,
breaking mental supports and fortifications.
M. Uspensky. Where we do not have .

Perhaps everyone who first begins to study Erlang feels himself in the position of Shura Balaganov, who was forbidden to use the only accessible and understandable method: "you can’t beat ...". Erlang lacks such familiar to most modern languages concepts as reassigning a variable and, accordingly, accumulating the result in one variable. (In fairness, it should be noted that Erlang-style behavior of the “global multiple-changing variable” type can still be implemented. For this, there is a hash dictionary in each process that stores programmer-defined key-value pairs. There are built-in functions put (Key, Value), get (Key) and a few other auxiliary functions, but using such a dictionary in applications is considered a bad style and is recommended only in exceptional cases (http://www.erlang.org/doc/man/erlang.html\#put-2 )). As a result, iterations in a loop cannot be implemented with the usual accrual of values of an iterative variable. The accumulation of the result is carried out only through recursion, and the organization of cycles through the tail recursion. (Of course, iteration and accumulation of the result in a loop can be implemented through library functions for lists: foreach (Function, List), lists: foldl (Function, StartValue, List), lists: foldr (Function, StartValue, List) ( http : //www.erlang.org/doc/man/lists.html ) and their analogues for sets ( http://www.erlang.org/doc/man/sets.html , http://www.erlang.org /doc/man/ordsets.html , http://www.erlang.org/doc/man/gb_sets.html ) and arrays (http://www.erlang.org/doc/man/array.html ). But our goal is to learn how to write cycles, and not use ready-made solutions, so here we will refrain from using such libraries).

Thus, in Erlang you have to break the usual patterns of thinking and replace them with new patterns that are characteristic only for this programming language. Of course, the ideal remedy is a brainwash capable of breaking all “mental supports and strongholds”. But for us this is perhaps too radical a means, and we will go the other way.

In the life of St. Anthony the Great there is a story about one of his disciples. A disciple stood in the church and listened to St. Anthony reading the Psalter. As soon as the first verse of the first psalm sounded:
Блажен муж, который не ходит на совет нечестивых...
the student left the temple. Since then no one has seen him for almost 30 years, and when he reappeared in the church, Anthony the Great asked why he had left them so long and where he had disappeared. The disciple replied: “Father, I heard the words of the psalm, and retired to the desert to try to fulfill what is said in these words, that is, do not go to the council of wicked thoughts. " In other words, he learned a practical lesson from these words, and now has come to read further. Unfortunately, we do not have such a reserve of time, and our goals are not so exalted. But the basic concept can be adopted.
We will consider two standard combinatorial problems:

search for all possible permutations from a given set of N elements
search for all possible combinations from a given set of N elements

and we will analyze the various approaches and methods for solving them using the Erlang programming language, in order to understand and master some features of programming in this language using concrete examples.

All examples are collected in the combinat.erl module and are available at: https://github.com/raven29/combinat_erl.git

Two goals - two recursions

As a solution algorithm, we will use direct recursive enumeration. This means that at each step of the recursion is performed:

looping over all possible values from the list and adding each of these values to existing results;
recursive call with the transfer of a new list from which already used values are excluded.

In other words, the procedure consists of two stages: cyclic search at a given level of recursion and transition to the next level. Accordingly, these two goals require a double recursive call. (The algorithm is based on naive enumeration and, therefore, requires optimization. However, it is quite suitable for our purposes, because, firstly, due to its simplicity and clarity, it makes it easy to illustrate the methods and techniques used, and secondly, it is applied to both problems under consideration with minimal differences, which makes it possible to carry out appropriate comparisons and parallels).

Base implementation

The basic functionality is implemented in the combinat: permuts_out (List, Number) and combinat: combs_out (List, Number) functions, which print all permutations and all combinations of Number length from the elements of the List, respectively. The following is a permuts_out (List, Number) function that prints permutations. This function is called recursively twice: in line 6 - for looping through, in line 7 - to go to the next level of recursion. It is in this last call that the result is incremented in the variable [RemainH | Result], as well as the exclusion of the elements contained in this result from the general list passed to the next level of recursion. The fourth argument to List is the source list of items that is required to correctly calculate the remainder only in case of permutations.

permuts_out(List, Number) -> permuts_out(List, [], Number, List).
permuts_out(_Remain, Result, Number, _List) when length(Result) == Number ->
    io:format("~w~n", [Result]);
permuts_out([], _Result, _Number, _List) -> ok;
permuts_out([RemainH|RemainT], Result, Number, List) ->
    permuts_out(RemainT, Result, Number, List),
    permuts_out(List -- [RemainH|Result], [RemainH|Result], Number, List).

A similar function for combinations differs from the previous function only in a simpler rule for calculating the transmitted remainder and in the absence of the fourth argument List.

combs_out(List, Number) -> combs_out(List, [], Number).
combs_out(_Remain, Result, Number) when length(Result) == Number ->
    io:format("~w~n", [Result]);
combs_out([], _Result, _Number) -> ok;
combs_out([RemainH|RemainT], Result, Number) ->
    combs_out(RemainT, Result, Number),
    combs_out(RemainT, [RemainH|Result], Number).

Two recursions - two functions

For greater clarity, two recursive calls can be represented by two different functions. The endings in the names of functions correspond to their purposes: * _iteration - for iterating over the remainder list at a given recursion level, * _recursion - for moving to the next recursion level.

permuts_out_2(List, Number) -> permuts_out_iteration(List, [], Number, List).
permuts_out_iteration([], _Result, _Number, _List) -> ok;
permuts_out_iteration([RemainH|RemainT], Result, Number, List) ->
    permuts_out_iteration(RemainT, Result, Number, List),
    permuts_out_recursion(List -- [RemainH|Result], [RemainH|Result], Number, List).
permuts_out_recursion(_Remain, Result, Number, _List) when length(Result) == Number ->
    io:format("~w~n", [Result]);
permuts_out_recursion(Remain, Result, Number, List) ->
    permuts_out_iteration(Remain, Result, Number, List).
combs_out_2(List, Number) -> combs_out_iteration(List, [], Number, List).
combs_out_iteration([], _Result, _Number, _List) -> ok;
combs_out_iteration([RemainH|RemainT], Result, Number, List) ->
    combs_out_iteration(RemainT, Result, Number, List),
    combs_out_recursion(RemainT, [RemainH|Result], Number, List).
combs_out_recursion(_Remain, Result, Number, _List) when length(Result) == Number ->
    io:format("~w~n", [Result]);
combs_out_recursion(Remain, Result, Number, List) ->
    combs_out_iteration(Remain, Result, Number, List).

Perhaps this option can be considered antipattern, due to excessive cumbersomeness.

Present the result!

If you want to write a library function, then the output to the standard stream is of little use. You need to get the result and pass it in some form to the client code. Erlang has no global variables or static fields to accumulate the result. Instead, approaches specific to functional languages can be used:

return values in upward recursion
callback function
executing stream and storage stream

Consider each option in detail.

Bear there - bear back

So far, we have been doing something useful (if displaying the result on a computer screen can be considered useful) in the body of the function, going down deep into descending recursion. In the reverse movement, the result returned by the function was not used in any way. In other words, the upward recursion movement was "empty." With this approach, it is impossible to put together all the displayed values, because they are not connected in any way. More productive is the use of the result returned by the function when exiting recursion. In this case, the first call to the recursive function can return the entire cumulative result. The modification of the basic functions is given below and includes three points:

returning the result [Result] instead of output to the standard stream (line 3, 11)
at the bottom of the recursion, the initial value is returned - an empty list [] instead of the ok atom (line 4, 12)
accumulation of the result by summing the lists "++" instead of a sequential call (line 6, 14)

The permuts_res (List, Number) and combs_res (List, Number) functions return a list of lists containing, respectively, all permutations and combinations of the length Number.

permuts_res(List, Number) -> permuts_res(List, [], Number, List).
permuts_res(_Remain, Result, Number, _List) when length(Result) == Number ->
    [Result];
permuts_res([], _Result, _Number, _List) -> [];
permuts_res([RemainH|RemainT], Result, Number, List) ->
    permuts_res(RemainT, Result, Number, List) ++
    permuts_res(List -- [RemainH|Result], [RemainH|Result], Number, List).
combs_res(List, Number) -> combs_res(List, [], Number).
combs_res(_Remain, Result, Number) when length(Result) == Number ->
    [Result];
combs_res([], _Result, _Number) -> [];
combs_res([RemainH|RemainT], Result, Number) ->
    combs_res(RemainT, Result, Number) ++
    combs_res(RemainT, [RemainH|Result], Number).

And do what you like with him!

Sometimes it’s convenient not to accumulate the result in one collection variable, but to do something useful with each element immediately after its creation. This approach allows you to increase productivity and significantly reduce memory consumption. You can implement it using the callback function, which is passed as an additional argument. Relevant options are listed below.

permuts_clb(List, Number, Callback) -> permuts_clb(List, [], Number, List, Callback).
permuts_clb(_Remain, Result, Number, _List, Callback) when length(Result) == Number ->
    Callback(Result);
permuts_clb([], _Result, _Number, _List, _Callback) -> ok;
permuts_clb([RemainH|RemainT], Result, Number, List, Callback) ->
    permuts_clb(RemainT, Result, Number, List, Callback),
    permuts_clb(List -- [RemainH|Result], [RemainH|Result], Number, List, Callback).
combs_clb(List, Number, Callback) -> combs_clb(List, [], Number, Callback).
combs_clb(_Remain, Result, Number, Callback) when length(Result) == Number ->
    Callback(Result);
combs_clb([], _Result, _Number, _Callback) -> ok;
combs_clb([RemainH|RemainT], Result, Number, Callback) ->
    combs_clb(RemainT, Result, Number, Callback),
    combs_clb(RemainT, [RemainH|Result], Number, Callback).

In the Callback variable, the name of any function from one argument can be passed (with arity one - according to Erlang terminology). So, for example, you can call to print all permutations of the elements [1,2,3] 2:

combinat:permuts_clb([1,2,3], 2, fun(X)->io:format("~w~n",[X]) end).

A more meaningful application of the callback function is discussed in the next section.

Big Brother is watching you

Another way to implement accumulating the result of branching recursion in Erlang is to use two threads. One thread is the executor, our program is launched in it. Another thread is an observer, a recursive function passes the results to it. When the executing thread completes its work, the observing thread displays the total result. Important: as the executing thread, you cannot use the main thread (supervisor), in which the erl shell is running, because this thread is not destroyed after program execution. It continues to exist until the erl application is unloaded.

Below is the corresponding implementation. Line 3 sets the "exit on the ladder", which ensures the transmission of a message from the associated process even in the event of its normal completion. By default, the trap_exit flag is set to false, which means receiving a message from the associated process only in case of abnormal termination. In line 5, the executing thread starts (and simultaneously binds). In this thread, the permuts_clb (or combs_clb) function is launched, which receives the arguments List, Number, as well as the Callback callback function, which passes each unit result to the observer process:

fun(R)->Supervisor!R end

In line 6, the loop ([]) function starts with an empty initial value of the total result. This function listens for messages from the executing stream. When the next result is received, a loop (Total ++ [Result]) is recursively called (line 14) with an argument supplemented by the newly arrived result from the executing stream. Upon completion of the work of the executing thread, a “exit by the ladder” occurs: a special-type tuple (line 10) is passed to loop (), according to which, as a result of pattern matching, the overall result is displayed (line 11) and the connection with the executing stream is broken (line 12 ) Supervisor - pid of the observer thread, Worker - pid of the executor thread.

%% Function = permuts_clb | combs_clb
proc(Function, List, Number) ->
    process_flag(trap_exit, true),
    Supervisor = self(),
    spawn_link(combinat, Function, [List, Number, fun(R)->Supervisor!R end]),
    loop([]).
loop(Total) ->
    receive
    {'EXIT', Worker, normal} ->
        io:format("~w~n", [Total]),
        unlink(Worker);
    Result ->
        loop(Total ++ [Result])
    end.

The function is called with permuts_clb or combs_clb as the first argument, depending on the task being solved. For example, printing out all permutations from elements [1,2,3] by 2 is done through a call:

combinat:proc(permuts_clb, [1,2,3], 2).

There may be errors typical of a beginner. First, you cannot run loop () before spawn_link (). It would seem that it will be more reliable, since the listener function, which was launched before the execution thread started, will probably not miss a single message from this thread. But as a result, the process will hang on loop (), and the next line will never be called. Secondly, using the Supervisor variable to send a message to the observer thread is mandatory:

self()!R

does not work, since the self () function will be executed when called in the executing stream and, accordingly, will accept the pid of the executing stream. Thanks to w495 and EvilBlueBeaver for constructive criticism of these comments (and just for helping to figure it out).

And one more small remark: when experimenting with the proc () function, different oddities can happen, for example, a function can start to produce a result with a “delay”, i.e. at each call, return the result of the previous call. This may be due to the fact that we are launching the main thread as an observer thread. Therefore, after any failure, the next call to loop () will first process all messages from the last call, if any. In this sense, it will be more reliable to implement the listener stream also in a separate stream generated by the spawn () or spawn_link () function.

To understand is to enable

Some programming languages have a syntactic construct called " list comprehension ". It allows you to set an iterative traversal of the list in a compact and elegant form, as a result of which a new list is generated, each element of which is obtained from the original list by applying some function to each element of the original list. The design is based on the notation of mathematical set theory. Here, for example, in the list comprehension construct, the output of the squares of all integers from 1 to 9 looks like:

[X*X || X <- [1,2,3,4,5,6,7,8,9]].

In list comprehension, it is also possible to pass multiple lists and impose conditions. As an example, consider the output of the multiplication table from 1 to 9:

[io:format("~w * ~w = ~w~n", [I, J, I*J]) || 
    I <- [1,2,3,4,5,6,7,8,9], J <- [1,2,3,4,5,6,7,8,9]].

as well as a multiplication table, in which repeated results with permutation of factors are excluded:

[io:format("~w * ~w = ~w~n", [I, J, I*J]) || 
    I <- [1,2,3,4,5,6,7,8,9], J <- [1,2,3,4,5,6,7,8,9], I < J].

In Russian-language literature, “list comprehension” is translated as “inclusion of lists”, “list generation”. The main meaning of the translation “comprehension” is “to comprehend”, “to understand”. So, to understand in English is to include.

It should be noted that the comprehension construct exists not only for lists, but also for collections of other types. Erlang has list comprehension and binary comprehension .

The most elegant of its kind

In the list comprehension construct, iterative traversal of the list can be specified, as a result, the basic function will take the form:

permuts_comp(List, Number) -> permuts_comp(List, [], Number).
permuts_comp(_Remain, Result, Number) when length(Result) == Number ->
    io:format("~w~n", [Result]);
permuts_comp(Remain, Result, Number) ->
    [permuts_comp(Remain -- [R], [R] ++ Result, Number) || R <- Remain].

The permuts_comp function calls itself recursively from list comprehension.
This is perhaps the most elegant permutation function of all possible.

If you can’t, but really want to ...

The generalization of the previous result to the function of combinations, unfortunately, is not so obvious. The fact is that in the list comprehension in this case it is necessary to transfer not the entire list, but only the remainder determined by the number from the previous call. If there was an array instead of lists, it would be possible to easily calculate the desired index. But there are no arrays in the basic Erlang types. Either use the array library or organize the array manually.
This turns out to be quite simple, and the corresponding implementation is presented below. From the original List, build a list of ListIndexed tuples, each element of which contains elements of the original list and an integer index (line 2). The function lists: zip (List1, List2) from the standard lists module is suitable for such a conversion. When displaying the result, we use the lists: unzip (ListIndexed) function, which returns the indexed list with its original form without indexes (line 5). And most importantly - in list comprehension, you can now easily specify the required restriction on the indices included in the iteration (line 11).

combs_comp(List, Number) ->
    ListIndexed = lists:zip(List, lists:seq(1, length(List))),
    combs_comp(ListIndexed, [], Number).
combs_comp(_Remain, Result, Number) when length(Result) == Number ->
    {ResultValue, _I} = lists:unzip(Result),
    io:format("~w~n", [ResultValue]);
combs_comp(Remain, [], Number) ->
    [combs_comp(Remain -- [R], [R], Number) || R <- Remain];
combs_comp(Remain, [{HValue,HIndex}|T], Number) ->
    [combs_comp(Remain -- [{R,I}], [{R,I}] ++ [{HValue,HIndex}|T], Number) ||
    {R,I} <- Remain, I > HIndex].

It looks somewhat awkward, and this is the only program among our examples in which I had to resort to the library functions lists: zip (List1, List2), lists: unzip (ListTuple), lists: seq (StartValue, Length). This attempt can also be considered an example of antipattern, because for the task at hand, it would be more consistent to use the array module, but this will be a different story ...

Tags: