What is wrong with for loops?

Original author: Adam Turoff
  • Translation
The possible addition of closures to Java has become a hot topic of discussion. A proposal is being prepared to add closures to the upcoming version of the language, but the proposed syntax, combined with the overall complexity of the language, has drawn criticism from many Java programmers.

Today, Elliott Rusty Harold posted his doubts about the possible changes. The central question he asks is: "Why do you hate the for loop?" ("Why Hate the for Loop?").

I don't know what makes people dream so passionately of getting rid of for loops. This is not the first, or even the second, time that theorists from the world of computer science have rebelled against them.

It would be wrong to think that Elliott is the only one who doubts the need for these poor, miserable closures. For the most part, he complains that after reading a recent article by Bruce Tate (who uses Ruby for his examples) he saw no value in the proposed introduction of closures in Java:

Listing 1. The simplest closure.
3.times {puts "Inside the times method."}

Results:
Inside the times method.
Inside the times method.
Inside the times method.

Here times is a method of the object 3; it executes the closure three times. The closure is the block
{puts "Inside the times method."}
This is an anonymous function that, when passed to the times method, prints the string "Inside the times method." This code is much clearer and simpler than the alternative that uses a for loop:

Listing 2. A loop without closures.
for i in 1..3
 puts "Inside the times method."
end

Even I, presented with such a pale and lifeless introduction to closures, would hardly have seen their real value. My first reaction to these examples is: "Well, perhaps there is some subtle difference here." The remaining examples in Bruce's article are equally trivial, equally unclear, and shed no light on the subject of discussion.

I won't discuss Elliott's complaints about the Ruby code - that topic is not worth a damn. I will also sidestep the debate about the syntax proposed for closures in Java, and the question of whether they fit into the language at its current stage of development. I am not interested in that issue, and, frankly, it does not matter to me when or how those problems are resolved, or whether they are resolved at all.

However, Elliott raises a really important question: "Why do you hate the for loop?"

Here is a common example:

double sum = 0;
for (int i = 0; i < array.length; i++) {
    sum += array[i];
}

What does it do? I have been programming for many years, and it does not take me long to understand this piece of code - it is an ordinary summation of array elements. However, to read this example I have to scan about thirty tokens scattered across four lines. Of course, some syntactic sugar could make the code shorter. But the bottom line is that there are many places where you can make a mistake while writing an ordinary summation.

Want proof? Consider the following example from Elliott's article:

String s = "";
for (int i = 0; i < args.length; i++) {
    s += array[i];
}

Notice the bug? Even if this code compiles and passes review, it may take weeks before the first error report arrives and several more before the fix is released. And these are just simple for loops. Now imagine how much harder the work becomes with larger, and perhaps nested, loops. (If bugs still don't worry you, think about ordinary typos. And then think about how often you write loops like these.)

If you could write a simple for loop in one line, with fewer characters, it would be much easier to read. A shorter form means a much lower probability of errors in the code, and when errors do appear, they are (for the most part) much easier to spot.

How could closures help in this situation? Here is the first example, in Haskell:

total = sum array

Well, all right, that wasn't quite fair. The sum function does not use closures directly - it is defined in terms of a fold, which, in turn, uses closures.

total = foldl (+) 0 array

Here is a second example of applying closures:

s = concat array
s = foldr (++) [] array

Perhaps using the odd-looking functions foldl and foldr to explain closures is not the most obvious approach for programmers who are familiar only with loop constructs. However, it highlights the main weakness of for loops: they try to combine three different operations - filtering, folding, and transforming.

The two examples above show how you can take a list and collapse it into a single element. In the functional programming world such operations are called folds. A fold takes a function (a closure), an initial value (the accumulator), and visits the first element of the list. The closure is applied to the accumulator and the first element of the list, and the result is stored in the accumulator. The fold then applies the closure to the updated accumulator and the second element of the list. The process continues to the end of the list, at which point the final value of the accumulator is the result of the fold.

Here is a visual explanation:

s = foldl (+) 0       [1, 2, 3]
 = foldl (+) (0 + 1) [2, 3]
 = foldl (+) 1       [2, 3]
 = foldl (+) (1 + 2) [3]
 = foldl (+) 3       [3]
 = foldl (+) (3 + 3) []
 = foldl (+) 6       []
 = 6

Haskell provides several fold functions: foldl starts processing the list at its first element and moves sequentially toward the last; foldr, by contrast, starts at the last element and moves toward the first. There are other variants, but these two are the basic ones.
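
To make the difference concrete, here is a minimal sketch of how the two basic folds could be written by hand (myFoldl and myFoldr are made-up names so as not to shadow the standard versions):

myFoldl :: (b -> a -> b) -> b -> [a] -> b
myFoldl _ acc []     = acc                      -- empty list: return the accumulator
myFoldl f acc (x:xs) = myFoldl f (f acc x) xs   -- update the accumulator, continue left to right

myFoldr :: (a -> b -> b) -> b -> [a] -> b
myFoldr _ z []     = z                          -- empty list: return the initial value
myFoldr f z (x:xs) = f x (myFoldr f z xs)       -- combine the head with the folded tail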

Of course, folds are very primitive operations, and it would be quite unreasonable to throw for loops overboard and replace them with assorted fold-shaped "spells". Instead, higher-level operations such as sum, product, and concat are defined in terms of folds. You, in turn, program in terms of these higher-level operations, getting more compact code - code that is much easier to read, write, and understand.
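
As a sketch of what "defined in terms of folds" means (mySum, myProduct, and myConcat are made-up names, not the actual standard-library definitions):

mySum :: Num a => [a] -> a
mySum = foldl (+) 0          -- add up the elements, starting from 0

myProduct :: Num a => [a] -> a
myProduct = foldl (*) 1      -- multiply the elements, starting from 1

myConcat :: [[a]] -> [a]
myConcat = foldr (++) []     -- glue the lists together, starting from the empty list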

However, not every loop folds a list into a single element. Consider the following example:

for (int i = 0; i < array.length; i++) {
    array[i] *= 2;
}

In this case we are dealing with a transformation, which in functional languages is called a map:

new_array = map (*2) array

The map function visits each element of the list, applies the closure passed to it to that element, and builds a new list from the values returned by the closure. (Instead of creating a new list, some languages replace the elements of the original list with the new values.) Such an operation is much easier to understand. sort does something similar, in the sense that it takes a list and returns a new list (or modifies the existing one).
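
For the curious, here is a rough sketch of how map itself could be written, either by direct recursion or as a fold (myMap and myMap' are made-up names so as not to shadow the standard map):

myMap :: (a -> b) -> [a] -> [b]
myMap _ []     = []
myMap f (x:xs) = f x : myMap f xs              -- apply f to the head, recurse on the tail

myMap' :: (a -> b) -> [a] -> [b]
myMap' f = foldr (\x acc -> f x : acc) []      -- the same thing, expressed as a fold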

The third kind of for loop is the filter loop. Consider an example:

int tmp[] = new int [nums.length];
int j = 0;
for (int i = 0; i < nums.length; i++) {
    if ((nums[i] % 2) == 1) {
        tmp[j] = nums[i];
        j++;
    }
}

Very simple code, whose meaning is nevertheless almost completely buried under the incidental complexity of using a for loop and two indices. If filtering is as fundamental an operation as folding and mapping, applying it ought to be just as simple, and, in fact, it is:

odds = filter (\i -> (i `mod` 2) == 1) nums
odds = filter odd nums  -- more idiomatic
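
Once filtering, mapping, and folding are separate operations, they also compose. As a sketch (sumOfOddSquares is a hypothetical name invented for this example), summing the squares of the odd numbers becomes a one-liner:

sumOfOddSquares :: [Int] -> Int
sumOfOddSquares nums = sum (map (^2) (filter odd nums))   -- filter, then transform, then fold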

Here, in fact, is everything that is wrong with the for loop: it conflates (at least) three kinds of operations and focuses on a minor detail - stepping through a sequence of (index) values. But folding, transforming, and filtering are three different ways to process a list, and they should be treated differently. Using closures in the loop body also makes it possible to separate the what from the how. I can write an anonymous function whenever I need to process a list, or reuse a function written before me (such as odd, (+), or sqrt).

If closures are not an esoteric tool but are deeply embedded in the language and its standard library, we do not need to bother with these low-level operations (translator's note: map, fold, and filter). Instead, we can build higher-level operations with speaking names, such as sum or product.
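
For example, a small higher-level operation with a speaking name can be built out of the same pieces; countIf here is a made-up name used purely for illustration:

countIf :: (a -> Bool) -> [a] -> Int
countIf p = length . filter p        -- how many elements satisfy the predicate

-- usage: countIf odd [1, 2, 3, 4, 5]  ==>  3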

In addition, working in these terms makes it much easier to reason about more complex operations, such as mapping a tree, filtering a vector, or folding a list into a hash (translator's note: it is not clear whether the author meant folding the list into a single value, or the standard-library function that builds a hash table from a list of tuples).
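
One possible reading of "folding a list into a hash" - sketched here only as an assumption, not as the author's definite meaning - is folding a list of key/value pairs into a Data.Map, which is roughly what the standard fromList does:

import qualified Data.Map as Map

pairsToMap :: Ord k => [(k, v)] -> Map.Map k v
pairsToMap = foldl (\m (k, v) -> Map.insert k v m) Map.empty   -- later keys override earlier ones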

Finally, Elliott waves his hands at the topic of executing code in parallel on multiple cores, complaining that code like
3.times {...}
will almost certainly parallelize worse than an equivalent for loop. It seems to me that he overlooks one thing. Yes, there are computations that must be performed sequentially, and there are those that can be performed in parallel. But separating the one from the other is a hard task for the compiler if the only tool you use is the for loop. If you can separate the sequential computations (that is, foldl and foldr) from the (potentially) parallel ones (map and filter), you greatly simplify the compiler's job. Moreover, you can explicitly request a serial or a parallel version of map if you know the nature of the data you are processing better than the compiler does.
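
As one concrete illustration of that last point (a sketch that assumes the Haskell parallel package and a -threaded build), parMap lets the programmer explicitly ask for a parallel version of map while plain map stays sequential:

import Control.Parallel.Strategies (parMap, rdeepseq)

expensive :: Int -> Int
expensive n = sum [1 .. n * 1000]   -- stand-in for real per-element work

main :: IO ()
main = do
  let xs = [1 .. 100]
  print (sum (map expensive xs))               -- sequential map
  print (sum (parMap rdeepseq expensive xs))   -- explicitly parallel map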

From the translator


Throughout the text there are recurring references to closures. If I understand the definition of a closure correctly, I have to say that the author confuses anonymous functions and closures: none of the examples he gives contain functions that use variables from the enclosing environment that are not listed in their parameter lists.
