Today, Elliotte Rusty Harold posted his doubts about the merits of closures in Java. Specifically, he asks "Why Hate the for Loop?":
I don’t know what it is some people have against for loops that they’re so eager to get rid of them. This isn’t the first or even the second time CS theorists have revolted against for loops (or equivalent).

It's unfair to single Elliotte out for doubting the value of the lowly little closure. Mostly, he's complaining that he doesn't see the value in closures in Java, after having read Bruce Tate's recent column on the subject. (Bruce uses Ruby for illustration):
Listing 1. The simplest possible closure

3.times {puts "Inside the times method."}

Results:

Inside the times method.
Inside the times method.
Inside the times method.

times is a method on the 3 object. It executes the code in the closure three times. {puts "Inside the times method."} is the closure. It's an anonymous function that's passed into the times method and prints a static sentence. This code is tighter and simpler than the alternative with a for loop, shown in Listing 2:

Listing 2: Looping without closures

for i in 1..3
  puts "Inside the times method."
end

Given this lackluster introduction to closures, it's hard for me to see their real value. This first comparison is, at best, a subtle nuance. The rest of the examples in Bruce's article on developerWorks are equally trivial, dubious and unenlightening.
I'll ignore Elliotte's issues with Ruby style; it's not worth quibbling over. I'll also ignore the debate over the current syntax proposal for closures in Java, and the larger issue of whether or not closures belong in Java at this point. I don't have a horse in that race, and I honestly don't care how, when, or if it ever gets resolved.
Nevertheless, Elliotte does raise an important question: Why hate the for loop?
Here's a common example:
double sum = 0;
for (int i = 0; i < array.length; i++) {
    sum += array[i];
}

What's going on here? I've been programming for years, and I'm comfortable speed-reading this idiom; it's obviously a summation of a set of values in an array. But to actually read this block of code, I need to process about 30 tokens spread out over four lines. Sure, there's some syntactic sugar that could shave off a few tokens. But that's a lot of stuff to write and get correct just to do a simple summation.
Want proof? Here's the very next example from Elliotte's article, lifted verbatim:
String s = "";
for (int i = 0; i < args.length; i++) {
    s += array[i];
}

Spot the bug? If that code compiles and gets past code review, it could take weeks to get a bug report, and weeks more to get a fix ready. And these are simple for loops. Imagine how much more complex these things get as the loop bodies get larger, and possibly nested. (If you're still not worried about bugs, think about the possible typos. Then think about how often you write loops like this.)
If you could write a simple for loop in one line, with less repetition and fewer symbols, it should be both easier to read and easier to write. Because it's concise, there's much less likelihood that it will lead to bugs, and when bugs do occur, they should be easier to spot.
How would closures help here? Here's the first example, in Haskell:
total = sum array

OK, that was cheating. sum doesn't use closures. It is defined in terms of a fold, which does take closures:

total = foldl (+) 0 array

Here's the second example, idiomatically and with closures:

s = concat array
s = foldr (++) [] array

Admittedly, using odd-looking functions with names like foldl and foldr to explain closures may not make sense to programmers more familiar with for loops. But it does highlight the key failing of for loops: they conflate three separate kinds of operations -- filtering, reduction and transformation.

In the two for loops above, the goal was to take a list of values and reduce it to a single value. Functional programmers call these operations "folds". The way a fold works is by starting with an operation (a closure) and a seed value, and visiting the first element of the list. The operation is applied to the seed and the first value, producing a new seed. The fold then repeats the operation with the new seed and the next value in the list, all the way down to the last value, when the result of the last operation becomes the result of the fold.
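In code, that description amounts to something like the following minimal sketch (myFoldl is just an illustrative name; the Prelude's real foldl behaves this way, modulo details like strictness):

myFoldl :: (b -> a -> b) -> b -> [a] -> b
myFoldl _ seed []     = seed                       -- no values left: the seed is the result
myFoldl f seed (x:xs) = myFoldl f (f seed x) xs    -- combine seed with the next value, repeat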
Here's a demonstration:
s = foldl (+) 0 [1, 2, 3]
  = foldl (+) (0 + 1) [2, 3]
  = foldl (+) 1 [2, 3]
  = foldl (+) (1 + 2) [3]
  = foldl (+) 3 [3]
  = foldl (+) (3 + 3) []
  = foldl (+) 6 []
  = 6

Haskell provides a few fold functions: foldl, which starts examining the list from the beginning and works its way to the end, and foldr, which starts from the end and works its way backwards to the beginning. Other variations exist, but these are the two fundamental ones.
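For example, folding subtraction over the same list from each direction makes the difference in grouping visible:

foldl (-) 0 [1, 2, 3]   -- ((0 - 1) - 2) - 3  ==  -6
foldr (-) 0 [1, 2, 3]   -- 1 - (2 - (3 - 0))  ==   2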
Of course, folds are a very primitive operation, and it would be very confusing if for loops were thrown out and replaced with various foldl and foldr incantations. Instead, higher level operations like sum, prod and concat are defined in terms of folds. And your code is then written in terms of these higher-level reducing operations, making it more concise, easier to read, easier to write, and easier to understand.
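As a rough sketch of what "defined in terms of folds" means (the names below are just for illustration; the Prelude's actual sum, product and concat differ in detail, and prod above corresponds to product):

mySum     = foldl (+) 0
myProduct = foldl (*) 1
myConcat  = foldr (++) []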
But not every for loop is a reduction. Consider this one:
for (int i = 0; i < array.length; i++) {
    array[i] *= 2;
}

This operation is a transformation, what functional programmers call a map:

new_array = map (*2) array

The way map works is by examining every element of a list, applying a function to each element, and building up a new list with all the new values. (Some languages do this in-place.) It's a much easier operation to understand. sort does something similar, in that it takes a list and returns (or modifies) a list.
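That description translates almost directly into code; here is a minimal sketch (myMap is an illustrative name, equivalent in behaviour to the Prelude's map):

myMap :: (a -> b) -> [a] -> [b]
myMap _ []     = []                  -- an empty list maps to an empty list
myMap f (x:xs) = f x : myMap f xs    -- apply f to the head, then map the rest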
The third kind of for loop is a filter. Here's an example:

int tmp[] = new int[nums.length];
int j = 0;
for (int i = 0; i < nums.length; i++) {
    if ((nums[i] % 2) == 1) {
        tmp[j] = nums[i];
        j++;
    }
}

Here we have a very simple operation that's almost completely hidden by the needless complexity of using a for loop, and two independent indices. If filtering is a fundamental operation, it should be as easy as a fold or a map, and in fact, it is:

odds = filter (\i -> (i `mod` 2) == 1) nums
odds = filter isOdd nums -- more idiomatic

This, at the heart of it all, is what's wrong with the for loop: it conflates (at least) three separate kinds of operations, and focuses on a minor detail: walking over a series of (index) values. But really,
fold, map and filter are three different ways to process a list of values, and they need to be handled differently. Using closures to pass in the body of a loop also makes it easier to split the what from the how. I could write anonymous functions each time I traverse a list, or I could reuse a well known function (like isOdd, (+) or sqrt).
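For instance, the same filter can be written either way in Haskell (the standard Prelude function odd plays the role of isOdd here, and nums is just sample data):

nums  = [1 .. 10] :: [Int]
odds  = filter (\n -> n `mod` 2 == 1) nums   -- anonymous function, spelled out inline
odds' = filter odd nums                      -- reusing a well known, named function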
If closures aren't an esoteric feature, but deeply baked into a language and its standard library, there's no need to mess around with these low level operations. Instead, we can create higher level actions that say what they do, like sum and prod.

Furthermore, thinking in these terms makes it easier to think about more complex operations, like mapping a tree, filtering a vector, or folding a list into a hash.
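As one sketch of that last idea, folding a list into a "hash", here using Data.Map (the name wordCounts is just illustrative):

import qualified Data.Map as Map

-- fold a list of words into a map from word to number of occurrences
wordCounts :: [String] -> Map.Map String Int
wordCounts = foldl (\m w -> Map.insertWith (+) w 1 m) Map.empty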
Finally, Elliotte did some handwaving about parallelized execution on multiple cores, and how code like 3.times {...} is "worse" than a for loop. Sadly, I think he's missing the point. True, there are some computations that need to be serialized, and some that may be parallelized. But figuring out which is which is a difficult compiler optimization if your only tool is a for loop. If you can split up the serial computations (i.e., foldl and foldr) from the possibly parallelized computations (i.e., map and filter), it's much easier for the compiler to figure out. Not only that, but you could explicitly ask for a serial or parallel version of map if you know your data better than your compiler.
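In Haskell, that explicit choice can look roughly like this, assuming the parallel package's Control.Parallel.Strategies is available (names and data are illustrative):

import Control.Parallel.Strategies (parMap, rseq)

nums :: [Int]
nums = [1 .. 1000]

doubledSeq, doubledPar :: [Int]
doubledSeq = map (* 2) nums          -- ordinary, sequential map
doubledPar = parMap rseq (* 2) nums  -- the same transformation, evaluated in parallel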
Source: http://notes-on-haskell.blogspot.com/2007/02/whats-wrong-with-for-loop.html