Today, Elliotte Rusty Harold posted his doubts about the merits of closures in Java. Specifically, he asks "Why Hate the for Loop?":
I don’t know what it is some people have against for loops that they’re so eager to get rid of them. This isn’t the first or even the second time CS theorists have revolted against for loops (or equivalent).

It's unfair to single Elliotte out for doubting the value of the lowly little closure. Mostly, he's complaining that he doesn't see the value in closures in Java, after having read Bruce Tate's recent column on the subject. (Bruce uses Ruby for illustration):
Listing 1. The simplest possible closure

3.times {puts "Inside the times method."}

Results:

Inside the times method.
Inside the times method.
Inside the times method.

times is a method on the 3 object. It executes the code in the closure three times. {puts "Inside the times method."} is the closure. It's an anonymous function that's passed into the times method and prints a static sentence. This code is tighter and simpler than the alternative with a for loop, shown in Listing 2:

Listing 2: Looping without closures

for i in 1..3
  puts "Inside the times method."
end

Given this lackluster introduction to closures, it's hard for me to see their real value. This first comparison is, at best, a subtle nuance. The rest of the examples in Bruce's article on developerWorks are equally trivial, dubious and unenlightening.
I'll ignore Elliotte's issues with Ruby style; it's not worth quibbling over. I'll also ignore the debate over the current syntax proposal for closures in Java, and the larger issue of whether or not closures belong in Java at this point. I don't have a horse in that race, and I honestly don't care how, when, or if it ever gets resolved.
Nevertheless, Elliotte does raise an important question: Why hate the for loop?
Here's a common example:
double sum = 0;
for (int i = 0; i < array.length; i++) {
    sum += array[i];
}

What's going on here? I've been programming for years, and I'm comfortable speed-reading this idiom; it's obviously a summation of a set of values in an array. But to actually read this block of code, I need to process about 30 tokens spread out over four lines. Sure, there's some syntactic sugar that could shave off a few tokens. But that's a lot of stuff to write and get correct just to do a simple summation.
Want proof? Here's the very next example from Elliotte's article, lifted verbatim:
String s = "";
for (int i = 0; i < args.length; i++) {
    s += array[i];
}

Spot the bug? If that code compiles and gets past code review, it could take weeks to get a bug report, and weeks more to get a fix ready. And these are simple for loops. Imagine how much more complex these things get as the loop bodies get larger, and possibly nested. (If you're still not worried about bugs, think about the possible typos. Then think about how often you write loops like this.)
If you could write a simple for loop in one line, with less repetition and fewer symbols, it should be both easier to read and easier to write. Because it's concise, there's much less likelihood that it will lead to bugs, and when bugs do occur, they should be easier to spot.
How would closures help here? Here's the first example, in Haskell:
total = sum array

OK, that was cheating. sum doesn't use closures. It is defined in terms of a fold, which does take closures:

total = foldl (+) 0 array

Here's the second example, idiomatically and with closures:

s = concat array
s = foldr (++) [] array

Admittedly, using odd-looking functions with names like foldl and foldr to explain closures may not make sense to programmers more familiar with for loops. But it does highlight the key failing of for loops: they conflate three separate kinds of operations -- filtering, reduction and transformation.

In the two for loops above, the goal was to take a list of values and reduce it to a single value. Functional programmers call these operations "folds". The way a fold works is by starting with an operation (a closure) and a seed value, and visiting the first element of the list. The operation is applied to the seed and the first value, producing a new seed. The fold then repeats the operation with the new seed and the next value in the list, all the way down to the last value, when the result of the last operation becomes the result of the fold.
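In code, that description amounts to something like the following minimal sketch (myFoldl is just an illustrative name; the Prelude's real foldl behaves this way, modulo details like strictness):

myFoldl :: (b -> a -> b) -> b -> [a] -> b
myFoldl _ seed []     = seed                       -- no values left: the seed is the result
myFoldl f seed (x:xs) = myFoldl f (f seed x) xs    -- combine seed with the next value, repeat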
Here's a demonstration:
s = foldl (+) 0 [1, 2, 3]
  = foldl (+) (0 + 1) [2, 3]
  = foldl (+) 1 [2, 3]
  = foldl (+) (1 + 2) [3]
  = foldl (+) 3 [3]
  = foldl (+) (3 + 3) []
  = foldl (+) 6 []
  = 6

Haskell provides a few fold functions: foldl, which starts examining the list from the beginning and works its way to the end, and foldr, which starts from the end and works its way backwards to the beginning. Other variations exist, but these are the two fundamental ones.
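For example, folding subtraction over the same list from each direction makes the difference in grouping visible:

foldl (-) 0 [1, 2, 3]   -- ((0 - 1) - 2) - 3  ==  -6
foldr (-) 0 [1, 2, 3]   -- 1 - (2 - (3 - 0))  ==   2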
Of course, folds are a very primitive operation, and it would be very confusing if for loops were thrown out and replaced with various foldl and foldr incantations. Instead, higher level operations like sum, prod and concat are defined in terms of folds. And your code is then written in terms of these higher-level reducing operations, making it more concise, easier to read, easier to write, and easier to understand.
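As a rough sketch of what "defined in terms of folds" means (the names below are just for illustration; the Prelude's actual sum, product and concat differ in detail, and prod above corresponds to product):

mySum     = foldl (+) 0
myProduct = foldl (*) 1
myConcat  = foldr (++) []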
But not every for loop is a reduction. Consider this one:
for (int i = 0; i < array.length; i++) {
    array[i] *= 2;
}

This operation is a transformation, what functional programmers call a map:

new_array = map (*2) array

The way map works is by examining every element of a list, applying a function to each element, and building up a new list with all the new values. (Some languages do this in-place.) It's a much easier operation to understand. sort does something similar, in that it takes a list and returns (or modifies) a list.
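That description translates almost directly into code; here is a minimal sketch (myMap is an illustrative name, equivalent in behaviour to the Prelude's map):

myMap :: (a -> b) -> [a] -> [b]
myMap _ []     = []                  -- an empty list maps to an empty list
myMap f (x:xs) = f x : myMap f xs    -- apply f to the head, then map the rest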
The third kind of for loop is a filter. Here's an example:

int tmp[] = new int[nums.length];
int j = 0;
for (int i = 0; i < nums.length; i++) {
    if ((nums[i] % 2) == 1) {
        tmp[j] = nums[i];
        j++;
    }
}

Here we have a very simple operation that's almost completely hidden by the needless complexity of using a for loop, and two independent indices. If filtering is a fundamental operation, it should be as easy as a fold or a map, and in fact, it is:

odds = filter (\i -> (i `mod` 2) == 1) nums
odds = filter isOdd nums -- more idiomatic

This, at the heart of it all, is what's wrong with the for loop: it conflates (at least) three separate kinds of operations, and focuses on a minor detail: walking over a series of (index) values. But really,
fold, map and filter are three different ways to process a list of values, and they need to be handled differently. Using closures to pass in the body of a loop also makes it easier to split the what from the how. I could write anonymous functions each time I traverse a list, or I could reuse a well known function (like isOdd, (+) or sqrt).
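For instance, the same filter can be written either way in Haskell (the standard Prelude function odd plays the role of isOdd here, and nums is just sample data):

nums  = [1 .. 10] :: [Int]
odds  = filter (\n -> n `mod` 2 == 1) nums   -- anonymous function, spelled out inline
odds' = filter odd nums                      -- reusing a well known, named function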
If closures aren't an esoteric feature, but deeply baked into a language and its standard library, there's no need to mess around with these low level operations. Instead, we can create higher level actions that say what they do, like sum and prod.

Furthermore, thinking in these terms makes it easier to think about more complex operations, like mapping a tree, filtering a vector, or folding a list into a hash.
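As one sketch of that last idea, folding a list into a "hash", here using Data.Map (the name wordCounts is just illustrative):

import qualified Data.Map as Map

-- fold a list of words into a map from word to number of occurrences
wordCounts :: [String] -> Map.Map String Int
wordCounts = foldl (\m w -> Map.insertWith (+) w 1 m) Map.empty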
Finally, Elliotte did some handwaving about parallelized execution on multiple cores, and how code like 3.times {...} is "worse" than a for loop. Sadly, I think he's missing the point. True, there are some computations that need to be serialized, and some that may be parallelized. But figuring out which is which is a difficult compiler optimization if your only tool is a for loop. If you can split up the serial computations (i.e., foldl and foldr) from the possibly parallelized computations (i.e., map and filter), it's much easier for the compiler to figure out. Not only that, but you could explicitly ask for a serial or parallel version of map if you know your data better than your compiler.
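In Haskell, that explicit choice can look roughly like this, assuming the parallel package's Control.Parallel.Strategies is available (names and data are illustrative):

import Control.Parallel.Strategies (parMap, rseq)

nums :: [Int]
nums = [1 .. 1000]

doubledSeq, doubledPar :: [Int]
doubledSeq = map (* 2) nums          -- ordinary, sequential map
doubledPar = parMap rseq (* 2) nums  -- the same transformation, evaluated in parallel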
Source: http://notes-on-haskell.blogspot.com/2007/02/whats-wrong-with-for-loop.html