I recently started playing with Java 8, having done bits and pieces in Haskell/Scala before. I am trying out higher-order functions in Java such as map or forEach, and I am struggling to understand the motivation for pushing everything towards the Stream ideology. I understand that it gives a nice, general abstraction and that it is supposed to be lazy, but let's consider a very simple, common example:

list.map(x -> doSth(x));

This is a very common idiom, and I would expect it to return a List<T>. Now, in Java 8, I need to do something of this sort:

list.stream().map(x -> doSth(x)).collect(Collectors.toList())

Now, as far as I can see, the stream will not apply the map until collect is called, so there will be one pass through the collection under the hood. What I can't see is why common operations on maps and lists, such as map.toList() or list.groupBy(), were not added to the corresponding interfaces. Is there an underlying design decision I am missing here?

maksimov
Bober02

1 Answer

A handful of new methods have been added directly to various collections that perform mutative operations eagerly on those collections. For example, to run a function over each element of a list, replacing the original elements with the return values, use List.replaceAll(UnaryOperator). Other examples of this are Collection.removeIf(Predicate), List.sort(), and Map.replaceAll(BiFunction).
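As a rough illustration of the eager, mutating methods named above, here is a minimal sketch. The class name `EagerCollectionOps`, the helper names, and the sample data are made up for this example; only the JDK methods themselves come from the answer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class; not part of the JDK.
class EagerCollectionOps {

    // Each call below runs immediately and changes the list in place.
    public static List<String> mutateList(List<String> source) {
        List<String> names = new ArrayList<>(source);  // mutable copy of the input
        names.replaceAll(String::toUpperCase);         // List.replaceAll(UnaryOperator)
        names.removeIf(s -> s.startsWith("B"));        // Collection.removeIf(Predicate)
        names.sort(Comparator.naturalOrder());         // List.sort(Comparator)
        return names;
    }

    // Map.replaceAll(BiFunction) rewrites every value in place.
    public static Map<String, Integer> doubleValues(Map<String, Integer> source) {
        Map<String, Integer> copy = new HashMap<>(source);
        copy.replaceAll((key, value) -> value * 2);
        return copy;
    }

    public static void main(String[] args) {
        // prints [ALICE, CAROL]
        System.out.println(mutateList(Arrays.asList("carol", "bob", "alice")));
    }
}
```

No stream() or collect() call is involved here: the receiver collection itself is modified, which is why these methods need a mutable collection.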

By contrast, there are a bunch of new methods added to Stream such as filter, map, skip, limit, sorted, distinct, etc. Most of these are lazy and they do not mutate the source, instead passing elements downstream. We did consider adding these directly to the collections classes. At this point several questions arose. How do we distinguish the eager, mutative operations from the lazy, stream-producing operations? Overloading is difficult, because they'd have different return types, so they'd have to have different names. How would such operations be chained? The eager operations would have to generate collections to store intermediate results, which is potentially quite expensive. The resulting collections API would have a confusing mixture of eager, mutative and lazy, non-mutating methods.
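To make the laziness concrete, here is a small sketch that records when each lambda actually runs. The class `LazyStreamDemo` and its log messages are invented for illustration, and the example assumes a sequential stream.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class; not part of the JDK.
class LazyStreamDemo {

    // Returns a log of events in the order they happened.
    public static List<String> run() {
        List<String> log = new ArrayList<>();
        List<Integer> source = Arrays.asList(1, 2, 3, 4);

        // Building the pipeline runs neither lambda.
        Stream<Integer> pipeline = source.stream()
                .filter(x -> { log.add("filter " + x); return x % 2 == 0; })
                .map(x -> { log.add("map " + x); return x * 10; });
        log.add("pipeline built");

        // The terminal operation pulls each element through filter and map in one pass.
        List<Integer> result = pipeline.collect(Collectors.toList());

        // The source list is left untouched.
        log.add("result " + result + ", source " + source);
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

The first log entry is "pipeline built", and the filter and map entries are interleaved per element rather than grouped by stage, so no intermediate collection is created between the two steps.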

A second-order consideration is the potential for incompatibilities when adding default methods. The biggest risk with adding default methods to an interface is a name clash with an existing method on an implementation of that interface. If a method with the same name and arguments (often no arguments) has a different return type, that's an unavoidable incompatibility. For that reason we've been fairly reluctant to add large numbers of default methods.

For these reasons, we decided to keep the lazy, non-mutating methods all within the stream API, at the expense of requiring extra method calls stream() and collect() to bridge between collections and streams. For a few common cases we added eager, mutating calls directly to the collections interfaces, such as those I listed above.
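The stream()/collect() bridge described above might look like the following sketch. `StreamBridge`, `mapToList`, `lengths`, and `byLength` are hypothetical names chosen for this example and are not JDK methods; the collectors used (`toList`, `toSet`, `groupingBy`) are standard ones from `java.util.stream.Collectors`.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical helper class; not part of the JDK.
class StreamBridge {

    // Roughly what the question's list.map(f) would do: collection -> stream -> collection.
    public static <T, R> List<R> mapToList(List<T> list, Function<? super T, ? extends R> f) {
        return list.stream().map(f).collect(Collectors.toList());
    }

    // The same pipeline shape with a different collector yields a Set.
    public static Set<Integer> lengths(List<String> words) {
        return words.stream().map(String::length).collect(Collectors.toSet());
    }

    // A groupBy-style operation, expressed through collect.
    public static Map<Integer, List<String>> byLength(List<String> words) {
        return words.stream().collect(Collectors.groupingBy(String::length));
    }

    public static void main(String[] args) {
        // prints [2, 4, 6]
        System.out.println(mapToList(Arrays.asList(1, 2, 3), x -> x * 2));
    }
}
```

Only the collector changes between the three methods, which is one way to read the design choice: a single general collect() entry point instead of a separate convenience method per result type.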

See lambdafaq.org for further discussion.

Michel
Stuart Marks
  • What I don't understand, and maybe you can explain: why this, .collect(Collectors.toList()), when it could have a toList() method built in? It just seems verbose and unwieldy to me. – The Coordinator Feb 09 '14 at 07:46
  • If you have `toList`, you'll probably want `toSet` and `toMap` as well. But you would still want the `collect` method, because it's so flexible. So now some collection operations would have their own dedicated convenience methods, while for others you would need to call `collect`. Maybe the convenience would be worth the untidiness and inconsistency, but the designers thought not. – Maurice Naftalin Feb 11 '14 at 16:00
  • I see a lot of code that is forced to be written as stream()...collect(Collectors.toList()) instead of stream()...toList() because of some silly thinking by the JDK designers. It's disgusting. "toList", "toSet", and "toMap" are the most used terminal operations. They should be added to the Stream API. – user_3380739 Apr 14 '17 at 17:48
  • @Stuart Marks Where can I read about these stream() and collect() methods? Or can you please explain what they actually do? – Michel Nov 09 '17 at 08:18
  • @Michel See the official java.util.stream [package documentation](https://docs.oracle.com/javase/9/docs/api/java/util/stream/package-summary.html) and the Stream interface within. Also there are many tutorials available on the web. – Stuart Marks Nov 09 '17 at 21:35