17

I've noticed many functionalities exposed in Stream are apparently duplicated in Collectors, such as Stream.map(Foo::bar) versus Collectors.mapping(Foo::bar, ...), or Stream.count() versus Collectors.counting(). What's the difference between these approaches? Is there a performance difference? Are they implemented differently in some way that affects how well they can be parallelized?

Tunaki
nasamuffin
  • 2
    The collectors are useful when used as downstream operation of another collector. For example `groupingBy` takes a downstream collector collecting all the elements mapped to the same key. You couldn't use a Stream operation there. – Tunaki Apr 20 '16 at 20:07
  • Interesting that all of the collectors that take downstream collectors have gerund names, ending in `-ing` – Hank D Apr 20 '16 at 20:17
  • @Tunaki How does that make the same methods different? E.g. `stream.collect(Collectors.counting())` vs. `stream.count()`, or `stream.collect(Collectors.mapping(Foo::bar, anotherCollector))` vs. `stream.map(Foo::bar).collect(anotherCollector)`. It seems a bit redundant. – matt Apr 20 '16 at 20:55
  • 3
    True, there is no advantage to `collect(Collectors.counting())`, but it does come in handy as a downstream collector, e.g. `collect(partitioningBy(String::isEmpty, Collectors.counting()))` to split the results into 2 counts rather than 2 Lists – Hank D Apr 20 '16 at 22:41

2 Answers

15

The collectors that appear to duplicate functionality in Stream exist so they can be used as downstream collectors for collector combinators like groupingBy().

As a concrete example, suppose you want to compute "count of transactions by seller". You could do:

// Assumes static imports of Collectors.groupingBy and Collectors.counting.
Map<Seller, Long> salesBySeller =
    txns.stream()
        .collect(groupingBy(Txn::getSeller, counting()));

Without collectors like counting() or mapping(), these kinds of queries would be much more difficult.
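To make the downstream-collector idea concrete, here is a minimal self-contained sketch. The `Txn` class below is a hypothetical stand-in (sellers and buyers are plain strings, and `getBuyer` is an assumption that does not appear in the original example). It shows `counting()` and `mapping()` used inside `groupingBy()`, which is a position where `Stream.count()` or `Stream.map()` could not be used.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

public class DownstreamCollectors {
    // Hypothetical stand-in for the Txn type; not the original author's class.
    static final class Txn {
        private final String seller;
        private final String buyer;

        Txn(String seller, String buyer) {
            this.seller = seller;
            this.buyer = buyer;
        }

        String getSeller() { return seller; }
        String getBuyer() { return buyer; }
    }

    // counting() as a downstream collector: one count per seller.
    static Map<String, Long> countBySeller(List<Txn> txns) {
        return txns.stream()
                .collect(Collectors.groupingBy(Txn::getSeller, Collectors.counting()));
    }

    // mapping() as a downstream collector: transform each element of a group
    // before handing it to a further downstream collector (toSet here).
    static Map<String, Set<String>> buyersBySeller(List<Txn> txns) {
        return txns.stream()
                .collect(Collectors.groupingBy(Txn::getSeller,
                        Collectors.mapping(Txn::getBuyer, Collectors.toSet())));
    }

    // Made-up sample data for illustration only.
    static List<Txn> sampleTxns() {
        return Arrays.asList(
                new Txn("alice", "carol"),
                new Txn("alice", "dave"),
                new Txn("bob", "carol"));
    }

    public static void main(String[] args) {
        List<Txn> txns = sampleTxns();
        System.out.println(countBySeller(txns));
        System.out.println(buyersBySeller(txns));
    }
}
```

Getting the same result with only `Stream.count()` would require one pass (or one filtered stream) per seller, which is why the collector forms exist.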

Brian Goetz
  • 4
    Worth noting that Java 9 will add `filtering` and `flatMapping` to `Collectors` as well, so there’s a convergence… – Holger Apr 21 '16 at 09:38
3

There's a big difference. The stream operations can be divided into two groups:

  • Intermediate operations - Stream.map, Stream.flatMap, Stream.filter. These return a new Stream and are always lazy, i.e. no traversal of the stream elements happens when they are called. They are used to build the transformation chain.
  • Terminal operations - Stream.collect, Stream.findFirst, Stream.reduce etc. These do the actual work, i.e. they run the transformation chain over the stream elements and produce a terminal value, which could be a List, a count of elements, the first element, etc.
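The lazy/terminal distinction above can be observed directly. The following is a small sketch (the class name, method name, and log messages are made up for illustration): it records a message whenever the `map` function runs, and shows that nothing runs until the terminal `collect` is invoked.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class LazyStreamDemo {
    // Builds a pipeline, then runs it, appending to `log` so that the order
    // of events can be inspected afterwards.
    static List<String> run(List<String> log) {
        // Intermediate operation: only describes the transformation.
        Stream<String> pipeline = Stream.of("a", "b", "c")
                .map(s -> {
                    log.add("map " + s);
                    return s.toUpperCase();
                });

        // No element has been mapped yet at this point.
        log.add("pipeline built");

        // Terminal operation: triggers the traversal and the map calls.
        return pipeline.collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        System.out.println(run(log)); // prints [A, B, C]
        System.out.println(log);      // prints [pipeline built, map a, map b, map c]
    }
}
```

The "pipeline built" entry appears before any "map" entry, which is what "lazy" means here.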

Take a look at the Stream package summary javadoc for more information.

David Siro