Are there any differences between the count() method in the Stream interface vs counting() in Collectors? Which one should one use in general? Are there any performance benefits of using one over the other?

Naman
Hiro_Hamada
  • I don't know about performance, but to me, the most important difference is that collectors can be used as the downstream of other collectors. You can use `counting` (but not `count`) as the downstream of `groupingBy` or `mapping`, for example. – Sweeper Jun 17 '21 at 10:56
  • We had [min() and max() vs Collectors minBy() and maxBy()](https://stackoverflow.com/q/67998785/2711488) just yesterday. We don’t need a new question for every “collector vs terminal operation”. – Holger Jun 17 '21 at 12:14
  • Does this answer your question? [Collectors.summingInt() vs mapToInt().sum()](https://stackoverflow.com/questions/37023822/collectors-summingint-vs-maptoint-sum) – Gautham M Jun 18 '21 at 04:54

2 Answers

11

If you want to count all the elements of a stream, use `count()`. Its Javadoc reads:

Returns the count of elements in this stream. This is a special case of a reduction and is equivalent to:

return mapToLong(e -> 1L).sum();
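
To make the quoted equivalence concrete, here is a minimal sketch that counts the same list three ways. The class name `CountEquivalenceDemo`, its method names, and the sample list are made up for illustration; only `count()`, `mapToLong(...).sum()` and `Collectors.counting()` come from the JDK.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical demo class; the names here are illustrative only.
class CountEquivalenceDemo {

    // Terminal operation on the stream itself.
    static long viaCount(List<String> items) {
        return items.stream().count();
    }

    // The reduction that the Javadoc describes as equivalent to count().
    static long viaMapToLongSum(List<String> items) {
        return items.stream().mapToLong(e -> 1L).sum();
    }

    // The collector form; it returns a boxed Long, unboxed here on return.
    static long viaCountingCollector(List<String> items) {
        return items.stream().collect(Collectors.counting());
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("a", "b", "c");
        System.out.println(viaCount(items));             // 3
        System.out.println(viaMapToLongSum(items));      // 3
        System.out.println(viaCountingCollector(items)); // 3
    }
}
```

All three return the same number; for a plain "how many elements" question, `count()` is the shortest and avoids boxing the result.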

Use `counting()` when you need a count per group, for example as the downstream collector of `groupingBy`:

Map<String, Long> wordCounts =
    wordsList.stream().collect(groupingBy(Function.identity(), counting()));
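
For anyone who wants to run the grouping snippet, a self-contained version might look like the sketch below. The class name `WordFrequencyDemo`, the helper `countByWord`, and the sample words are assumptions added for illustration; the snippet above presumably relies on static imports of `groupingBy` and `counting`, which are spelled out here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical demo class; names and data are illustrative only.
class WordFrequencyDemo {

    // Groups equal words together and counts the size of each group.
    // counting() is the downstream collector, which count() cannot be.
    static Map<String, Long> countByWord(List<String> words) {
        return words.stream()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> wordsList =
                Arrays.asList("apple", "banana", "apple", "cherry", "banana", "apple");

        Map<String, Long> wordCounts = countByWord(wordsList);
        System.out.println(wordCounts.get("apple"));  // 3
        System.out.println(wordCounts.get("banana")); // 2
        System.out.println(wordCounts.get("cherry")); // 1

        // The total number of elements, with no grouping, is a job for count().
        System.out.println(wordsList.stream().count()); // 6
    }
}
```

The map returned by `groupingBy` makes no ordering guarantee, so the example reads individual keys instead of printing the whole map.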
John Kugelman
Ori Marko
11

One difference is in the implementation. The Javadoc for `Stream.count()` documents this (at least in Java 9 and later):

An implementation may choose to not execute the stream pipeline (either sequentially or in parallel) if it is capable of computing the count directly from the stream source. In such cases no source elements will be traversed and no intermediate operations will be evaluated...

Try this:

System.out.println(Stream.of(1, 2, 3, 4, 5)
        .map(i -> {
            System.out.println("processing " + i);
            return i * 2;
        }).count());

System.out.println(Stream.of(1, 2, 3, 4, 5)
        .map(i -> {
            System.out.println("processing " + i);
            return i * 2;
        }).collect(Collectors.counting()));

Both give you the correct count, but the first may skip executing `map()` and therefore may not show the `println` output. As documented, this is an implementation detail: `count()` may skip intermediate operations when it can determine the number of elements without executing the pipeline.
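
If you would rather measure this than read console output, the sketch below records how often the `map()` function actually runs. The class name `CountSideEffectDemo` and its methods are hypothetical, and the counter exists only to observe the behaviour; it is not something production code should depend on.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class; each method returns { element count, number of map() calls }.
class CountSideEffectDemo {

    // Pipeline ending in count(): the implementation may skip map() entirely.
    static long[] withCount() {
        AtomicInteger mapCalls = new AtomicInteger();
        long n = Stream.of(1, 2, 3, 4, 5)
                .map(i -> { mapCalls.incrementAndGet(); return i * 2; })
                .count();
        return new long[] { n, mapCalls.get() };
    }

    // Pipeline ending in collect(counting()): the collector is fed every element.
    static long[] withCounting() {
        AtomicInteger mapCalls = new AtomicInteger();
        long n = Stream.of(1, 2, 3, 4, 5)
                .map(i -> { mapCalls.incrementAndGet(); return i * 2; })
                .collect(Collectors.counting());
        return new long[] { n, mapCalls.get() };
    }

    public static void main(String[] args) {
        long[] a = withCount();
        long[] b = withCounting();
        // The count is 5 in both cases. The number of map() calls for count()
        // depends on the JDK: it may be 0 (pipeline skipped) or 5 (pipeline executed).
        System.out.println("count():    n=" + a[0] + ", map calls=" + a[1]);
        System.out.println("counting(): n=" + b[0] + ", map calls=" + b[1]);
    }
}
```

In the JDKs discussed here, the `counting()` variant runs `map()` for every element, while the `count()` variant may report zero calls. Either way, side effects in intermediate operations should not be relied on.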

ernest_k
  • @user7294900 as documented, it's implementation-related. Spec says *may*. On mine, it skips the `map` operation. The bottom line is that the developer should not count on count() causing all operations before it to run. – ernest_k Jun 17 '21 at 11:02
  • @user7294900 What Java version and distribution are you using? – ernest_k Jun 17 '21 at 11:05
  • Java 8 Oracle distribution – Ori Marko Jun 17 '21 at 11:08
  • @user7294900 OpenJDK 8 too gives the output. But OpenJDK 11 seems to be optimized. I suspect even Oracle JDK 11 would be. – ernest_k Jun 17 '21 at 11:15