The groupingBy(function)
collector is a short-hand for groupingBy(function, toList())
. While there are no guarantees about the order of the map entries, the lists collected by toList()
will reflect the encounter order.
While it is not spelled out explicitly, how the groupingBy
collector affects the operation of the downstream collector regarding the order, there is a remark in the documentation about the difference to groupingByConcurrent(Function)
:
For parallel stream pipelines, the combiner
function operates by merging the keys from one map into another, which can be an expensive operation. If preservation of the order in which elements appear in the resulting Map
collector is not required, using groupingByConcurrent(Function)
may offer better parallel performance.
This hints at the fact that the groupingBy
collector will preserve the order if the downstream collector is not unordered, whereas the groupingByConcurrent
collector will enforce unordered behavior, even if the downstream collector would otherwise retain the order, like toList()
.
So the following program
public static void main(String[] args) {
System.out.println(isUnordered(Collectors.groupingBy(Function.identity())));
System.out.println(isUnordered(Collectors.groupingByConcurrent(Function.identity())));
}
static boolean isUnordered(Collector<?, ?, ?> c) {
return c.characteristics().contains(Collector.Characteristics.UNORDERED);
}
will print
false
true
As a reminder to those who didn’t read carefully enough, we’re only talking about the order of the elements within the lists, not the order of the entries within the map.
So when we take the following program, for example
Map<Integer, List<Integer>> m = IntStream.rangeClosed(0, 10).boxed()
.collect(Collectors.groupingBy(i -> i % 2));
System.out.println(m.keySet());
System.out.println(m.get(0));
System.out.println(m.get(1));
There is no guaranty whether the first line will be [0, 1]
or [1, 0]
, as the map’s order is unspecified, but we can be sure that the other two lines will be
[0, 2, 4, 6, 8, 10]
[1, 3, 5, 7, 9]
as the lists have a guaranteed order and m.get(0)
/m.get(1)
do not rely on the map’s order.