57

For example, if I intend to partition some elements, I could do something like:

Stream.of("I", "Love", "Stack Overflow")
      .collect(Collectors.partitioningBy(s -> s.length() > 3))
      .forEach((k, v) -> System.out.println(k + " => " + v));

which outputs:

false => [I]
true => [Love, Stack Overflow]

But to me, partitioningBy is just a special case of groupingBy. Although the former takes a Predicate as its parameter and the latter a Function, I see a partition as an ordinary grouping whose classifier happens to return a boolean.

So the same code does exactly the same thing:

 Stream.of("I", "Love", "Stack Overflow")
       .collect(Collectors.groupingBy(s -> s.length() > 3))
       .forEach((k, v) -> System.out.println(k + " => " + v));

which also results in a Map<Boolean, List<String>>.

So is there any reason I should use partitioningBy instead of groupingBy? Thanks

Matthias Braun
user2336315

5 Answers

75

partitioningBy will always return a map with two entries, one for where the predicate is true and one for where it is false. It is possible that both entries will have empty lists, but they will exist.

That's something that groupingBy will not do, since it only creates entries when they are needed.

At the extreme case, if you send an empty stream to partitioningBy you will still get two entries in the map whereas groupingBy will return an empty map.

EDIT: As noted in the comments below, this behavior is not specified in the Java 8 Javadoc; however, changing it would take away the added value partitioningBy currently provides. As of Java 9, the guarantee is part of the specification.
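The empty-stream case described above can be sketched as follows. This is a minimal illustration, not code from the original answer: the class and method names (EmptyStreamDemo, partitionEmpty, groupEmpty) are made up, and the two-entry result is only guaranteed by the specification from Java 9 onward.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class EmptyStreamDemo {

    // Partitioning an empty stream: both keys are still present.
    static Map<Boolean, List<String>> partitionEmpty() {
        return Stream.<String>empty()
                .collect(Collectors.partitioningBy(s -> s.length() > 3));
    }

    // Grouping an empty stream: no element was classified, so no keys exist.
    static Map<Boolean, List<String>> groupEmpty() {
        return Stream.<String>empty()
                .collect(Collectors.groupingBy(s -> s.length() > 3));
    }

    public static void main(String[] args) {
        System.out.println(partitionEmpty()); // {false=[], true=[]}
        System.out.println(groupEmpty());     // {}
    }
}
```

In practice this means map.get(true) and map.get(false) never return null after partitioningBy, whereas after groupingBy either lookup may return null.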

Steve Chambers
Oron
  • This is the most reasonable behavior, but I don't see the guarantee for two entries in the javadocs. I've asked a question about it at http://stackoverflow.com/questions/41287517/must-partitioningby-produce-a-map-with-entries-for-true-and-false. – Joshua Taylor Dec 22 '16 at 16:21
  • @JoshuaTaylor Thanks for the info! I updated the answer to include the information from the other thread. – Oron Dec 23 '16 at 16:42
20

partitioningBy is slightly more efficient, using a special Map implementation optimized for when the key is just a boolean.

(It might also help to clarify what you mean; partitioningBy helps to effectively get across that there's a boolean condition being used to partition the data.)

Louis Wasserman
  • Considering the generally strong preference of Java APIs not to include convenience special-cased methods, I am slightly surprised that this is all the argumentation it took to include it. A minor performance edge and a minor clarification. I see Doug Lea's thinking in it :) – Marko Topolnik Jan 16 '15 at 22:30
  • Besides this, there's one more small difference: if all elements satisfy the Predicate, the partitioningBy result still contains a false key mapping to an empty list, whereas the groupingBy result won't have the false key. – MGhostSoft Sep 02 '16 at 01:00
  • The "slightly more efficient" implementation creates 6 objects on each call of `get`. Just because they didn't bother to override `get` with a trivial implementation `return key ? forTrue : forFalse`. – Marko Topolnik Oct 14 '16 at 14:12
  • @MGhostSoft That behavior is what makes partitioningBy preferable in some cases to grouping. However, while experiments confirm the behavior, I don't see it specified in the docs. It seems reasonable, but I'm hesitant to depend on behavior that's not specified. Do you know of any guarantee about the result containing true and false entries? I've asked a question about it at http://stackoverflow.com/questions/41287517/must-partitioningby-produce-a-map-with-entries-for-true-and-false. – Joshua Taylor Dec 22 '16 at 16:22
  • @JoshuaTaylor The only official mention I could find is here: http://www.oracle.com/webfolder/technetwork/tutorials/moocjdk8/documents/week3/lesson-3-4.pdf slide 10: it specifically said that partitioningBy will create two groups. Other sources only mentioned a map whose key is a Boolean. There should be no doubt about the groupingBy behavior. – MGhostSoft Jan 12 '17 at 00:20
  • @Mghostsoft Take a look at the question I linked to; the answer says that the guarantee was added in the javadoc for Java 9. – Joshua Taylor Jan 12 '17 at 02:20
  • Another potential difference: `groupingBy` has variations specifically enabling the `CONCURRENT` characteristic flag, and `partitioningBy` does not. (I freely admit there may be a trivially easy way to achieve concurrency with `partitioningBy`, maybe some kind of concurrent map, but I don't know the Collectors factory functions very well yet.) – Ti Strga Jan 27 '17 at 23:11
4

The partitioningBy method returns a map whose key is always a Boolean, whereas with groupingBy the key can be of any object type.

//groupingBy: the key type is whatever the classifier yields (declared as Object here)
Map<Object, List<Person>> list2 = list.stream().collect(Collectors.groupingBy(p -> p.getAge() == 22));
System.out.println("grouping by age -> " + list2);

//partitioningBy: the key type is always Boolean
Map<Boolean, List<Person>> list3 = list.stream().collect(Collectors.partitioningBy(p -> p.getAge() == 22));
System.out.println("partitioning by age -> " + list3);

As you can see, the map key for partitioningBy is always a Boolean, whereas for groupingBy it can be any type (declared as Object above, although this particular classifier only ever produces Boolean keys).
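To make the point about key types more concrete, here is a small sketch that groups by a non-boolean key, which partitioningBy cannot express. It reuses the strings from the question; the class and method names (KeyTypeDemo, byLength, byIsLong) are invented for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class KeyTypeDemo {

    // groupingBy: the classifier may return any type, here the string length.
    static Map<Integer, List<String>> byLength() {
        return Stream.of("I", "Love", "Stack Overflow")
                .collect(Collectors.groupingBy(String::length));
    }

    // partitioningBy: the key is always Boolean, so only two buckets are possible.
    static Map<Boolean, List<String>> byIsLong() {
        return Stream.of("I", "Love", "Stack Overflow")
                .collect(Collectors.partitioningBy(s -> s.length() > 3));
    }

    public static void main(String[] args) {
        System.out.println(byLength()); // three groups, keyed by 1, 4 and 14
        System.out.println(byIsLong()); // {false=[I], true=[Love, Stack Overflow]}
    }
}
```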

Detailed code is as follows:

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class Person {
    String name;
    int age;

    Person(String name, int age) {
        this.name = name;
        this.age = age;
    }

    public String getName() {
        return name;
    }

    public int getAge() {
        return age;
    }

    @Override
    public String toString() {
        return this.name;
    }
}

public class CollectorAndCollectPrac {
    public static void main(String[] args) {
        List<Person> list = new ArrayList<>();
        list.add(new Person("Kosa", 21));
        list.add(new Person("Saosa", 21));
        list.add(new Person("Tiuosa", 22));
        list.add(new Person("Komani", 22));
        list.add(new Person("Kannin", 25));
        list.add(new Person("Kannin", 25));
        list.add(new Person("Tiuosa", 22));

        // groupingBy: the key type is whatever the classifier yields (declared as Object here)
        Map<Object, List<Person>> list2 =
                list.stream().collect(Collectors.groupingBy(p -> p.getAge() == 22));
        System.out.println("grouping by age -> " + list2);

        // partitioningBy: the key type is always Boolean
        Map<Boolean, List<Person>> list3 =
                list.stream().collect(Collectors.partitioningBy(p -> p.getAge() == 22));
        System.out.println("partitioning by age -> " + list3);
    }
}
3

Another difference between groupingBy and partitioningBy is that the former takes a Function<? super T, ? extends K> and the latter a Predicate<? super T>.

When you pass a method reference or a lambda expression, such as s -> s.length() > 3, they can be used by either of these two methods (the compiler will infer the functional interface type based on the type required by the method you choose).

However, if you have a Predicate<T> instance, you can only pass it to Collectors.partitioningBy(). It won't be accepted by Collectors.groupingBy().

And similarly, if you have a Function<T,Boolean> instance, you can only pass it to Collectors.groupingBy(). It won't be accepted by Collectors.partitioningBy().
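A minimal sketch of that distinction, assuming hypothetical names (PredicateVsFunctionDemo, IS_LONG, IS_LONG_FN) that are not part of the original answer. It also shows that an existing instance can be adapted to the other collector with a method reference, which is one possible workaround rather than anything the API requires.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class PredicateVsFunctionDemo {

    static final Predicate<String> IS_LONG = s -> s.length() > 3;
    static final Function<String, Boolean> IS_LONG_FN = s -> s.length() > 3;

    static Stream<String> words() {
        return Stream.of("I", "Love", "Stack Overflow");
    }

    // A Predicate instance goes straight into partitioningBy.
    static Map<Boolean, List<String>> viaPredicate() {
        return words().collect(Collectors.partitioningBy(IS_LONG));
        // words().collect(Collectors.groupingBy(IS_LONG)); // does not compile
    }

    // A Function instance goes straight into groupingBy.
    static Map<Boolean, List<String>> viaFunction() {
        return words().collect(Collectors.groupingBy(IS_LONG_FN));
        // words().collect(Collectors.partitioningBy(IS_LONG_FN)); // does not compile
    }

    // Adapting with a method reference lets either instance feed the other collector.
    static Map<Boolean, List<String>> adaptedPredicate() {
        return words().collect(Collectors.groupingBy(IS_LONG::test));
    }

    static Map<Boolean, List<String>> adaptedFunction() {
        return words().collect(Collectors.partitioningBy(IS_LONG_FN::apply));
    }

    public static void main(String[] args) {
        System.out.println(viaPredicate());
        System.out.println(viaFunction());
        System.out.println(adaptedPredicate());
        System.out.println(adaptedFunction());
    }
}
```

For this input all four maps hold the same entries, because every bucket is non-empty; the results would differ only when one of the two groups has no elements.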

Eran
0

As noted in the other answers, splitting a collection into two groups is useful in some scenarios. Because both partitions always exist, the result is easier to use afterwards: neither lookup needs a null check. The JDK itself uses partitioningBy to separate the class files from the service configuration files in a JAR.

    private static final String SERVICES_PREFIX = "META-INF/services/";
    
    // scan the names of the entries in the JAR file
    Map<Boolean, Set<String>> map = jf.versionedStream()
            .filter(e -> !e.isDirectory())
            .map(JarEntry::getName)
            .filter(e -> (e.endsWith(".class") ^ e.startsWith(SERVICES_PREFIX)))
            .collect(Collectors.partitioningBy(e -> e.startsWith(SERVICES_PREFIX),
                                               Collectors.toSet()));

    Set<String> classFiles = map.get(Boolean.FALSE);
    Set<String> configFiles = map.get(Boolean.TRUE);

Code snippet is from jdk.internal.module.ModulePath#deriveModuleDescriptor

RamValli