
I have instances of a Student class.

class Student {
    String name;
    String addr;
    String type;

    public Student(String name, String addr, String type) {
        super();
        this.name = name;
        this.addr = addr;
        this.type = type;
    }

    @Override
    public String toString() {
        return "Student [name=" + name + ", addr=" + addr + "]";
    }

    public String getName() {
        return name;
    }

    public String getAddr() {
        return addr;
    }
}

I also have code that creates a map, where the student name is the key and a List of processed addr values is the value (a List, since the same student can have multiple addr values).

import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class FilterId {

    public static String getNum(String s) {
        // should do some complex stuff, just for testing
        return s.split(" ")[1];
    }

    public static void main(String[] args) {
        List<Student> list = new ArrayList<Student>();
        list.add(new Student("a", "test 1", "type 1"));
        list.add(new Student("a", "test 1", "type 2"));
        list.add(new Student("b", "test 1", "type 1"));
        list.add(new Student("c", "test 1", "type 1"));
        list.add(new Student("b", "test 1", "type 1"));
        list.add(new Student("a", "test 1", "type 1"));
        list.add(new Student("c", "test 3", "type 2"));
        list.add(new Student("a", "test 2", "type 1"));
        list.add(new Student("b", "test 2", "type 1"));
        list.add(new Student("a", "test 3", "type 1"));
        Map<String, List<String>> map = new HashMap<>();

        // Build a map from each distinct student name to the distinct
        // test numbers associated with that student.
        for (Student student : list) {
            if (map.containsKey(student.getName())) {
                List<String> numList = map.get(student.getName());
                String value = getNum(student.getAddr());

                if (!numList.contains(value)) {
                    numList.add(value);
                    map.put(student.getName(), numList);
                }
            } else {
                map.put(student.getName(), new ArrayList<>(Arrays.asList(getNum(student.getAddr()))));
            }
        }

        System.out.println(map.toString());
    }
}

The output would be: {a=[1, 2, 3], b=[1, 2], c=[1, 3]}

How can I do the same in Java 8 in a more elegant way, maybe using streams?

I found Collectors.toMap in Java 8 but couldn't find a way to do this with it.

I tried joining the elements as comma-separated values, but that didn't work: I couldn't figure out an easy way to remove the duplicates, and the output is not what I need.

Map<String, String> map2 = new HashMap<>();
map2 = list.stream().collect(Collectors.toMap(Student::getName, Student::getAddr, (a, b) -> a + " , " + b));
System.out.println(map2.toString());
// {a=test 1 , test 1 , test 1 , test 2 , test 3, b=test 1 , test 1 , test 2, c=test 1 , test 3}
Pshemo
prime

4 Answers


With streams, you could use Collectors.groupingBy along with Collectors.mapping:

Map<String, Set<String>> map = list.stream()
    .collect(Collectors.groupingBy(
        Student::getName,
        Collectors.mapping(student -> getNum(student.getAddr()),
            Collectors.toSet())));

I've chosen to create a map of sets instead of a map of lists, as it seems that you don't want duplicates in the lists.
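To see the collector in context, here is a minimal, self-contained sketch. The trimmed-down `Student` class, the `GroupingSketch` class name and the sample data are assumptions made for illustration, and `getNum` mirrors the placeholder from the question. The sketch also swaps in `TreeMap` and `Collectors.toCollection(TreeSet::new)` so that the printed order is deterministic, since `HashMap` and `HashSet` make no ordering guarantees.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.stream.Collectors;

class GroupingSketch {

    // Trimmed-down stand-in for the Student class from the question (assumption).
    static final class Student {
        private final String name;
        private final String addr;

        Student(String name, String addr) {
            this.name = name;
            this.addr = addr;
        }

        String getName() { return name; }
        String getAddr() { return addr; }
    }

    // Placeholder processing step, as in the question.
    static String getNum(String s) {
        return s.split(" ")[1];
    }

    // Groups the processed addresses by student name.
    // Sorted containers are used only to keep the printed output stable.
    static Map<String, Set<String>> group(List<Student> students) {
        return students.stream()
            .collect(Collectors.groupingBy(
                Student::getName,
                TreeMap::new,
                Collectors.mapping(s -> getNum(s.getAddr()),
                    Collectors.toCollection(TreeSet::new))));
    }

    public static void main(String[] args) {
        List<Student> students = Arrays.asList(
            new Student("a", "test 1"),
            new Student("b", "test 1"),
            new Student("a", "test 2"),
            new Student("a", "test 1"),
            new Student("c", "test 3"));

        System.out.println(group(students)); // {a=[1, 2], b=[1], c=[3]}
    }
}
```

If ordering does not matter, the two-argument `groupingBy` with `Collectors.toSet()` shown above is enough.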


If you do need lists instead of sets, it's more efficient to first collect to sets and then convert the sets to lists:

Map<String, List<String>> map = list.stream()
    .collect(Collectors.groupingBy(
        Student::getName,
        Collectors.mapping(s -> getNum(s.getAddr()),
            Collectors.collectingAndThen(Collectors.toSet(), ArrayList::new))));

This uses Collectors.collectingAndThen, which first collects and then transforms the result.
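As a hedged illustration of that two-phase behaviour, the sketch below deduplicates into a set and then copies each set into an `ArrayList`. The class name, the minimal `Student` stand-in and the sample data are assumptions. It uses `Collectors.toCollection(LinkedHashSet::new)` rather than `Collectors.toSet()` so that each list keeps first-seen order, and `TreeMap::new` so that the keys print in a stable order; neither choice comes from the answer above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class CollectingAndThenSketch {

    // Minimal stand-in for the Student class from the question (assumption).
    static final class Student {
        private final String name;
        private final String addr;

        Student(String name, String addr) {
            this.name = name;
            this.addr = addr;
        }

        String getName() { return name; }
        String getAddr() { return addr; }
    }

    // Placeholder processing step, as in the question.
    static String getNum(String s) {
        return s.split(" ")[1];
    }

    // Phase 1: collect distinct values into a LinkedHashSet.
    // Phase 2: the finisher (ArrayList::new) copies each set into a list.
    static Map<String, List<String>> group(List<Student> students) {
        return students.stream()
            .collect(Collectors.groupingBy(
                Student::getName,
                TreeMap::new,
                Collectors.mapping(s -> getNum(s.getAddr()),
                    Collectors.collectingAndThen(
                        Collectors.toCollection(LinkedHashSet::new),
                        ArrayList::new))));
    }

    public static void main(String[] args) {
        List<Student> students = Arrays.asList(
            new Student("a", "test 2"),
            new Student("a", "test 1"),
            new Student("a", "test 2"),
            new Student("b", "test 3"));

        System.out.println(group(students)); // {a=[2, 1], b=[3]}
    }
}
```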


Another more compact way, without streams:

Map<String, Set<String>> map = new HashMap<>(); // or LinkedHashMap
list.forEach(s -> 
    map.computeIfAbsent(s.getName(), k -> new HashSet<>()) // or LinkedHashSet
        .add(getNum(s.getAddr())));

This variant uses Iterable.forEach to iterate the list and Map.computeIfAbsent to group transformed addresses by student name.
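The reason no explicit `put` is needed is that `Map.computeIfAbsent` returns the value now associated with the key, whether it already existed or was just created. The small sketch below isolates that behaviour; the `addTo` helper and the class name are hypothetical and exist only for this illustration.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

class ComputeIfAbsentSketch {

    // Hypothetical helper: adds value to the set stored under key,
    // creating the set on first use. Returns what Set.add returns.
    static boolean addTo(Map<String, Set<String>> map, String key, String value) {
        // computeIfAbsent returns the existing set or the freshly created one,
        // so no separate containsKey/get/put sequence is needed.
        return map.computeIfAbsent(key, k -> new LinkedHashSet<String>()).add(value);
    }

    public static void main(String[] args) {
        Map<String, Set<String>> map = new LinkedHashMap<>();
        System.out.println(addTo(map, "a", "1")); // true: new key, new value
        System.out.println(addTo(map, "a", "1")); // false: the set ignores the duplicate
        System.out.println(addTo(map, "a", "2")); // true: existing key, new value
        System.out.println(addTo(map, "b", "1")); // true
        System.out.println(map);                  // {a=[1, 2], b=[1]}
    }
}
```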

fps
  • Like the use of `Collectors.collectingAndThen`. – tsolakp Feb 02 '18 at 17:52
  • Great. One small thing: what if I need to prepend something to the key, so that the output is `{student-a=[1, 2, 3], student-b=[1, 2], student-c=[1, 3]}`? – prime Feb 07 '18 at 07:37
  • @prime just replace `Student::getName` with `s -> "student-" + s.getName()` – fps Feb 07 '18 at 10:57
  • @FedericoPeraltaSchaffner great. One more thing: what if we need to exclude some elements, so that the output is `{a=[1, 2], b=[1, 2], c=[1, 3]}`? Here `a`'s list does not have the `3` because it failed some condition in `getNum`. How can we handle a condition like that? Example: if the type is `type 3`, then we don't need to add that to the list. – prime Feb 07 '18 at 11:41
  • @prime if you are on Java 9, change getNum so that it returns a stream, if the condition is true, return a one element stream, if it's false return an empty stream. Then, when collecting, use Collectors.flatMapping instead of Collectors.mapping. there are other ways, maybe you need to write another question for that case, linking to this question for context – fps Feb 07 '18 at 12:04
  • @FedericoPeraltaSchaffner well it's on java 8 though – prime Feb 07 '18 at 12:09
  • @prime there's a backport from Holger, search it here in SO, it's a small static utility method. Something like flatmapping collector in Java 8, but honestly I would ask a new question asking specifically about the conditional adding of elements to the downstream set. So this question remains clear for future readers as to mapping and grouping, while the new one is about filtering in this context. – fps Feb 07 '18 at 12:13
  • @FedericoPeraltaSchaffner sure will ask a new one. – prime Feb 07 '18 at 12:16
  • @prime just for my understanding... Please confirm... You want to exclude some results returned by `getNum` based on a condition, but what would this condition be and what would it depend on? – fps Feb 07 '18 at 14:53
  • @FedericoPeraltaSchaffner Sure, consider this ex : If the type is type 3 then we don't need to add that to the set (in the map). – prime Feb 07 '18 at 17:55
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/164700/discussion-between-federico-peralta-schaffner-and-prime). – fps Feb 07 '18 at 18:24
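The conditional exclusion discussed in these comments can be approximated on Java 8 with a plain `filter` before grouping. This is a swapped-in technique, not the `Collectors.flatMapping` approach or the backport mentioned above. The sketch assumes a hypothetical `getType()` accessor, which the `Student` class in the question does not have, and uses sorted containers only to keep the output stable.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.stream.Collectors;

class FilterBeforeGroupingSketch {

    // Stand-in Student with a hypothetical getType() accessor (assumption).
    static final class Student {
        private final String name;
        private final String addr;
        private final String type;

        Student(String name, String addr, String type) {
            this.name = name;
            this.addr = addr;
            this.type = type;
        }

        String getName() { return name; }
        String getAddr() { return addr; }
        String getType() { return type; }
    }

    // Placeholder processing step, as in the question.
    static String getNum(String s) {
        return s.split(" ")[1];
    }

    // Drops every "type 3" student before grouping, so their numbers never reach the sets.
    static Map<String, Set<String>> group(List<Student> students) {
        return students.stream()
            .filter(s -> !"type 3".equals(s.getType()))
            .collect(Collectors.groupingBy(
                Student::getName,
                TreeMap::new,
                Collectors.mapping(s -> getNum(s.getAddr()),
                    Collectors.toCollection(TreeSet::new))));
    }

    public static void main(String[] args) {
        List<Student> students = Arrays.asList(
            new Student("a", "test 1", "type 1"),
            new Student("a", "test 2", "type 1"),
            new Student("a", "test 3", "type 3"),
            new Student("b", "test 1", "type 1"));

        System.out.println(group(students)); // {a=[1, 2], b=[1]}
    }
}
```

One caveat of filtering upstream: a name whose entries are all excluded does not appear as a key at all, whereas a `flatMapping`-style downstream collector would keep the key with an empty set.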

First of all, the current solution is not really elegant, regardless of any streaming solution.

The pattern of

if (map.containsKey(k)) {
    Value value = map.get(k);
    ...
} else {
    map.put(k, new Value());
}

can often be simplified with Map#computeIfAbsent. In your example, this would be

// This will create a Map from the (distinct) student names to the
// distinct list of test numbers associated with them.
for (Student student : list)
{
    List<String> numList = map.computeIfAbsent(
        student.getName(), s -> new ArrayList<String>());
    String value = getNum(student.getAddr());
    if (!numList.contains(value))
    {
        numList.add(value);
    }
}

(This is a Java 8 method, but it is unrelated to streams.)


Next, the data structure that you want to build there does not seem to be the most appropriate one. In general, the pattern of

if (!list.contains(someValue)) {
    list.add(someValue);
}

is a strong sign that you should not use a List, but a Set. The set will contain each element only once, and you will avoid the contains calls on the list, which are O(n) and thus may be expensive for larger lists.

Even if you really need a List in the end, it is often more elegant and efficient to first collect the elements in a Set, and afterwards convert this Set into a List in one dedicated step.
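As a small sketch of that "collect into a Set, convert once" idea, the hypothetical helper below deduplicates with a `LinkedHashSet` (which keeps first-seen order) and copies the result into a `List` in a single step. The class and method names are illustrative only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class DedupSketch {

    // Hypothetical helper: removes duplicates while keeping first-seen order.
    static List<String> distinctInOrder(List<String> values) {
        // The set rejects duplicates in (amortized) constant time per element,
        // instead of scanning a list with contains() before every add.
        Set<String> seen = new LinkedHashSet<>(values);
        // One dedicated conversion step at the end.
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        List<String> numbers = Arrays.asList("1", "1", "1", "2", "3");
        System.out.println(distinctInOrder(numbers)); // [1, 2, 3]
    }
}
```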

So the first part could be solved like this:

// This will create a Map from the (distinct) student names to the
// distinct test numbers associated with them.
Map<String, Collection<String>> map = new HashMap<>();
for (Student student : list)
{
    String value = getNum(student.getAddr());
    map.computeIfAbsent(student.getName(), s -> new LinkedHashSet<String>())
        .add(value);
}

It will create a Map<String, Collection<String>>. This can then be converted into a Map<String, List<String>>:

// Convert the 'Collection' values of the map into 'List' values 
Map<String, List<String>> result = 
    map.entrySet().stream().collect(Collectors.toMap(
        Entry::getKey, e -> new ArrayList<String>(e.getValue())));

Or, more generically, using a utility method for this:

private static <K, V> Map<K, List<V>> convertValuesToLists(
    Map<K, ? extends Collection<? extends V>> map)
{
    return map.entrySet().stream().collect(Collectors.toMap(
        Entry::getKey, e -> new ArrayList<V>(e.getValue())));
}

I do not recommend this, but you also could convert the for loop into a stream operation:

Map<String, Set<String>> map = 
    list.stream().collect(Collectors.groupingBy(
        Student::getName, LinkedHashMap::new,
        Collectors.mapping(
            s -> getNum(s.getAddr()), Collectors.toSet())));

Alternatively, you could do the "grouping by" and the conversion from Set to List in one step:

Map<String, List<String>> result = 
    list.stream().collect(Collectors.groupingBy(
        Student::getName, LinkedHashMap::new,
        Collectors.mapping(
            s -> getNum(s.getAddr()), 
            Collectors.collectingAndThen(
                Collectors.toSet(), ArrayList<String>::new))));

Or you could introduce your own collector that does the List#contains call, but all this tends to be far less readable than the other solutions...
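For completeness, such a collector could look roughly like the sketch below, built with `Collector.of`. The name `toDistinctList` is hypothetical, and the sketch is only meant to show why this route is more verbose; it still pays the O(n) `contains` cost on every element.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Stream;

class DistinctListCollectorSketch {

    // Hypothetical collector: accumulates into a List, skipping values already present.
    static <T> Collector<T, ?, List<T>> toDistinctList() {
        return Collector.<T, List<T>>of(
            ArrayList::new,                       // supplier: a fresh result list
            (list, value) -> {                    // accumulator: add only unseen values
                if (!list.contains(value)) {
                    list.add(value);
                }
            },
            (left, right) -> {                    // combiner: merge partial results
                for (T value : right) {
                    if (!left.contains(value)) {
                        left.add(value);
                    }
                }
                return left;
            });
    }

    public static void main(String[] args) {
        List<String> result = Stream.of("1", "1", "2", "1", "3")
            .collect(toDistinctList());
        System.out.println(result); // [1, 2, 3]
    }
}
```

It could be passed as the downstream collector of `Collectors.mapping` in place of `Collectors.toSet()`.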

Marco13
  • Upvoted! This is a very complete answer. I think you could have used `replaceAll` on each map's values instead of streaming on the entry sets and then collecting to new maps. – fps Feb 02 '18 at 18:10
  • @FedericoPeraltaSchaffner Yes, `replaceAll` (which I admittedly did not have on the screen until now) may be an alternative, depending on which type information you want to have for the map values (`Collection` vs `? extends Collection` - and maybe it *must* be a `List`...?). But this is only one of many degrees of freedom in the answer to this question. I think that when these degrees of freedom show up as different options of deeply nested `Collector` implementations, one should consider breaking it down into a `for` loop with simple, *named* operations, for the sake of readability... – Marco13 Feb 02 '18 at 18:32
  • Agreed, downstream collectors tend to become unreadable after the 2nd level, that's why I'd stay with the `Map.computeIfAbsent` solution. And using `Collectors.collectingAndThen` has always seemed too verbose to add a simple finishing transformation... – fps Feb 02 '18 at 18:41
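The `replaceAll` alternative mentioned in these comments might look like the sketch below. It converts the values in place, so the map's static type stays `Map<String, Collection<String>>`, which is the type-information trade-off raised above. The class and method names are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;

class ReplaceAllSketch {

    // Hypothetical helper: replaces every value with an ArrayList copy, in place.
    static void valuesToLists(Map<String, Collection<String>> map) {
        map.replaceAll((name, values) -> new ArrayList<String>(values));
    }

    public static void main(String[] args) {
        Map<String, Collection<String>> map = new LinkedHashMap<>();
        map.computeIfAbsent("a", k -> new LinkedHashSet<String>()).add("1");
        map.computeIfAbsent("a", k -> new LinkedHashSet<String>()).add("2");
        map.computeIfAbsent("b", k -> new LinkedHashSet<String>()).add("1");

        valuesToLists(map);

        System.out.println(map);                               // {a=[1, 2], b=[1]}
        System.out.println(map.get("a") instanceof ArrayList); // true
    }
}
```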

I think you are looking for something like the following:

Map<String, Set<String>> map = list.stream()
    .collect(Collectors.groupingBy(
        Student::getName,
        Collectors.mapping(e -> getNum(e.getAddr()), Collectors.toSet())));

System.out.println("Map : " + map);
Amit Bera

Here is a version that collects everything in sets, and converts the final result to array lists:

/*
import java.util.*;
import java.util.stream.*;
import static java.util.stream.Collectors.*;
import java.util.function.*;
*/

Map<String, List<String>> map2 = list.stream().collect(groupingBy(
  Student::getName, // we will group the students by name
  Collector.of(
    HashSet::new, // for each student name, we will collect result in a hash set
    (arr, student) -> arr.add(getNum(student.getAddr())), // which we fill with processed addresses
    (left, right) -> { left.addAll(right); return left; }, // we merge subresults like this
    (Function<HashSet<String>, List<String>>) ArrayList::new // finish by converting to List
  )
));
System.out.println(map2);

// Output:
// {a=[1, 2, 3], b=[1, 2], c=[1, 3]}

EDIT: made the finisher shorter using Marco13's hint.

Andrey Tyukin
  • @Marco13 there's always something I can learn from Marco13 :] Thank you for the hint! I updated the answer. A separate variable declaration for the collector does not seem to be necessary. A `(Function<HashSet<String>, List<String>>)` cast seems to be enough to select the right constructor of `ArrayList`. So, it's not a cast, it's rather some kind of type ascription, to be more precise. – Andrey Tyukin Feb 02 '18 at 21:51
  • OK, then I'll delete the distracting comment. But I also was a bit irritated why pulling it out into a variable (or "casting" as in your case) seemed to be necessary. I wonder whether this is expected, why the type inference seemed to hit a limit there, or whether there is a "nicer" solution, and consider asking this as a separate question (if I can't figure it out on my own, but I'll first have to take a closer look at that) – Marco13 Feb 03 '18 at 01:23
  • I had another look at this. The reason why it cannot infer the type of the finisher is likely related to the fact that the return type cannot be derived from the `Collector.of` call alone: None of its arguments defines `R`, the return type, and it seems like it is not able to hand the target type information from the `map2` declaration through the `collect` and `groupingBy` calls up to the `Collector.of` call. Related: https://stackoverflow.com/a/24798163/3182664 (BTW: The current code issues an "unchecked" warning, which can be avoided with `ArrayList<String>::new`) – Marco13 Feb 03 '18 at 15:12
  • @Marco13 openjdk version "1.8.0_144" issues nothing, compiles just fine even with -Werror ?.. – Andrey Tyukin Feb 03 '18 at 18:18
  • You're right. I tried it in Eclipse Neon (pretty old...) and it issued a warning, but with the latest (Eclipse Oxygen) everything is fine. Seems to be an ECJ bug that has been fixed in the meantime. – Marco13 Feb 03 '18 at 19:49