7

I have a Java bean like this:

class EmployeeContract {
    Long id;
    Date date;
    // getters and setters omitted
}

If I have a long list of these, containing duplicates by id but with different dates, such as:

1, 2015/07/07
1, 2018/07/08
2, 2015/07/08
2, 2018/07/09

How can I reduce such a list so that only the entry with the most recent date is kept for each id? For example:

1, 2018/07/08
2, 2018/07/09

Preferably using Java 8.

I've started with something like:

contract.stream()
         .collect(Collectors.groupingBy(EmployeeContract::getId, Collectors.mapping(EmployeeContract::getId, Collectors.toList())))
                    .entrySet().stream().findFirst();

That gives me the grouping by id, but I'm stuck on how to collect it into a result list. My stream skills are not too strong, I'm afraid.

ETO
Nestor Milyaev
  • I wanted to post an answer, but this one was closed too fast... `yourList.stream().collect(Collectors.toMap(EmployeeContract::getId, Function.identity(), BinaryOperator.maxBy(Comparator.comparing(EmployeeContract::getDate).reversed()))).values();` – Eugene Nov 27 '18 at 17:22
  • @Eugene instead of `BinaryOperator.maxBy( … .reversed())`, you can use `BinaryOperator.minBy(…)`. Though in this case, it looks like the OP wants `maxBy`, without `.reversed()`. – Holger Nov 27 '18 at 17:43
  • @Holger given that this (`values()`) would return a `Collection` and not precisely a `List`, is there a concise way to resolve that? – Naman Nov 27 '18 at 17:51
  • Given there is a valid discussion and it's a bona fide question, perhaps it's worth un-holding this question? – Nestor Milyaev Nov 27 '18 at 17:53
  • @Holger indeed... – Eugene Nov 27 '18 at 17:55
  • @nullpointer if it really needs to be a `List`, you can a) wrap the entire expression in a `new ArrayList<>( … )` or b) wrap the collector in a `Collectors.collectingAndThen( …, m -> new ArrayList<>(m.values()))`. – Holger Nov 27 '18 at 17:57
  • Use [`LocalDate`](https://docs.oracle.com/javase/10/docs/api/java/time/LocalDate.html) for a date-only value without time-of-day and without time zone. Never use `Date` (a terrible class, now legacy). – Basil Bourque Nov 27 '18 at 20:56

4 Answers

12

Well, I'll just post my comment here as an answer:

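yourList.stream()
        .collect(Collectors.toMap(
                EmployeeContract::getId,      // key: the contract id
                Function.identity(),          // value: the contract itself
                // on duplicate ids, keep the contract with the later date
                BinaryOperator.maxBy(Comparator.comparing(EmployeeContract::getDate))))
        .values();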

Note that this gives you a Collection rather than a List, in case that matters to you.
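For anyone who wants to try this out, here is a minimal self-contained sketch of the same `toMap` plus `maxBy` idea, run against the sample data from the question. The class name `LatestContractDemo`, the helper `latestPerId` and the two-argument constructor are made up for illustration, and `LocalDate` stands in for `Date` (an assumption, following the suggestion in the comments):

```java
import java.time.LocalDate;
import java.util.Arrays;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.stream.Collectors;

public class LatestContractDemo {

    // Minimal stand-in for the bean in the question (LocalDate instead of Date is an assumption).
    public static class EmployeeContract {
        private final Long id;
        private final LocalDate date;

        public EmployeeContract(Long id, LocalDate date) {
            this.id = id;
            this.date = date;
        }

        public Long getId() { return id; }
        public LocalDate getDate() { return date; }

        @Override
        public String toString() { return id + ", " + date; }
    }

    // Hypothetical helper: keeps, for every id, the contract with the most recent date.
    public static Collection<EmployeeContract> latestPerId(List<EmployeeContract> contracts) {
        return contracts.stream()
                .collect(Collectors.toMap(
                        EmployeeContract::getId,
                        Function.identity(),
                        // merge function: called only when two contracts share an id
                        BinaryOperator.maxBy(Comparator.comparing(EmployeeContract::getDate))))
                .values();
    }

    public static void main(String[] args) {
        List<EmployeeContract> contracts = Arrays.asList(
                new EmployeeContract(1L, LocalDate.of(2015, 7, 7)),
                new EmployeeContract(1L, LocalDate.of(2018, 7, 8)),
                new EmployeeContract(2L, LocalDate.of(2015, 7, 8)),
                new EmployeeContract(2L, LocalDate.of(2018, 7, 9)));

        // Prints one contract per id; the iteration order of the map values is not guaranteed.
        latestPerId(contracts).forEach(System.out::println);
    }
}
```

The merge function is only invoked on key collisions, so ids that occur once pass through untouched.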

Eugene
1

You can do it in two steps, as follows:

List<EmployeeContract> finalContract = contract.stream() // Stream<EmployeeContract>
        .collect(Collectors.toMap(EmployeeContract::getId, 
                EmployeeContract::getDate, (a, b) -> a.after(b) ? a : b)) // Map<Long, Date> (Step 1)
        .entrySet().stream() // Stream<Entry<Long, Date>>
        .map(a -> new EmployeeContract(a.getKey(), a.getValue())) // Stream<EmployeeContract>
        .collect(Collectors.toList()); // Step 2

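First step: builds a Map<Long, Date> that maps each id to its most recent date; the merge function keeps the later of two dates whenever an id occurs more than once.

Second step: maps these key-value pairs back to EmployeeContract objects and collects them into the final List<EmployeeContract>.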
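As a rough, runnable sketch of the two steps above: the snippet assumes the bean has a two-argument `(Long, Date)` constructor, which the question does not show. The names `TwoStepDemo`, `latestPerId` and `day` are hypothetical helpers added only for this example:

```java
import java.util.Arrays;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class TwoStepDemo {

    // Stand-in for the question's bean; the (Long, Date) constructor is an assumption.
    public static class EmployeeContract {
        private final Long id;
        private final Date date;

        public EmployeeContract(Long id, Date date) {
            this.id = id;
            this.date = date;
        }

        public Long getId() { return id; }
        public Date getDate() { return date; }
    }

    // Hypothetical helper: builds a java.util.Date from a 1-based month.
    public static Date day(int year, int month, int dayOfMonth) {
        return new GregorianCalendar(year, month - 1, dayOfMonth).getTime();
    }

    public static List<EmployeeContract> latestPerId(List<EmployeeContract> contracts) {
        // Step 1: map each id to its most recent date.
        Map<Long, Date> latestDateById = contracts.stream()
                .collect(Collectors.toMap(
                        EmployeeContract::getId,
                        EmployeeContract::getDate,
                        (a, b) -> a.after(b) ? a : b));

        // Step 2: rebuild one contract per (id, date) pair.
        return latestDateById.entrySet().stream()
                .map(e -> new EmployeeContract(e.getKey(), e.getValue()))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<EmployeeContract> contracts = Arrays.asList(
                new EmployeeContract(1L, day(2015, 7, 7)),
                new EmployeeContract(1L, day(2018, 7, 8)),
                new EmployeeContract(2L, day(2015, 7, 8)),
                new EmployeeContract(2L, day(2018, 7, 9)));

        latestPerId(contracts)
                .forEach(c -> System.out.println(c.getId() + ", " + c.getDate()));
    }
}
```

One thing to keep in mind with this variant: the result contains newly created objects that carry only the id and the date, so any further fields on the original beans would not survive the round trip.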

Naman
1

Just to complement the existing answers, as you're asking:

how to collect that into a result list

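Here are some options (the snippets assume static imports of the Collectors methods, Function.identity, BinaryOperator.maxBy and Comparator.comparing):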

  • Wrap the values() into an ArrayList:

    List<EmployeeContract> list1 = 
        new ArrayList<>(list.stream()            
                            .collect(toMap(EmployeeContract::getId,                                                                          
                                           identity(),
                                           maxBy(comparing(EmployeeContract::getDate))))
                            .values());
    
  • Wrap the toMap collector into collectingAndThen:

    List<EmployeeContract> list2 = 
        list.stream()
            .collect(collectingAndThen(toMap(EmployeeContract::getId,
                                             identity(),
                                             maxBy(comparing(EmployeeContract::getDate))),
                     c -> new ArrayList<>(c.values())));
    
  • Collect the values to a new List using another stream:

    List<EmployeeContract> list3 = 
        list.stream()
            .collect(toMap(EmployeeContract::getId,
                           identity(),
                           maxBy(comparing(EmployeeContract::getDate))))
            .values()
            .stream()
            .collect(toList());
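For completeness, the `groupingBy` attempt from the question can probably be made to produce a list as well. The following is only a sketch under a few assumptions: the class name `GroupingByDemo`, the helper `latestPerId` and the two-argument constructor are invented for the example, and `LocalDate` is used in place of `Date`. Note that `maxBy` here is `Collectors.maxBy`, not `BinaryOperator.maxBy` as in the options above:

```java
import static java.util.Comparator.comparing;
import static java.util.stream.Collectors.groupingBy;
import static java.util.stream.Collectors.maxBy;
import static java.util.stream.Collectors.toList;

import java.time.LocalDate;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class GroupingByDemo {

    // Stand-in for the question's bean (constructor and LocalDate are assumptions).
    public static class EmployeeContract {
        private final Long id;
        private final LocalDate date;

        public EmployeeContract(Long id, LocalDate date) {
            this.id = id;
            this.date = date;
        }

        public Long getId() { return id; }
        public LocalDate getDate() { return date; }
    }

    public static List<EmployeeContract> latestPerId(List<EmployeeContract> list) {
        // Group by id and reduce each group to its most recent contract.
        Map<Long, Optional<EmployeeContract>> latestById = list.stream()
                .collect(groupingBy(EmployeeContract::getId,
                                    maxBy(comparing(EmployeeContract::getDate))));

        // groupingBy never creates an empty group, so every Optional holds a value.
        return latestById.values().stream()
                .map(Optional::get)
                .collect(toList());
    }

    public static void main(String[] args) {
        List<EmployeeContract> list = Arrays.asList(
                new EmployeeContract(1L, LocalDate.of(2015, 7, 7)),
                new EmployeeContract(1L, LocalDate.of(2018, 7, 8)),
                new EmployeeContract(2L, LocalDate.of(2015, 7, 8)),
                new EmployeeContract(2L, LocalDate.of(2018, 7, 9)));

        latestPerId(list)
                .forEach(c -> System.out.println(c.getId() + ", " + c.getDate()));
    }
}
```

Compared with `toMap`, this goes through an intermediate `Optional` per group, which is why the `toMap` variants tend to read more directly.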
    
ETO
0

With vavr.io you can do it like this:

var finalContract = Stream.ofAll(contract) //create io.vavr.collection.Stream
            .groupBy(EmployeeContract::getId)
            .map(tuple -> tuple._2.maxBy(EmployeeContract::getDate))
            .collect(Collectors.toList()); //result is list from java.util package
jker
  • Not sure I understand the difference from a classic Stream from java.util.stream? – azro Nov 27 '18 at 22:35
  • In io.vavr, collection objects are immutable, and they have map(), filter(), etc. methods. – jker Nov 27 '18 at 22:39