
I have a legacy application that uses data structures like those in the following toy snippet, and I can't easily change them.

I use a Java 8 (only) stream to compute some statistics, but I cannot get the desired result type using Collectors.

package myIssueWithCollector;

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.stream.Collectors;

public class MyIssueWithCollector {

    public static Double latitude(Map<String, String> map) {
        String latitude = map.get("LATITUDE");
        return Double.valueOf(latitude);
    }

    private static int latitudeComparator(double d1, double d2) {
        // get around the fact that NaN > +Infinity in Double.compare()
        if (Double.isNaN(d1) && !Double.isNaN(d2)) {
            return -1;
        }
        if (!Double.isNaN(d1) && Double.isNaN(d2)) {
            return 1;
        }
        return Double.compare(Math.abs(d1), Math.abs(d2));
    }

    public static Map<String, String> createMap(String city, String country, String continent, String latitude) {
        Map<String, String> map = new HashMap<>();
        map.put("CITY", city);
        map.put("COUNTRY", country);
        map.put("CONTINENT", continent);
        map.put("LATITUDE", latitude);
        return map;
    }

    public static void main(String[] args) {

        // Cities with dummy latitudes
        // I cannot easily change these legacy data structures
        Map<String, String> map1 = createMap("London", "UK", "Europa", "48.1");
        Map<String, String> map2 = createMap("New York", "USA", "America", "42.4");
        Map<String, String> map3 = createMap("Miami", "USA", "America", "39.1");
        Map<String, String> map4 = createMap("Glasgow", "UK", "Europa", "49.2");
        Map<String, String> map5 = createMap("Camelot", "UK", "Europa", "NaN");

        List<Map<String, String>> maps = new ArrayList<>();
        maps.add(map1);
        maps.add(map2);
        maps.add(map3);
        maps.add(map4);
        maps.add(map5);

        //////////////////////////////////////////////////////////////////
        // My issue starts here:
        //////////////////////////////////////////////////////////////////
        Map<String, Map<String, Double>> result = maps.stream()
            .collect(Collectors.groupingBy(m -> m.get("CONTINENT"),
                Collectors.groupingBy(m -> m.get("COUNTRY"), Collectors.reducing(Double.NaN, m -> latitude(m),
                    BinaryOperator.maxBy((d1, d2) -> latitudeComparator(d1, d2))))));

        System.out.println(result);
    }
}

I need the result type to be Map<String, Map<String, String>> instead of Map<String, Map<String, Double>>, converting "LATITUDE" back from Double to String with a custom format (not Double.toString()).

I could not achieve this with methods such as andThen or Collectors.collectingAndThen.

I am currently stuck with Java 8.

Is there a way to get a Map<String, Map<String, String>> result using the same stream?

Christophe
  • Is the custom format identical to the source format? – Holger Jan 26 '22 at 12:19
  • @Holger, the custom format will not necessarily be the same as the source format. For example, it may have a different decimal separator or a different decimal precision. – Christophe Jan 26 '22 at 12:29

3 Answers


Instead of using Collectors.reducing(…) with BinaryOperator.maxBy(…) you can also use Collectors.maxBy. Since this collector doesn’t support an identity value, it requires a finisher function to extract the value from an Optional, but your task requires a finisher anyway, to format the value.

Map<String, Map<String,String>> result = maps.stream()
    .collect(Collectors.groupingBy(m -> m.get("CONTINENT"),
        Collectors.groupingBy(m -> m.get("COUNTRY"),
            Collectors.mapping(MyIssueWithCollector::latitude,
                Collectors.collectingAndThen(
                    Collectors.maxBy(MyIssueWithCollector::latitudeComparator),
                    o -> format(o.get()))))));

This assumes format is your custom formatting function, for example:

private static String format(double d) {
    return String.format("%.2f", d);
}
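
For anyone who wants to try the maxBy variant in isolation, here is a minimal self-contained sketch. The class name MaxByExample, the shortened city helper, and the use of Locale.ROOT in format are assumptions made for this example, not part of the original code; the comparator is copied from the question.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.stream.Collectors;

class MaxByExample {

    static Double latitude(Map<String, String> map) {
        return Double.valueOf(map.get("LATITUDE"));
    }

    // Ordering from the question: NaN ranks below every number,
    // everything else is compared by absolute value.
    static int latitudeComparator(double d1, double d2) {
        if (Double.isNaN(d1) && !Double.isNaN(d2)) return -1;
        if (!Double.isNaN(d1) && Double.isNaN(d2)) return 1;
        return Double.compare(Math.abs(d1), Math.abs(d2));
    }

    // Assumption: Locale.ROOT keeps the decimal separator fixed to '.'.
    static String format(double d) {
        return String.format(Locale.ROOT, "%.2f", d);
    }

    // Hypothetical helper, a shortened version of createMap from the question.
    static Map<String, String> city(String country, String continent, String latitude) {
        Map<String, String> map = new HashMap<>();
        map.put("COUNTRY", country);
        map.put("CONTINENT", continent);
        map.put("LATITUDE", latitude);
        return map;
    }

    static Map<String, Map<String, String>> maxLatitudes(List<Map<String, String>> maps) {
        return maps.stream()
            .collect(Collectors.groupingBy(m -> m.get("CONTINENT"),
                Collectors.groupingBy(m -> m.get("COUNTRY"),
                    Collectors.mapping(MaxByExample::latitude,
                        Collectors.collectingAndThen(
                            Collectors.maxBy(MaxByExample::latitudeComparator),
                            // groupingBy only creates non-empty groups,
                            // so the Optional always holds a value here
                            o -> format(o.get()))))));
    }

    public static void main(String[] args) {
        List<Map<String, String>> maps = Arrays.asList(
            city("UK", "Europa", "48.1"),
            city("USA", "America", "42.4"),
            city("USA", "America", "39.1"),
            city("UK", "Europa", "49.2"),
            city("UK", "Europa", "NaN"));
        System.out.println(maxLatitudes(maps));
    }
}
```

With the sample data, the UK entry should come out as "49.20" and the USA entry as "42.40"; the printed order of the map keys is not guaranteed because groupingBy produces a HashMap.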

But sometimes, it might be worthwhile to implement your own collector instead of combining multiple built-in collectors (this needs an additional import of java.util.stream.Collector).

Map<String, Map<String,String>> result = maps.stream()
    .collect(Collectors.groupingBy(m -> m.get("CONTINENT"),
        Collectors.groupingBy(m -> m.get("COUNTRY"),
            Collector.of(
                () -> new double[]{Double.NEGATIVE_INFINITY},
                (a, m) -> {
                    double d = latitude(m);
                    if(!Double.isNaN(d)) a[0] = Double.max(a[0], d);
                },
                (a, b) -> a[0] >= b[0]? a: b,
                a -> format(a[0])))));

A collector maintains its state in a mutable container. This custom collector uses an array of length one to hold a double value, which eliminates the need to box it into Double objects. Instead of implementing a special comparator to treat NaN specially, it uses a conditional so that NaN never gets into the array in the first place. That’s why the combiner doesn’t need to care about NaN; it can simply return the container holding the larger of the two values.

The finisher function just invokes the custom format function with the double value.
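
As a rough sketch of how this could be packaged for reuse, the custom collector can be moved into a factory method. The names CustomCollectorExample, maxLatitude and city are hypothetical, and format with Locale.ROOT is an assumed stand-in for the real custom format. Two things to keep in mind: this version takes the plain maximum rather than comparing absolute values as the question's comparator does, and a group containing only NaN values would end up formatting negative infinity.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class CustomCollectorExample {

    // Assumed stand-in for the custom format.
    static String format(double d) {
        return String.format(Locale.ROOT, "%.2f", d);
    }

    // Hypothetical factory: the running maximum lives in a one-element array.
    static Collector<Map<String, String>, double[], String> maxLatitude() {
        return Collector.of(
            () -> new double[] {Double.NEGATIVE_INFINITY},
            (a, m) -> {
                double d = Double.parseDouble(m.get("LATITUDE"));
                // NaN is skipped, so it never reaches the container
                if (!Double.isNaN(d)) a[0] = Math.max(a[0], d);
            },
            // combiner: keep the container with the larger value
            (a, b) -> a[0] >= b[0] ? a : b,
            a -> format(a[0]));
    }

    // Hypothetical helper, a shortened version of createMap from the question.
    static Map<String, String> city(String country, String continent, String latitude) {
        Map<String, String> map = new HashMap<>();
        map.put("COUNTRY", country);
        map.put("CONTINENT", continent);
        map.put("LATITUDE", latitude);
        return map;
    }

    static Map<String, Map<String, String>> maxLatitudes(List<Map<String, String>> maps) {
        return maps.stream()
            .collect(Collectors.groupingBy(m -> m.get("CONTINENT"),
                Collectors.groupingBy(m -> m.get("COUNTRY"), maxLatitude())));
    }

    public static void main(String[] args) {
        List<Map<String, String>> maps = Arrays.asList(
            city("UK", "Europa", "48.1"),
            city("USA", "America", "42.4"),
            city("USA", "America", "39.1"),
            city("UK", "Europa", "49.2"),
            city("UK", "Europa", "NaN"));
        System.out.println(maxLatitudes(maps));
    }
}
```

Keeping the collector behind a method like maxLatitude() keeps the grouping expression short and lets the same reduction be reused at other grouping levels.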

Holger
  • Many thanks @Holger for your classical solution and your awesome solution implementing a custom collector. Both solutions work fine, but the one with a custom collector uses only 3 collectors instead of 5 and is probably much more efficient. Very useful. – Christophe Jan 26 '22 at 20:23
  • Actually, all solutions have just one collector, a composed one, as the composition happens first, even before the `collect` method is called. The performance differences in this regard are tiny, if ever noticeable. The eliminated boxing into `Double` objects can have an advantage, but only if you have rather large groups, in other words, lots of numbers to reduce to one final value. – Holger Jan 27 '22 at 08:01
  • Thanks again @Holger for this explanation, which shows deep knowledge of Java streams. – Christophe Jan 27 '22 at 08:34

You can use Collectors.collectingAndThen to convert the reduced double value to a corresponding String:

    Map<String, Map<String, String>> result = maps.stream().collect(
        Collectors.groupingBy(
            m -> m.get("CONTINENT"),
            Collectors.groupingBy(
                m -> m.get("COUNTRY"),
                Collectors.collectingAndThen(
                    Collectors.reducing(
                        Double.NaN,
                        m -> latitude(m),
                        BinaryOperator.maxBy(
                            (d1, d2) -> latitudeComparator(d1, d2)
                        )
                    ),
                    MyIssueWithCollector::myToString
                )
            )
        )
    );

Here, myToString is a method in the MyIssueWithCollector class that converts a double to a String using your custom format, for example,

    public static String myToString(double d) {
        return "[latitude=" + d + "]";
    }
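
Since the comments on the question mention that the custom format may use a different decimal separator or precision, here is one possible shape for such a myToString. The comma separator, the three decimals, and the class name LatitudeFormat are all assumptions chosen for illustration. NaN and infinities are handled explicitly because the text DecimalFormat produces for them depends on the configured symbols.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

class LatitudeFormat {

    // Hypothetical custom format: comma as decimal separator, three decimals.
    static String myToString(double d) {
        if (Double.isNaN(d) || Double.isInfinite(d)) {
            // fall back to the plain Java text for non-finite values
            return String.valueOf(d);
        }
        DecimalFormatSymbols symbols = new DecimalFormatSymbols(Locale.ROOT);
        symbols.setDecimalSeparator(',');
        // DecimalFormat is not thread-safe, so a fresh instance is created per call
        return new DecimalFormat("0.000", symbols).format(d);
    }

    public static void main(String[] args) {
        System.out.println(myToString(49.2));
        System.out.println(myToString(Double.NaN));
    }
}
```

A method with this signature can be passed as the finisher to Collectors.collectingAndThen in the same way as the simpler myToString above.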
tueda
  • Your solution works fine. I didn't succeed on my own. The stream/collector classes are really worth learning and mastering. Thanks a lot @tueda. – Christophe Jan 26 '22 at 12:22

Using Collectors.reducing, you can keep the latitude as a String in the identity so that the downstream collector returns a String.

Map<String, Map<String, String>> result = maps.stream()
  .collect(
    Collectors.groupingBy(m -> m.get("CONTINENT"),
      Collectors.groupingBy(m -> m.get("COUNTRY"),
        Collectors.reducing("NaN", m -> m.get("LATITUDE"),
          BinaryOperator.maxBy((s1, s2) -> latitudeComparator(Double.valueOf(s1), Double.valueOf(s2)))))));
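
Note that reducing over the strings returns one of the source strings unchanged (or the "NaN" identity). If a different output format is needed, one option is to wrap the reduction in Collectors.collectingAndThen. The sketch below assumes hypothetical helpers rawLatitude, compareRaw and formatRaw, and a two-decimal Locale.ROOT format as a stand-in for the real one.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.stream.Collectors;

class StringReducingExample {

    // Comparator from the question: NaN ranks lowest, others by absolute value.
    static int latitudeComparator(double d1, double d2) {
        if (Double.isNaN(d1) && !Double.isNaN(d2)) return -1;
        if (!Double.isNaN(d1) && Double.isNaN(d2)) return 1;
        return Double.compare(Math.abs(d1), Math.abs(d2));
    }

    static String rawLatitude(Map<String, String> m) {
        return m.get("LATITUDE");
    }

    // Parses both strings on every comparison.
    static int compareRaw(String s1, String s2) {
        return latitudeComparator(Double.parseDouble(s1), Double.parseDouble(s2));
    }

    // Assumed stand-in for the custom format, applied once per group.
    static String formatRaw(String s) {
        return String.format(Locale.ROOT, "%.2f", Double.parseDouble(s));
    }

    // Hypothetical helper, a shortened version of createMap from the question.
    static Map<String, String> city(String country, String continent, String latitude) {
        Map<String, String> map = new HashMap<>();
        map.put("COUNTRY", country);
        map.put("CONTINENT", continent);
        map.put("LATITUDE", latitude);
        return map;
    }

    static Map<String, Map<String, String>> maxLatitudes(List<Map<String, String>> maps) {
        return maps.stream().collect(
            Collectors.groupingBy(m -> m.get("CONTINENT"),
                Collectors.groupingBy(m -> m.get("COUNTRY"),
                    Collectors.collectingAndThen(
                        Collectors.reducing("NaN",
                            StringReducingExample::rawLatitude,
                            BinaryOperator.maxBy(StringReducingExample::compareRaw)),
                        StringReducingExample::formatRaw))));
    }

    public static void main(String[] args) {
        List<Map<String, String>> maps = Arrays.asList(
            city("UK", "Europa", "48.1"),
            city("USA", "America", "42.4"),
            city("USA", "America", "39.1"),
            city("UK", "Europa", "49.2"),
            city("UK", "Europa", "NaN"));
        System.out.println(maxLatitudes(maps));
    }
}
```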
douglas
  • Thanks for your valid answer, @douglas. It works fine too and is simple. I don't know whether the `String` to `Double` conversions add some overhead when reducing a stream with many items. – Christophe Jan 26 '22 at 20:02