Optimize Java 8 functional algorithm compared to its procedural counter part

Question

I'm trying to optimize a functional operation in Java 8 compared to its procedural equivalent but I'm having some serious performance issue.

Situation

I must parse HTTP Headers String, List<String> from the value of a given enumeration which maps a HeaderName to many possible variations String, Set<String>.

Example

Given the following HttpHeaders :

public static final Map<String, List<String>> httpHeaders = new HashMap<>();

httpHeaders.put("Content-Type", Arrays.asList("application/json", "text/x-json"));
httpHeaders.put("SID", Arrays.asList("ABC123"));
httpHeaders.put("CORRELATION-ID", Arrays.asList("ZYX666"));

And my custom enumeration :

LogHeaders

protected final String key;
protected final Set<String> variation;

SESSION_ID("_sid", Arrays.asList("SESSION-ID", "SID"));    
CORRELATION_ID("cid", Arrays.asList("CORRELATION-ID", "CID")),


  private LogHeaders(final String logKey, final List<String> logKeyVariations) {

        this.logKey = logKey;
        this.logKeyVariations = new HashSet<>(logKeyVariations);
    }

@Override
    public String toString() {

        return this.logKey;
    }

The result should be the a map of "LogHeaders.key" with the value set of the corresponding variations from the HttpHeaders. Only one variation is possible for a given header :

// {LogHeaders.key : HttpHeaderValue>
{
_sid=[ABC123], 
_cid=[ZYX666]
}

Procedural code

final Map<String, List<String>> logHeadersToValue = new HashMap<>();

for (final LogHeaders header : LogHeaders.values()) {
  for (final String variation : header.getLogKeyVariations()) {
    final List<String> headerValue = httpHeaders.get(variation);
      if (headerValue != null) {
        logHeadersToValue.put(header.logKey, headerValue);
        break;
      }
  }
}

Functional code

final Map<String, List<String>> logHeadersToValue =
EnumSet.allOf(LogHeaders.class)
  .stream()
  .collect(Collectors.toMap(
    LogHeaders::toString,
    logHeader -> logHeader.getLogKeyVariations().stream()
      .map(variation -> httpHeaders.get(variation)).filter(Objects::nonNull)
      .collect(singletonCollector())));


public static <T> Collector<T, ?, T> singletonCollector() {

  return Collectors.collectingAndThen(Collectors.toList(), list -> {
      if (list.size() < 1) {
        return null;
      }
      return list.get(0);
    });
}

Current Benchmark

FunctionalParsing : 0.086s

ProceduralParsing : 0.001s

Do you have any idea how I could optimize my functional part?

Thank you

UPDATE BENCHMARK

I ran a 100k warmup + 100k iteration with @Tagir Valeev code :

FunctionalParsing : 0.040s

ProceduralParsing : 0.010s

UPDATE BENCHMARK #2

I ran a 100k warmup + 100k iteration with @Misha code :

FunctionalParsing : 0.025s

ProceduralParsing : 0.017s

First question is obviously, how did you measure performance, second one is how big are these collections roughly? — biziclop, Jul 14 '15 at 16:18
You seem to be a victim of a classic mistake when it comes to any new and shiny functionality f: use f for the sake of using f and you're f; your old code works just fine, so why bother trying using f when it clearly has no benefits? — fge, Jul 14 '15 at 16:22
forget performance - the "functional" code is totally unreadable — ZhongYu, Jul 14 '15 at 16:44
I'm open to any "readability" suggestion as well. I used Junit test output. I will edit my question with JMH benchmark when I have them ready. — Daniel Marcotte, Jul 14 '15 at 16:59
@biziclop I used Junit, will update with proper benchmark based on your comments. For the collection size, the enum will be small but the HttpHeaders should be between 5 and 20~ — Daniel Marcotte, Jul 14 '15 at 17:16
@JacobZimmerman Parallel streams work well for medium/long-running operations. However, the thread synchronization overhead is not worth it for very quick computation. It brings down performances in my scenario. — Daniel Marcotte, Jul 15 '15 at 14:22
Ya parallel collections will almost always perform worse. I don't understand why you're trying to change this. Functional code will generally perform slightly worst and have greater space complexity. If you have working code then you don't need to change it. If you're going to refactor your code to be more functional, then it should be for readability and maintainability reasons, not to make it faster. And so, you shouldn't try to optimize unless there is a performance problem and you've found that this particular hunk is the worst in your app. — JasonG, Jul 15 '15 at 14:50
I was developing new code and I was curious about what is the most efficient approach. That's why I coded the functional and procedural side by side. — Daniel Marcotte, Jul 15 '15 at 14:57

score 4 · Answer 1 · answered Jul 14 '15 at 16:35

I absolutely sure that you did the benchmark incorrectly. It's likely that you executed it only once. You don't care if your program works 0.001 second or 0.086 seconds, right? It's still faster than you can blink. So you probably want to run this code many times. But it seems that you measured the time only once and erroneously suppose that every successive run will take roughly the same time. During the first launch code is mostly executed by interpreter, while it will be JIT-compiled later and will work much faster. This is really crucial for stream-related code.

As for your code it seems that having custom collector is unnecessary. You may implement it like this:

final Map<String, List<String>> logHeadersToValue =
        EnumSet.allOf(LogHeaders.class)
          .stream()
          .collect(Collectors.toMap(
            LogHeaders::toString,
            logHeader -> logHeader.getLogKeyVariations().stream()
              .map(httpHeaders::get).filter(Objects::nonNull)
              .findFirst().orElse(null)));

This solution may also be faster as it will not read more than one http header (like it's done via break in procedural code).

The results are from Junit run. I will run them 10k times and get some results for another 10k iterations to let the JIT do its work. Thank you for your proposal. — Daniel Marcotte, Jul 14 '15 at 17:02
I did a 100k warmup + a 100k benchmark. The functional parsing runs in 0.046s and the procedural parsing runs in 0.014s. Basically 4x~ slower which is not a big deal. — Daniel Marcotte, Jul 14 '15 at 17:31

score 2 · Accepted Answer · edited Jul 15 '15 at 15:27

Your functional code doesn't do the same thing as your original. If one of the LogHeaders fails to match a header, the old code would just skip it, while the functional code would throw a NullPointerException.

A direct translation of your original code to streams would look like this:

Map<String, List<String>> logHeadersToValue =  Arrays.stream(LogHeaders.values())
    .collect(
        HashMap::new,
        (map, logHeader) -> logHeader.getLogKeyVariations().stream()
            .filter(httpHeaders::containsKey)
            .findAny()
            .ifPresent(x -> map.put(logHeader.key, httpHeaders.get(x))),
        Map::putAll
    );

If you want this to be more performant and easy to read, consider precomputing a Map<String,String> of every variation to its key. You can do it by modifying the enum like this:

enum LogHeaders {

    SESSION_ID("_sid", "SESSION-ID", "SID"),
    CORRELATION_ID("cid", "CORRELATION-ID", "CID");

    final String key;
    final Map<String, String> variations;

    private LogHeaders(final String key, String... variation) {
        this.key = key;
        variations = Arrays.stream(variation).collect(collectingAndThen(
                toMap(x -> x, x -> key),
                Collections::unmodifiableMap
        ));
    }

    // unmodifiable map of every variation to its key
    public final static Map<String, String> variationToKey =
        Arrays.stream(LogHeaders.values())
            .flatMap(lh -> lh.variations.entrySet().stream())
            .collect(collectingAndThen(
                            toMap(Map.Entry<String, String>::getKey, Map.Entry<String, String>::getValue),
                            Collections::unmodifiableMap
            ));  // will throw if 2 keys have the same variation
}

This approach has the advantage of failing fast if there are duplicate variations. And now the code becomes very simple:

Map<String, List<String>> logHeadersToValue = LogHeaders.variationToKey.keySet().stream()
    .filter(httpHeaders::containsKey)
    .collect(toMap(LogHeaders.variationToKey::get, httpHeaders::get));

That's smart and it removes computation complexity in the algorithm. Thank you for your input. I'm updating my code and will post a new benchmark. — Daniel Marcotte, Jul 15 '15 at 13:31