4

Guava's Enums.ifPresent(Class, String) calls Enums.getEnumConstants under the hood:

@GwtIncompatible // java.lang.ref.WeakReference
static <T extends Enum<T>> Map<String, WeakReference<? extends Enum<?>>> getEnumConstants(
Class<T> enumClass) {
    synchronized (enumConstantCache) {
        Map<String, WeakReference<? extends Enum<?>>> constants = enumConstantCache.get(enumClass);
        if (constants == null) {
            constants = populateCache(enumClass);
        }
        return constants;
    }
}

Why does it need a synchronized block? Wouldn't that incur a heavy performance penalty? Java's Enum.valueOf(Class, String) does not appear to need one. Further on if synchronization is indeed necessary, why do it so inefficiently? One would hope if enum is present in cache, it can be retrieved without locking. Only lock if cache needs to be populated.

For Reference: Maven Dependency

<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>23.2-jre</version>
</dependency>

Edit: By locking I'm referring to a double checking lock.

  • `Wouldn't that incur a heavy performance penalty?` Not necessarily. In particular I think "heavy" is a flat no. The code in the synchronized block looks like its accessing a static (global) cache object, so the synchronization is likely needed for that. I'm guessing Java's `valueOf()` method accesses an *immutable* object and therefore doen't require mutex or other memory visibility operations. – markspace Dec 01 '17 at 20:25
  • It sounds like you're thinking of double checked locking in your last sentence. Uncontested synchronization is very fast on modern JVMs. I think it's being used here to create a memory barrier, so that when `constants` is populated, all other threads are guaranteed to see it immediately, but I will let someone with more expertise answer. – David Conrad Dec 01 '17 at 20:29
  • Oh right, good catch. `if enum is present in cache, it can be retrieved without locking.` No. No no no no. Visibility (between threads) in Java requires a lock by both a writer and then a reader, and both have to use the same lock. Anything else is broken in the Java memory model. Reads require locking just as writes do. This is why immutability is a big deal, it does *NOT* require the reader to do anything special for memory visibility. Please read Brian Goetz's [Java Concurrency in Practice](http://jcip.net/) for details. – markspace Dec 01 '17 at 20:38
  • https://softwareengineering.stackexchange.com/questions/150115/why-would-one-ever-want-to-use-a-synchronized-method-in-enum – alan07sl Dec 01 '17 at 20:47
  • @markspace you are right that heavy is far too strong of a word here. I was imagining a scenario with a ThreadPoolExecutor and say 10 threads, each one having to convert String values to Enums and then perform some task (example: making an API call). Wouldn't those 10 threads end up stepping on each others toes a little bit? If yes, then wouldn't using Java's valueOf() be preferred? Also thank you for the book recommendation. My understanding of concurrency is very mediocre and I would definitely benefit from reading it. – Vlad Poskatcheev Dec 01 '17 at 23:47
  • For questions like that, the only answer I can give is to run performance testing and look for bottlenecks. Most likely, no matter how many threads you run this small bit of code won't end up anywhere near the performance bottleneck. It's very rare that four lines of code will be hit so hard that it will cause multi-threading problems. – markspace Dec 02 '17 at 05:19
  • @markspace I ran a very crude performance test and posted it below. – Vlad Poskatcheev Dec 05 '17 at 22:04

2 Answers2

4

I've accepted @maaartinus answer, but wanted to write a separate "answer" about the circumstances behind the question and the interesting rabbit hole it lead me to.

tl;dr - Use Java's Enum.valueOf which is thread safe and does not sync unlike Guava's Enums.ifPresent. Also in majority of cases it probably doesn't matter.

Long story:

I'm working on a codebase that utilizes light weight java threads Quasar Fibers. In order to harness the power of Fibers, the code they run should be primarily async and non-blocking because Fibers are multiplexed to Java/OS Threads. It becomes very important that individual Fibers do not "block" the underlying thread. If underlying thread is blocked, it will block all Fibers running on it and performance degrades considerably. Guava's Enums.ifPresent is one of those blockers and I'm certain it can be avoided.

Initially, I started using Guava's Enums.ifPresent because it returns null on invalid enum values. Unlike Java's Enum.valueOf which throws IllegalArgumentException (which to my taste is less preferrable than a null value).

Here is a crude benchmark comparing various methods of converting to enums:

  1. Java's Enum.valueOf with catching IllegalArgumentException to return null
  2. Guava's Enums.ifPresent
  3. Apache Commons Lang EnumUtils.getEnum
  4. Apache Commons Lang 3 EnumUtils.getEnum
  5. My Own Custom Immutable Map Lookup

Notes:

  • Apache Common Lang 3 uses Java's Enum.valueOf under the hood and are hence identical
  • Earlier version of Apache Common Lang uses a very similar WeakHashMap solution to Guava but does not use synchronization. They favor cheap reads and more expensive writes (my knee jerk reaction says that's how Guava should have done it)
  • Java's decision to throw IllegalArgumentException is likely to have a small cost associated with it when dealing with invalid enum values. Throwing/catching exceptions isn't free.
  • Guava is the only method here that uses synchronization

Benchmark Setup:

  • uses an ExecutorService with a fixed thread pool of 10 threads
  • submits 100K Runnable tasks to convert enums
  • each Runnable task converts 100 enums
  • each method of converting enums will convert 10 million strings (100K x 100)

Benchmark Results from a run:

Convert valid enum string value:
    JAVA -> 222 ms
    GUAVA -> 964 ms
    APACHE_COMMONS_LANG -> 138 ms
    APACHE_COMMONS_LANG3 -> 149 ms
    MY_OWN_CUSTOM_LOOKUP -> 160 ms

Try to convert INVALID enum string value:
    JAVA -> 6009 ms
    GUAVA -> 734 ms
    APACHE_COMMONS_LANG -> 65 ms
    APACHE_COMMONS_LANG3 -> 5558 ms
    MY_OWN_CUSTOM_LOOKUP -> 92 ms

These numbers should be taken with a heavy grain of salt and will change depending on other factors. But they were good enough for me to conclude to go with Java's solution for the codebase using Fibers.

Benchmark Code:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

import com.google.common.base.Enums;
import com.google.common.collect.ImmutableMap;
import com.google.common.collect.ImmutableMap.Builder;

public class BenchmarkEnumValueOf {

    enum Strategy {
        JAVA,
        GUAVA,
        APACHE_COMMONS_LANG,
        APACHE_COMMONS_LANG3,
        MY_OWN_CUSTOM_LOOKUP;

        private final static ImmutableMap<String, Strategy> lookup;

        static {
            Builder<String, Strategy> immutableMapBuilder = ImmutableMap.builder();
            for (Strategy strategy : Strategy.values()) {
                immutableMapBuilder.put(strategy.name(), strategy);
            }

            lookup = immutableMapBuilder.build();
        }

        static Strategy toEnum(String name) {
            return name != null ? lookup.get(name) : null;
        }
    }

    public static void main(String[] args) {
        final int BENCHMARKS_TO_RUN = 1;

        System.out.println("Convert valid enum string value:");
        for (int i = 0; i < BENCHMARKS_TO_RUN; i++) {
            for (Strategy strategy : Strategy.values()) {
                runBenchmark(strategy, "JAVA", 100_000);
            }
        }

        System.out.println("\nTry to convert INVALID enum string value:");
        for (int i = 0; i < BENCHMARKS_TO_RUN; i++) {
            for (Strategy strategy : Strategy.values()) {
                runBenchmark(strategy, "INVALID_ENUM", 100_000);
            }
        }
    }

    static void runBenchmark(Strategy strategy, String enumStringValue, int iterations) {
        ExecutorService executorService = Executors.newFixedThreadPool(10);

        long timeStart = System.currentTimeMillis();

        for (int i = 0; i < iterations; i++) {
            executorService.submit(new EnumValueOfRunnable(strategy, enumStringValue));
        }

        executorService.shutdown();

        try {
            executorService.awaitTermination(1000, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }

        long timeDuration = System.currentTimeMillis() - timeStart;

        System.out.println("\t" + strategy.name() + " -> " + timeDuration + " ms");
    }

    static class EnumValueOfRunnable implements Runnable {

        Strategy strategy;
        String enumStringValue;

        EnumValueOfRunnable(Strategy strategy, String enumStringValue) {
            this.strategy = strategy;
            this.enumStringValue = enumStringValue;
        }

        @Override
        public void run() {
            for (int i = 0; i < 100; i++) {
                switch (strategy) {
                    case JAVA:
                        try {
                            Enum.valueOf(Strategy.class, enumStringValue);
                        } catch (IllegalArgumentException e) {}
                        break;
                    case GUAVA:
                        Enums.getIfPresent(Strategy.class, enumStringValue);
                        break;
                    case APACHE_COMMONS_LANG:
                        org.apache.commons.lang.enums.EnumUtils.getEnum(Strategy.class, enumStringValue);
                        break;
                    case APACHE_COMMONS_LANG3:
                        org.apache.commons.lang3.EnumUtils.getEnum(Strategy.class, enumStringValue);
                        break;
                    case MY_OWN_CUSTOM_LOOKUP:
                        Strategy.toEnum(enumStringValue);
                        break;
                }
            }
        }
    }

}
  • You really should use JMH for benchmarking as using strategy makes the result order-dependent (see e.g. my comments on this [answer](https://codereview.stackexchange.com/a/52344/14363) and sometimes wrong by orders of magnitude. Any *Java* benchmark taking less than a few seconds is worthless. +++ Note that reading `WeakHashMap` without synchronization is not about cheap reads - it's totally broken and [that's not only theory](https://access.redhat.com/solutions/55161). – maaartinus Dec 06 '17 at 01:16
  • When misses are very rare, the `valueOf` solution looks pretty good. As long as you don't care about class unloading, there's no reason not to use an `ImmutableMap`. In case you do, you could build a lock-free `WeakHashMap` (probably too much work, but funny). – maaartinus Dec 06 '17 at 01:19
2

I guess, the reason is simply that enumConstantCache is a WeakHashMap, which is not thread-safe.

Two threads writing to the cache at the same time could end up with an endless loop or alike (at least such thing happened with HashMap as I tried it years ago).

I guess, you could use DCL, but it mayn't be worth it (as stated in a comment).

Further on if synchronization is indeed necessary, why do it so inefficiently? One would hope if enum is present in cache, it can be retrieved without locking. Only lock if cache needs to be populated.

This may get too tricky. For visibility using volatile, you need a volatile read paired with a volatile write. You could get the volatile read easily by declaring enumConstantCache to be volatile instead of final. The volatile write is trickier. Something like

enumConstantCache = enumConstantCache;

may work, but I'm not sure about that.

10 threads, each one having to convert String values to Enums and then perform some task

The Map access is usually way faster than anything you do with the obtained value, so I guess, you'd need much more threads to get a problem.


Unlike HashMap, the WeakHashMap needs to perform some cleanup (called expungeStaleEntries). This cleanup gets performed even in get (via getTable). So get is a modifying operation and you really don't want to execute it concurrently.

Note that reading WeakHashMap without synchronization means performing mutation without locking and it's plain wrong and that's not only theory.

You'd need an own version of WeakHashMap performing no mutations in get (which is simple) and guaranteeing some sane behavior when written during read by a different thread (which may or may not be possible).

I guess, something like SoftReference<ImmutableMap<String, Enum<?>> with some re-loading logic could work well.

maaartinus
  • 44,714
  • 32
  • 161
  • 320