10

Back to concurrency. By now it is clear that for the double checked locking to work the variable needs to be declared as volatile. But then what if double checked locking is used as below.

class Test<A, B> {

    private final Map<A, B> map = new HashMap<>();

    public B fetch(A key, Function<A, B> loader) {
        B value = map.get(key);
        if (value == null) {
            synchronized (this) {
                value = map.get(key);
                if (value == null) {
                    value = loader.apply(key);
                    map.put(key, value);
                }
            }
        }
        return value;
    }

}

Why does it really have to be a ConcurrentHashMap and not a regular HashMap? All map modification is done within the synchronized block and the code doesn't use iterators so technically there should be no "concurrent modification" problems.

Please avoid suggesting the use of putIfAbsent/computeIfAbsent as I am asking about the concept and not the use of API :) unless using this API contributes to HashMap vs ConcurrentHashMap subject.

Update 2016-12-30

This question was answered by a comment below by Holger "HashMap.get doesn’t modify the structure, but your invocation of put does. Since there is an invocation of get outside of the synchronized block, it can see an incomplete state of a put operation happening concurrently." Thanks!

oᴉɹǝɥɔ
  • 1,796
  • 1
  • 18
  • 31
  • 1
    Because the map will resize after some puts and it will certainly break your unlocked get calls. – Thomas Jungblut Oct 26 '15 at 16:14
  • 5
    putIfAbsent/computeIfAbsent are atomic and not just for convenience – Sleiman Jneidi Oct 26 '15 at 16:14
  • Thomas, could you elaborate on "break get calls" please. I have only found explicit references to concurrent puts and iterators in the docs and when searching for "hashmap concurrent access" – oᴉɹǝɥɔ Oct 26 '15 at 16:32
  • The main problem is that `B value = map.get(key)` could return a valid reference (non null) pointing to an inconsistent object (not properly published). I don't think it could "break" the map (if you completely remove the synchronized block then yes your map could be broken)... – assylias Oct 26 '15 at 16:38
  • @assylias When looking at Java 8 HashMap source I don't see how this is possible. However setting the concrete implementation aside and speaking conceptually I can see how this could be a valid point – oᴉɹǝɥɔ Oct 26 '15 at 16:45
  • @cherio: you can’t see it by looking at the source code. Multithreading without proper synchronization isn’t that easy. – Holger Oct 27 '15 at 23:25

1 Answers1

19

This question is muddled on so many counts that its hard to answer.

If this code is only ever called from a single thread, then you're making it too complicated; you don't need any synchronization. But clearly that's not your intention.

So, multiple threads will call the fetch method, which delegates to HashMap.get() without any synchronization. HashMap is not thread-safe. Bam, end of story. Doesn't even matter if you're trying to simulate double-checked locking; the reality is that calling get() and put() on a map will manipulate the internal mutable data structures of the HashMap, without consistent synchronization on all code paths, and since you can be calling these concurrently from multiple threads, you're already dead.

(Also, you probably think that HashMap.get() is a pure read operation, but that's wrong too. What if the HashMap is actually a LinkedHashMap (which is a subclass of HashMap.) LinkedHashMap.get() will update the access order, which involves writing to internal data structures -- here, concurrently without synchronization. But even if get() is doing no writing, your code here is still broken.)

Rule of thumb: when you think you have a clever trick that lets you avoid synchronizing, you're almost certainly wrong.

Brian Goetz
  • 90,105
  • 23
  • 150
  • 161
  • would the second check value==null be discarded by the compiler? – Sleiman Jneidi Oct 26 '15 at 16:56
  • Thanks for your answer. It not just explains the problem but also gives a valuable information about implementation details of `get()` method. – AlexR Oct 26 '15 at 16:56
  • Brian, I agree with the first half of the answer. The contract is that HashMap is not synchronized and should be used as such. The second half is bogus because the question lists concrete classes. And since you mentioned the implementation then according to the supplied Java source HashMap.get doesn't modify the structure in the current Java7/8 implementation. And again, I agree to the first half and I would have accepted it as the answer if there were an authoritative reference covering HashMap.get – oᴉɹǝɥɔ Oct 26 '15 at 17:12
  • 3
    @cherio: `HashMap.get` doesn’t modify the structure, but your invocation of `put` does. Since there is an invocation of `get` outside of the `synchronized` block, it can see an incomplete state of a `put` operation happening concurrently. This errant read can cause a lot of counter-intuitive behavior. – Holger Oct 27 '15 at 23:33
  • @cherio I don't think you understand yet. *Even if* `get()` were a pure read operation, it would *still not* be safe to call without the lock held. If the map is protected by a lock, then *all* access to the map must be with that lock held. – Brian Goetz Oct 31 '15 at 21:26