5

In his talk about Effective Java at 54:15 Joshua Bloch recommends to use get before putIfAbsent in order to improve performance and concurrency. This leads me to the question why this optimization is already not build in like

public V BETTER_putIfAbsent(K key, V value) {
    V result = get(key);
    if (result!=null) return result;
    return ORIGINAL_putIfAbsent(key, value);
}
maaartinus
  • 44,714
  • 32
  • 161
  • 320
  • Your optimization is not threadsafe. – BalusC Apr 16 '11 at 03:23
  • 1
    @BalusC It *is* threadsafe. If result!=null, then I'm done. If it is null, then I do putIfAbsent, which re-checks the value. So in case somebody put the key there between my two calls, it gets detected. – maaartinus Apr 16 '11 at 03:46
  • Yes, right, you don't care anyway if the second value don't get set when both threads get `null`. As to the concrete question, well maybe because he didn't write the class himself and the original author, Doug Lea, didn't immediately think about it. I peeked around on bugs.sun.com, but I couldn't find an enhancement request like that. – BalusC Apr 16 '11 at 04:17
  • Doug Lea never makes mistakes. – irreputable Apr 16 '11 at 05:20
  • just a note: if your map happens to have either its keys or its values (or both) that are primitives, then there's a **very** simple optimization that will own the default Java API big times. Use Trove. Trove is amazing. I use it all the time and its just *that* much better than falling for the *"zomg-everything-is-an-object-we-created-immutable-wrapper-classes-because-we-think-we-re-smart"* mentality. :) – SyntaxT3rr0r Apr 16 '11 at 11:29

2 Answers2

3

I would guess that it is because performance depends on the usage pattern. For a map where most putIfAbsent calls would succeed, then guarding it with a get will be slower. For a map where putIfAbsent will often fail, then guarding it with a get will be faster.

By not making the optimization for the common failure case, you are free to choose which method is faster for your map.

In the example from the video, putIfAbsent will usually fail, so it is faster to guard it with a get.

sbridges
  • 24,960
  • 4
  • 64
  • 71
1

This adds a double checked locking, the transactional semantics remains the same; so it is not wrong.

Whether it is actually an optimization depends on usage. We are always tempted to check for special cases that we know cheaper solutions exist

if A
    cheapSolutionForA();
else
    genericSolution();

This may work, or not - if A is rarely true, the additional check costs more than it saves. (and when A indeed is true on occasion, it can disrupt CPU branch prediction, it could have been cheaper to always go with the generic solution even when A=true)

In Joshua's example, A indeed is true frequently. He must be requesting intern string for the same string(same in value, not in identity) many times, therefore in most calls the map already has the key.

If every call to intern() gets a different string, then the map never has the key, and his optimization backfires - the "optimization" costs more time, and saves none.

Of course, when it comes to string intern, the 1st case is more realistic in practice .

In general though, putIfAbsent() cannot predict how it's being used, so it's unwise to include this special case "optimization" inside it. In many use cases, contention is low, the map most likely doesn't have the key when putIfAbsent is called, it would be wrong in these cases if the "optimzation" is built in.

irreputable
  • 44,725
  • 9
  • 65
  • 93
  • I must have been very tired so I didn't get it myself. Your explanation is clear, but your pseudo-code is not quite right. It would be actually always favorable to try it as you wrote it: Whenever A holds, we do the cheap solution and are done. What I described corresponds with trying the cheap solution and if it fails, doing the generic one, too. – maaartinus Apr 17 '11 at 18:36