Gist
In Effective Java (Third Edition), the author gives a performance tip for ConcurrentHashMap: when using existingValue = map.putIfAbsent(key, value), consider first calling existingValue = map.get(key) and skipping putIfAbsent() if the key already exists.
Question
Are performance considerations like the one mentioned by the author documented anywhere?
I believe it is a sufficiently important and fundamental performance consideration to be documented somewhere official. putIfAbsent() already returns the existing value if the key is present, which makes the additional get() seem redundant, so it is not unreasonable that someone unaware of the performance consideration may "refactor" away the get() check.
Edited to clarify the question: I'm not asking why, or whether, the extra get() check is indeed always more performant, but whether such performance considerations are documented somewhere. The API is designed such that, on a plain reading of putIfAbsent() and get() used together as the author suggests, and without knowledge of the performance consideration, the get() seems redundant.
Details
The specific example provided in the book is as follows:
- Assume that we want to implement a method akin to String.intern(), which retrieves the value for a given key in the map and inserts the key-value pair if it does not already exist.
- Then the more performant approach is not to call the provided previousValue = map.putIfAbsent(key, value) directly, but to perform an additional previousValue = map.get(key) check first.
For example, the code immediately below is less efficient:
    // Concurrent canonicalizing map atop ConcurrentMap - not optimal
    private static final ConcurrentMap<String, String> map = new ConcurrentHashMap<>();

    public static String intern(String s) {
        String previousValue = map.putIfAbsent(s, s);
        return previousValue == null ? s : previousValue;
    }
Whereas this code fragment below is more efficient:
    // Concurrent canonicalizing map atop ConcurrentMap - faster!
    public static String intern(String s) {
        String result = map.get(s);
        if (result == null) {
            result = map.putIfAbsent(s, s);
            if (result == null) result = s;
        }
        return result;
    }
The reason provided by the author is that get() is more optimized than putIfAbsent(), which I interpret to mean that an additional get() check is generally worthwhile because it avoids the putIfAbsent() call whenever the key is already present.
I would also assume that the actual performance impact depends on the relative frequency with which new keys are inserted.
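To illustrate that last assumption, here is a minimal sketch (mine, not from the book) that counts how often each variant actually invokes putIfAbsent() in a single-threaded run. The class name InternCallCount, the counter putIfAbsentCalls, and the helper countCalls are hypothetical names introduced purely for illustration. It counts calls only and says nothing about real timings, which would need a proper benchmark.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical demo class: counts putIfAbsent() invocations for both variants.
final class InternCallCount {
    private static final ConcurrentMap<String, String> map = new ConcurrentHashMap<>();

    // Hypothetical counter, added only to make putIfAbsent() calls visible.
    static final AtomicLong putIfAbsentCalls = new AtomicLong();

    // Variant 1: always calls putIfAbsent().
    static String internDirect(String s) {
        putIfAbsentCalls.incrementAndGet();
        String previousValue = map.putIfAbsent(s, s);
        return previousValue == null ? s : previousValue;
    }

    // Variant 2: calls get() first and falls back to putIfAbsent() only on a miss.
    static String internWithGet(String s) {
        String result = map.get(s);
        if (result == null) {
            putIfAbsentCalls.incrementAndGet();
            result = map.putIfAbsent(s, s);
            if (result == null) result = s;
        }
        return result;
    }

    // Interns `lookups` strings drawn from `distinctKeys` distinct values
    // (single-threaded) and returns how many times putIfAbsent() was called.
    static long countCalls(boolean getFirst, int lookups, int distinctKeys) {
        map.clear();
        putIfAbsentCalls.set(0);
        for (int i = 0; i < lookups; i++) {
            String key = "key-" + (i % distinctKeys);
            if (getFirst) {
                internWithGet(key);
            } else {
                internDirect(key);
            }
        }
        return putIfAbsentCalls.get();
    }

    public static void main(String[] args) {
        System.out.println(countCalls(false, 1000, 10));   // prints 1000
        System.out.println(countCalls(true, 1000, 10));    // prints 10
        System.out.println(countCalls(true, 1000, 1000));  // prints 1000
    }
}
```

With 1000 lookups over 10 distinct keys, the direct variant calls putIfAbsent() 1000 times while the get()-first variant calls it only 10 times. When every key is new, the get()-first variant still calls putIfAbsent() 1000 times and additionally pays for a get() on each lookup, which is why I would expect the benefit to depend on how often keys repeat.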