I wanted to discuss a specific use I have of a concurrent map to sense check my logic...
If I use ConcurrentHashMap, I can do the familiar
    private final ConcurrentHashMap<K, Object> map = new ConcurrentHashMap<K, Object>();

    public Object getExampleOne(K key) {
        map.putIfAbsent(key, new Object());   // insert a fresh value only if the key is not already mapped
        return map.get(key);                  // separate read of whatever is mapped now
    }
but I realise a race condition exists: if the item is removed from the map between the putIfAbsent and the get, the method above would return something that no longer exists in the collection. This may or may not be fine, but let's assume that for my use case it's not OK.
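To make the race concrete, here is roughly the interleaving I'm worried about (String keys are used purely for illustration):

    ConcurrentHashMap<String, Object> map = new ConcurrentHashMap<String, Object>();

    // Thread A
    map.putIfAbsent("key", new Object());   // the entry is now present

    // Thread B runs in between
    map.remove("key");                      // the entry is gone again

    // Thread A continues
    Object result = map.get("key");         // null, even though the putIfAbsent call "succeeded"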
What I'd really like is to have the whole thing atomic. So,
    public Object getExampleTwo(K key) {
        return map.putIfAbsent(key, new Object());
    }
but as this expands out to

    if (!map.containsKey(key))
        return map.put(key, value);   [1]
    return map.get(key);

line [1] will return null on first use (i.e. map.put returns the previous value, which for a first-time key is null).
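Just to spell out that null behaviour (again with String keys purely for illustration):

    ConcurrentHashMap<String, Object> map = new ConcurrentHashMap<String, Object>();
    Object first = new Object();

    Object previous = map.putIfAbsent("key", first);
    System.out.println(previous);              // null: there was no previous mapping for "key"

    Object second = new Object();
    previous = map.putIfAbsent("key", second);
    System.out.println(previous == first);     // true: the existing value is returned and second is not stored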
I can't have it return null in this instance, which leaves me with something like:
    public Object getExampleThree(K key) {
        Object object = new Object();
        Object value = map.putIfAbsent(key, object);   // atomically: insert object, or get back the existing value
        if (value == null)
            return object;                             // we won the race, so our object is the one in the map
        return value;                                  // someone else's object was already there
    }
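For what it's worth, the behaviour I'm after is something like this (hypothetical usage; someKey is just a placeholder):

    Object a = getExampleThree(someKey);   // first call: putIfAbsent returns null, so the freshly created object comes back
    Object b = getExampleThree(someKey);   // later call: putIfAbsent returns the object already in the map
    // a and b are never null, and are the same instance as long as nothing removes the entry in between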
So, finally, my question: how do the examples above differ in semantics? Does getExampleThree ensure atomicity like getExampleTwo but avoid the null return correctly? Are there other problems with getExampleThree?
I was hoping for a bit of discussion around the choices. I realise I could use a non-concurrent map and synchronize around clients calling my get method and around a method that removes from the map (something like the sketch below), but that seems to defeat the purpose (the non-blocking nature) of ConcurrentHashMap. Is that my only choice to keep the data accurate?
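A rough sketch of what I mean by the synchronized alternative (the method names are placeholders, and this just assumes a plain HashMap guarded by the enclosing object's lock):

    private final Map<K, Object> map = new HashMap<K, Object>();

    public synchronized Object getSynchronized(K key) {
        Object value = map.get(key);
        if (value == null) {
            value = new Object();
            map.put(key, value);
        }
        return value;                      // always the object currently in the map
    }

    public synchronized void removeSynchronized(K key) {
        map.remove(key);                   // callers of getSynchronized never see the check-then-act interleaved with a removal
    }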
I guess that's a big part of why you'd choose ConcurrentHashMap: that it's visible/up-to-date/accurate at the point you interact with it, but there may be an impact further down the line if old data is going to be a problem...