14

Let's say I have a concurrent map with collections as value:

Map<Integer, List<Integer> map = new ConcurrentHashMap<>();
map.putIfAbsent(8, new ArrayList<>());

and I update the value as follows:

map.computeIfPresent(8, (i, c) -> {
    c.add(5);
    return c;
});

I know that computeIfPresent entire method invocation is performed atomically. However, considering this map is accessed by multiple threads concurrently, I'm a little bit concerned about data visibility of the modifications done to the underlying collection. In this case will the value 5 be seen in the list after calling map.get

My question is will change to the list be visible in other threads upon calling map.get if changes are performed within computeIfPresent method call.

Please note that I am aware that changes to the list will not be visible if I were to take reference to the list before doing the update operation. I am unsure if the changes to the list will be visible if I take reference to the list (by calling map.get) after the update operation.

I am unsure how to interpret the docs, but it seems to me that happens-before relationship will guarantee visibility of the changes to the underlying collection in this particular case

More formally, an update operation for a given key bears a happens-before relation with any (non-null) retrieval for that key reporting the updated value

Marin Veršić
  • 404
  • 4
  • 10

3 Answers3

4

The fact that that method is documented to be atomic, does not mean much about visibility (unless that is part of the documentation). For example to make this simpler:

// some shared data
private List<Integer> list = new ArrayList<>();

public synchronized void addToList(List<Integer> x){
     list.addAll(x);
}

public /* no synchronized */ List<Integer> getList(){
     return list;
}

We can say that addToList is indeed atomic, only one thread at a time can call it. But once a certain thread calls getList - there is simply no guarantee about visibility (because for that to be established, it has to happens on the same lock ). So visibility is something happens before is concerned with and computeIfPresent documentation does not say anything about that, at all.

Instead the class documentation says:

Retrieval operations (including get) generally do not block, so may overlap with update operations (including put and remove).

The key point here is obviously overlap, so some other threads calling get (thus getting a hold of that List), can see that List in some state; not necessarily a state where computeIfPresent started (before you actually called get). Be sure to read further to understand what that some actually might mean.

And now to the trickiest part of that documentation:

Retrievals reflect the results of the most recently completed update operations holding upon their onset. More formally, an update operation for a given key bears a happens-before relation with any (non-null) retrieval for that key reporting the updated value.

Read that sentence about completed again, what it says is that the only thing you can read when a thread does get is the last completed state that List was in. And now the next sentence says that there is a happens before established between two actions.

Think about it, a happens-before is established between two subsequent actions (like in the synchronized example above); so internally when you update a Key there could be a volatile written signaling that update has finished (I am pretty sure it's not done this way, just an example). For the happens before to actually work, get has to read that volatile and see the state that was written to it; if it sees that state it means that happens before has been established; and I guess that by some other technique this is actually enforced.

So to answer your question, all the threads calling get will see the last completed action that happened on that key; in your case if you can guarantee that order, I'd say, yes, they will be visible.

Eugene
  • 117,005
  • 15
  • 201
  • 306
  • if I wasn't clear in my question, I can guarantee that `map.get` will be called after `map.computeIfPresent`. So from your answer I conclude that changes to the underlying list will be visible. I will try to emphasize that in the question – Marin Veršić Sep 30 '18 at 20:30
  • @MarinVeršić I've reworded a bit and answered. whatever you do leave this question un-answered for some time, I've triggered one of the people I respect here most, and I sure hope that tomorrow we will get a hell of a good answer. – Eugene Sep 30 '18 at 21:39
  • wow, I really appreciate the effort. I hope I've given enough info to understand the question – Marin Veršić Sep 30 '18 at 21:42
  • @MarinVeršić keep in mind that “after” means being subsequent in [Synchronization Order](https://docs.oracle.com/javase/specs/jls/se11/html/jls-17.html#jls-17.4.4). E.g. slowing down a thread via `Thread.sleep` or counting loops, as well as exchanging information via shared variables in a data race do not count. On the other hand, any action which is truly *after* the other in the synchronization order, is already sufficient for establishing a *happens-before* relationship on its own ad doesn’t even need guarantees from the map. – Holger Oct 01 '18 at 07:29
  • @Holger what about if we had `computeIfAbsent(list, c) -> c.addAll(list)`. In my understanding since the documentation says *Retrievals reflect the results of the most recently completed update operations holding upon their onset*, when some thread calls `get` is can't see some partial update from that `addAll` - it either sees all elements being added from `addAll` or none. Is my understand correct? – Eugene Oct 01 '18 at 08:25
  • @Eugene that’s not valid syntax and their’s no point trying to perform an `addAll` on an *absent* mapping, so it’s not clear what you are aiming at. Especially, as you already wrote in your answer that `get` may overlap with update operations… – Holger Oct 01 '18 at 08:44
  • @Holger shoot! that was suppose to be `computeIfPresent`... I don't understand the guarantees given by that `completed` remark taken with `overlap`. On one hand `get` can see *some intermediate* results based on `overlap`; but than what is `completed` suppose to offer – Eugene Oct 01 '18 at 08:46
  • 1
    @Eugene consider `map.computeIfPresent("key", (key,list) -> { list = new ArrayList<>(list); list.add("foo"); list.add("bar"); return list; });`. Since `ArrayList` is not thread safe, this still relies on safe publishing and CHM provides you the necessary guarantees, i.e. the new instance is not accessed, even if `get` operations overlap, but after completion, there is a happen-before relationship to each subsequent retrieval “reporting the updated value”. In other words, if you see the new reference, it will be correctly populated. – Holger Oct 01 '18 at 09:01
  • @Holger I think this makes a bit of sense now. As I take it `get` can overlap with an `update`, but it can't see an intermediate result (like let's say only `foo` being added *without* `bar`); it either sees both or none being added. If I have no idea what happened *before* `get`, all I can say is that *some* update operation happened for sure; if I *can* guarantee that `computeIfPresent` happened before my `get` - I will see the List containing both `foo` and `bar` for sure. Right? I wish I could ping you less, but SO is full of wrong answers (or incomplete) :( – Eugene Oct 01 '18 at 09:59
  • 1
    @Eugene the crucial point is that in my example, an new list instance is created which can only be seen when the update has been completed. An overlapping `get` will still see the old reference, to the object which is not touched. – Holger Oct 01 '18 at 10:37
  • 1
    @Holger what if that wasn't the case? What if I would update the value in place, putting `foo` and `bar` into it? – Eugene Oct 01 '18 at 10:39
2

c.add(5) is not thread safe, the internal state of the c is not protected by the map.

The exact way of making individual values and insert-use-remove combinations thread safe and race condition free depends on the usage pattern (synchronized wrapper, copy-on-write, lock free queue, etc).

bobah
  • 18,364
  • 2
  • 37
  • 70
  • and how would you comment this from the docs: `More formally, an update operation for a given key bears a happens-before relation with any (non-null) retrieval for that key reporting the updated value`? If there is a happens-before relation then effects of the change made in the update function modifying the list should be visible to other threads after the operation is finished? If, after the update operation is finished I call `map.get`, shouldn't I get a reference to the updated list? – Marin Veršić Sep 30 '18 at 19:40
  • @MarinVeršić - You do not necessarily have to touch the internal state of the value _while_ retrieving it (when it is protected by the map), you can do it after, from several threads concurrently. Map only serializes operations with itself, not with the objects it stores. – bobah Sep 30 '18 at 20:05
  • @boboah I am only interested in this specific situation where I take the reference to the list from the map with `map.get` strictly after the update operation was performed within the `map.computeIfPresent` method call. So you would still say that visibility will not be guaranteed even in this case? – Marin Veršić Sep 30 '18 at 20:09
  • 1
    For that one thread there is definitely happens-before, but not for any another thread having the list ref already cached. Also I am almost sure gets are serialized with writes (easy to check - make write last long using a future and see if getter thread also hangs). – bobah Sep 30 '18 at 20:17
  • great, I can guarantee that no thread will have a reference to the list before the update operation is performed. – Marin Veršić Sep 30 '18 at 20:20
  • @MarinVeršić - and also that an update operation is not performed once they have that reference. Reading `ArrayList` while something is writing to it is also not thread safe. – bobah Oct 01 '18 at 06:59
1

To clarify your question:

You're providing some external guarantee such that Map.computeIfPresent() is called before Map.get().

You haven't stated how you're doing this, but let's say you're doing it by using something with happens-before semantics provided by the JVM. If this is the case then that alone guarantees List.add() is visible to the thread calling Map.get() simply by association of the happens-before relationship.

Now to answer the question you're actually asking: As you've noted, there's a happens-before relation between update operation ConcurrentHashMap.computeIfPresent() and the subsequent calling of the access method ConcurrentMap.get(). And naturally, there's a happens-before relation between List.add() and the end of ConcurrentHashMap.computeIfPresent().

Put together, answer is yes.

There's a guarantee the other thread will see 5 in the List obtained through Map.get(), provided you're guaranteeing Map.get() is actually called after computeIfPresent() ends (as stated in the question). If the latter guarantee goes foul and Map.get() is somehow called before computeIfPresent() ends, there's no guarantees to what the other thread will see since ArrayList is not thread-safe.

antak
  • 19,481
  • 9
  • 72
  • 80
  • in such a case what does that *happens-before* from documentation mean at all(that the OP has shown)? And how come that is not an update operation? – Eugene Oct 01 '18 at 03:05
  • @Eugene You're right, it's an update operation. I was thinking of something else entirely. Well that screws up the premise of my answer... – antak Oct 01 '18 at 03:16
  • Editted: backflip on conclusion – antak Oct 01 '18 at 03:44
  • @antak yes I can guarantee that there will be happens-before relation between update operation ConcurrentHashMap.computeIfPresent() and the access. This is guaranteed by the fact that other thread will call `ConcurrentMap.remove`(not `ConcurrentMap.get`) which would mean that update is either performed (5 is added) or it is skipped if the map entry is removed – Marin Veršić Oct 01 '18 at 08:15
  • @antak *there's no guarantees to what the other thread will see since ArrayList is not thread-safe* how about the fact that documentation says that *Retrievals reflect the results of the most recently completed update operations holding upon their onset* - completed being the keyword here. So in my understanding if we had something like `computeIfPresent(list, c) -> c.addAll(list)`, when you call `get` you can't see some intermediary state, like let's say a List with only some partial adding from `addAll`, if I make sense – Eugene Oct 01 '18 at 08:22
  • @Eugene documentation says that: ` The entire method invocation is performed atomically`, so I would humbly conclude that if the result of the update is visible, it should be entirely visible – Marin Veršić Oct 01 '18 at 08:40
  • @MarinVeršić to me, atomic does not mean yet visible without a happens before guarantee; but your conclusion either way seems correct to me; and as such this answer needs a bit rewording to make it correct – Eugene Oct 01 '18 at 08:42
  • @Eugene What you're saying is true and is valid if `Map.get()` is called after `computeIfPresent()` completes. However, here's a thought experiment on what happens if `Map.get()` was called before that: In the most the most extreme case, `Map.get()` could be called before `computeIfPresent()` even starts. In that case, operations are no different to holding onto a reference to the List somewhere (independent of the ConcurrentMap) and performing unsynchronized access on it. This is because ... – antak Oct 01 '18 at 08:44
  • @antak please note that I'm actually using `map.remove` instead of `map.get` as stated in the previous comment – Marin Veršić Oct 01 '18 at 08:48
  • ... This is because, even though one thread may be `add()`ing inside (and hence *happens-before* the completion of) `computeIfPresent()`, that itself doesn't provide synchronization against the List that was retrieved out of the ConcurrentMap much earlier on. Specifically, the clause cited only applies if `Map.get()` returned the List after `computeIfPresent()`, which was in turn was after `List.add()`, thereby competing the List.add() -> Map.get() happens-before relation by association. – antak Oct 01 '18 at 08:51
  • @MarinVeršić you should have written right in your question that you are talking about `remove`, as `remove` is not a retrieval operation, but an *update* operation. Since update operations are atomic, but retrieval operations are not, it makes a fundamental difference. – Holger Oct 01 '18 at 08:51
  • @Holger I see your point, my mistake. I did state that I can guarantee `map.get` will be called after the update operation, but I didn't provide you with the mechanism for doing so. Would you like me to update the original question? – Marin Veršić Oct 01 '18 at 08:57
  • @MarinVeršić Synchronization with `ConcurrentMap.remove()` [is somewhat involved](https://stackoverflow.com/questions/39341742/does-concurrentmap-remove-provide-a-happens-before-edge-before-get-returns-n) and isn't a topic I feel I could give commentary on in the comments section of this answer. Maybe if you opened another question someone more knowledgeable could offer their expertise. – antak Oct 01 '18 at 09:07