2

I have a doubt with the concurrent dictionary in C#.

In another question I was asked how to have a concurrent dictionary with a hashset as value, but it isn't a good idea to work with a hashset, it is better to use a concurrent dictionary as value. So the solution that I get was this:

var myDic = new ConcurrentDictionary<long, ConcurrentDictionary<int, byte>>();
myDic.AddOrUpdate(key, 
    _ => new ConcurrentDictionary<int, byte>(new[] {new KeyValuePair<int, byte>(element, 0)}),
    (_, oldValue) => {
        oldValue.TryAdd(element, 0);
        return oldValue;
    });

Suppose that I have two threads, where "element" is 1 in the thread A and 2 in the thread B.

My doubt is if this is thread safe. I can be wrong but I think that the concurrent dictionary works in this way:

Thread A: try to insert element 1 for key 1. The key 1 doesn't exist, so it try to insert the key 1 with the concurrent dictionary ConcurrentDictionary<int, byte>(new[] {new KeyValuePair<int, byte>(1, 0).

Thread B: tries to insert the item 2 in the dictionary of key 1. Thread A is still adding the new key/value, Thread B thinks that key 1 doesn't exists, so try to add the value ConcurrentDictionary<int, byte>(new[] {new KeyValuePair<int, byte>(2, 0) to the key 1.

Thread A finishes to insert the key/value pair successfully.

Thread B tries to finish, but now it the key 1 exists because thread A inserted the key 1. So the thread B can't insert the key/value.

So what happen? The work of thread B is discard so I will only have one item in the concurrent dictionary for the key 1? Or perhaps thread B enters in the updateValueFactory and add the item 2 to the dictionary?

Theodor Zoulias
  • 34,835
  • 7
  • 69
  • 104
Álvaro García
  • 18,114
  • 30
  • 102
  • 193

2 Answers2

5

AddOrUpdate is specifically designed to handle the scenario you described; if it could not handle it gracefully, it would be useless.

When thread B tries to add its computed value, it will fail because the key already exists. It will then automatically try again, at which point it will perform an update instead of an add. Specifically, it will update the value produced by thread A. This is a form of optimistic concurrency: the algorithm assumes it will succeed, so it optimizes for that outcome, but it has a fallback plan in case it fails.

Note, however, that the optimistically concurrent nature of this method means that your addValueFactory and updateValueFactory may be both be called; it's not strictly one or the other. In your hypothetical scenario, thread B would first call into addValueFactory and, because the add fails, later call into updateValueFactory. In cases of racing updates, updateValueFactory may called multiple times before an update finally succeeds.

Mike Strobel
  • 25,075
  • 57
  • 69
  • 3
    Regarding the details (specifically "automatically try again" and the fact that both factories may be called), the source for ConcurrentDictionary.AddOrUpdate() is pretty readable and interesting to take a look at: http://referencesource.microsoft.com/#mscorlib/system/Collections/Concurrent/ConcurrentDictionary.cs,e5f450835d8600ed. The factory methods are called outside of the lock in order to minimize the amount of time spent inside of locks, and it's all in a while loop so that multiple attempts can be made until one succeeds in the case of concurrency. – nlawalker Feb 07 '18 at 18:03
  • Yes, that is one of the ways in which the algorithm is optimistic: it eagerly computes the value it _thinks_ it will need, without acquiring any shared resources to do so. Only when it fails will it try again. And indeed, the sources are worth exploring. – Mike Strobel Feb 07 '18 at 18:05
2

The way you use the ConcurrentDictionary class is brittle. The AddOrUpdate is intended for replacing the value of a key with another value, not for modifying existing values, in case the values are mutable objects. And this is exactly what you are doing inside the updateValueFactory delegate:

(_, oldValue) =>
{
    oldValue.TryAdd(element, 0);
    return oldValue;
}

The oldValue is a ConcurrentDictionary<int, byte>, and is mutated by invoking its TryAdd method. This invocation is not synchronized, it may occur concurrently with an invocation from another thread, and may even be invoked multiple times by each thread. From the documentation:

However, the addValueFactory and updateValueFactory delegates are called outside the locks to avoid the problems that can arise from executing unknown code under a lock. Therefore, AddOrUpdate is not atomic with regards to all other operations on the ConcurrentDictionary<TKey,TValue> class.

Now it could be possible that this specific usage is accidentally thread-safe, but personally I would avoid using a ConcurrentDictionary like that. It looks like a bug waiting to happen.

Here is how you could rewrite your code, to make it less error-prone, and also clearer regarding its intentions:

var innerDic = myDic.GetOrAdd(key, _ => new ConcurrentDictionary<int, byte>());
innerDic.TryAdd(element, 0);
Theodor Zoulias
  • 34,835
  • 7
  • 69
  • 104