I'm enumerating a ConcurrentDictionary and need to be sure that I don't miss any initial item. In other words, every initial item must be enumerated.

Initial items: all items in the dictionary when the enumeration starts.

The documentation says:

The enumerator returned from the dictionary is safe to use concurrently with reads and writes to the dictionary, however it does not represent a moment-in-time snapshot of the dictionary. The contents exposed through the enumerator may contain modifications made to the dictionary after GetEnumerator was called.

But it is not clear if all initial items are enumerated. So I tested it with the following code:

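using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

public class Program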
{
    static volatile bool enumeratioCompleted;
    static volatile bool enumerationStarted;
    static int itemsAddedInWorkerThread;
    static ConcurrentDictionary<Guid, object> concurrentDic = new ConcurrentDictionary<Guid, object>();

    public static void Main(string[] args)
    {
        var dic = new Dictionary<Guid, object>();
        const int initialItems = 100_000;
        const int workerThreadCount = 4;

        // Start at 0 so that exactly initialItems entries are added.
        for (int i = 0; i < initialItems; i++)
        {
            var key = Guid.NewGuid();
            var value = new object();
            dic.Add(key, value);
            concurrentDic.TryAdd(key, value);
        }

        var workerThreads = new Thread[workerThreadCount];
        for (var i = 0; i < workerThreadCount; i++)
        {
            workerThreads[i] = new Thread(AddItemsToConcurrentDicWhileEnumerating);
            workerThreads[i].Start();
        }
        int enumeratedItems = 0;
        foreach (var kv in concurrentDic)
        {
            if (enumerationStarted == false) enumerationStarted = true;
            enumeratedItems++;
            dic.Remove(kv.Key);
        }
        enumeratioCompleted = true;
        for (var i = 0; i < workerThreadCount; i++)
        {
            workerThreads[i].Join();
        }
        Console.WriteLine($"Initial items {initialItems}");
        Console.WriteLine($"Initial items not enumerated: {dic.Count}");
        Console.WriteLine($"Items enumerated: {enumeratedItems}");
        Console.WriteLine($"Items added in worker thread: {itemsAddedInWorkerThread}");
    }

    static void AddItemsToConcurrentDicWhileEnumerating()
    {
        while (enumerationStarted == false) ;
        while (enumeratioCompleted == false)
        {
            var key = Guid.NewGuid();
            var value = new object();
            concurrentDic.TryAdd(key, value);
            Interlocked.Increment(ref itemsAddedInWorkerThread);
        }
    }
}

It outputs something like the following:

Initial items 100000
Initial items not enumerated: 0
Items enumerated: 108301
Items added in worker thread: 136729

So it seems that all initial items are enumerated. Could you please confirm whether this is guaranteed?

Jesús López

3 Answers

No. Microsoft makes no guarantees regarding the enumeration of the ConcurrentDictionary<K,V> other than that it is "safe". That's it. This collection could (theoretically) always return an empty sequence and still comply with the current state of the specification. For insight, you could check out this GitHub issue. In reality, though, this collection behaves reasonably, just as anyone would expect it to behave. Some of the options you have:

  1. Switch to a normal Dictionary<K,V>, protected with a lock. Good if you update it frequently and enumerate it rarely.
  2. Switch to an exotic ImmutableDictionary<K,V>, and use the ImmutableInterlocked class to update it in a lock-free manner. Good if you update it rarely and enumerate it frequently.
  3. Keep using the ConcurrentDictionary<K,V>, and rely on Microsoft's general reluctance¹ to make changes that could break existing code. You could also write a couple of unit tests, so that you notice such a change as soon as possible and have time to react to it.

¹ These people tend to be progressive when fantasizing about the great changes they will make to their codebase in the future, so they like to publish vague specs that don't commit them to preserving their current implementations. And they tend to be conservative when actual ideas about better implementations emerge, out of fear that there is code out there relying on the current undocumented behavior.
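
Options 1 and 2 could be sketched roughly as follows. This is only a minimal illustration under stated assumptions: the class names LockedRegistry and ImmutableRegistry and their Snapshot methods are hypothetical, and Dictionary.TryAdd assumes .NET Core 2.0 or later. In both cases the caller enumerates a copy or an immutable instance, so every item present when Snapshot is called is seen.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.Immutable;
using System.Linq;
using System.Threading;

// Option 1 (hypothetical wrapper): a plain Dictionary guarded by one lock.
public sealed class LockedRegistry
{
    private readonly object _gate = new object();
    private readonly Dictionary<Guid, object> _items = new Dictionary<Guid, object>();

    public bool TryAdd(Guid key, object value)
    {
        lock (_gate) { return _items.TryAdd(key, value); }
    }

    public bool Remove(Guid key)
    {
        lock (_gate) { return _items.Remove(key); }
    }

    // Copies the contents while holding the lock, so the returned array is a
    // moment-in-time snapshot that callers can enumerate without locking.
    public KeyValuePair<Guid, object>[] Snapshot()
    {
        lock (_gate) { return _items.ToArray(); }
    }
}

// Option 2 (hypothetical wrapper): an ImmutableDictionary swapped atomically.
public sealed class ImmutableRegistry
{
    private ImmutableDictionary<Guid, object> _items =
        ImmutableDictionary<Guid, object>.Empty;

    // ImmutableInterlocked retries internally until the swap succeeds.
    public bool TryAdd(Guid key, object value) =>
        ImmutableInterlocked.TryAdd(ref _items, key, value);

    public bool TryRemove(Guid key) =>
        ImmutableInterlocked.TryRemove(ref _items, key, out _);

    // The returned instance never changes; later writes produce new instances.
    public ImmutableDictionary<Guid, object> Snapshot() => Volatile.Read(ref _items);
}
```

With either class, a foreach over the value returned by Snapshot() is unaffected by writers that run afterwards. The trade-off is the one described in the list: the locked version pays a full copy per enumeration, while the immutable version pays extra allocations per update.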

Theodor Zoulias

I want to use a ConcurrentDictionary to store users in my server program, and I need to use the enumerator to broadcast messages while users may enter or leave the server, so I have the same question.

I reviewed the source code, and I believe the current implementation guarantees that the initial items are enumerated. The comment on the Enumerator class shows the principle behind the enumerator:

// Provides a manually-implemented version of (approximately) this iterator:
VolatileNodeWrapper[] buckets = _tables._buckets;
for (int i = 0; i < buckets.Length; i++)
    for (Node? current = buckets[i]._node; current is not null; current = current._next)
        yield return new KeyValuePair<TKey, TValue>(current._key, current._value);

The enumerator saves a reference to the current _tables._buckets array when it starts, so the enumeration is not affected by the table growing or shrinking. The buckets array holds the head of each linked list; the elements are only set or removed, never moved, so the loop can traverse all the buckets.
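
To make that reasoning concrete, here is a toy model of the idea. It is not the real ConcurrentDictionary code: ToyBuckets is a hypothetical add-only table with a fixed number of buckets and no resizing or removal, and it assumes that a new node is linked in front of its chain. It only illustrates why a walk over a captured buckets array does not skip nodes that were already present when the walk started.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Toy model only (hypothetical): a fixed-size, add-only table of linked lists.
public sealed class ToyBuckets
{
    private sealed class Node
    {
        public readonly int Key;
        public readonly Node Next; // never reassigned once the node is published
        public Node(int key, Node next) { Key = key; Next = next; }
    }

    private readonly Node[] _buckets = new Node[8];
    private readonly object _writeLock = new object();

    public void Add(int key)
    {
        int index = (key & int.MaxValue) % _buckets.Length;
        lock (_writeLock)
        {
            // The new node goes in front of the chain; existing nodes are untouched.
            Volatile.Write(ref _buckets[index], new Node(key, _buckets[index]));
        }
    }

    public IEnumerable<int> Enumerate()
    {
        Node[] buckets = _buckets; // captured once, as in the quoted comment
        for (int i = 0; i < buckets.Length; i++)
            for (Node current = Volatile.Read(ref buckets[i]); current != null; current = current.Next)
                yield return current.Key;
    }
}
```

If keys 0 to 15 are added first and more keys are added while a foreach over Enumerate() is running, the walk still yields all of 0 to 15. Keys added in the meantime may or may not appear, depending on whether they land in a bucket that the walk has not reached yet.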

I hope this answer is helpful for those who want to use the enumerator concurrently with writes but lack confidence in it.

shingo

Hopefully this will make it clearer.

using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

namespace ConsoleApp3
{
    public class Program
    {
        static volatile bool enumeratioCompleted;
        static volatile bool enumerationStarted;
        static int itemsAddedInWorkerThread;
        static ConcurrentDictionary<Guid, object> concurrentDic = new ConcurrentDictionary<Guid, object>();

        public static void Main(string[] args)
        {
            var dic = new Dictionary<Guid, object>();
            const int initialItems = 100_000;
            const int workerThreadCount = 4;

            for (int i = 0; i < initialItems; i++)
            {
                var key = Guid.NewGuid();
                var value = new object();
                dic.Add(key, value);
                concurrentDic.TryAdd(key, value);
            }

            Console.WriteLine($"Initial items {initialItems}");

            var workerThreads = new Thread[workerThreadCount];
            for (var i = 0; i < workerThreadCount; i++)
            {
                workerThreads[i] = new Thread(AddItemsToConcurrentDicWhileEnumerating);
                workerThreads[i].Start(i+1);
            }
            Console.WriteLine($"Number of items in concurrent dictionary right now: {concurrentDic.Count}");
            int itemsremovedduring1stiteration = 0;
            int enumeratedItems = 0;
            foreach (var kv in concurrentDic)
            {
                if (enumerationStarted == false) enumerationStarted = true;

                if (dic.ContainsKey(kv.Key)) // is it an item from our "initial" list?
                {
                    dic.Remove(kv.Key);
                    concurrentDic.TryRemove(kv.Key, out _);

                    itemsremovedduring1stiteration++;
                }

                enumeratedItems++;
            }
            Console.WriteLine($"Items added in worker thread: {itemsAddedInWorkerThread}");
            Console.WriteLine($"Number of items in concurrent dictionary right now: {concurrentDic.Count}");
            Console.WriteLine($"Items removed from concurrent dictionary: {itemsremovedduring1stiteration}");
            Console.WriteLine($"Items enumerated in 1st enumeration: {enumeratedItems}");
            enumeratioCompleted = true;
            for (var i = 0; i < workerThreadCount; i++)
            {
                workerThreads[i].Join();
            }
            Console.WriteLine($"Items added in worker thread: {itemsAddedInWorkerThread}");
            Console.WriteLine($"Items still left in concurrent dictionary: {concurrentDic.Count}");
            int enumeratedItems2nd = 0;
            foreach (var kv in concurrentDic)
            {
                if (!concurrentDic.TryRemove(kv.Key, out _))
                {
                    System.Diagnostics.Debugger.Break();
                }
                else
                    enumeratedItems2nd++;
            }
            Console.WriteLine($"Items enumerated in 2nd enumeration: {enumeratedItems2nd}");
        }

        static void AddItemsToConcurrentDicWhileEnumerating(object data)
        {
            int threadno = (int)data;

            Console.WriteLine($"Adding items from thread {threadno} ...");

            while (enumerationStarted == false) ;
            while (enumeratioCompleted == false)
            {
                var key = Guid.NewGuid();
                var value = new object();
                if (concurrentDic.TryAdd(key, value) == false)
                {
                    System.Diagnostics.Debugger.Break();
                }
                Interlocked.Increment(ref itemsAddedInWorkerThread);
            }
        }
    }
}

Colin Smith
  • I think he wanted to remove things from `dic`, not the `concurrentDic`. He's enumerating over the concurrent dictionary and removing the initial items from `dic`, expecting to see 0 if all initial items were iterated in the concurrent dictionary – asaf92 Oct 03 '21 at 09:56
  • @asaf92 Yes that is what I wanted – Jesús López Oct 03 '21 at 10:59