4

Recently, I needed to choose between using SortedDictionary and SortedList, and settled on SortedList.

However, now I discovered that my C# program is slowing to a crawl when performing SortedList.Count, which I check using a function/method called thousands of times.

Usually my program would call the function 10,000 times within 35 ms, but while using SortedList.Count, it slowed to 300-400 ms, basically 10x slower.

I also tried SortedList.Keys.Count, but this reduced my performance another 10 times, to over 3000 ms.

I have only ~5000 keys/objects in SortedList<DateTime, object_name>. I can easily and instantly retrieve data from my sorted list by SortedList[date] (in 35 ms), so I haven't found any problem with the list structure or objects its holding.

Is this performance normal?

What else can I use to obtain the number of records in the list, or just to check that the list is populated? (besides adding a separate tracking flag, which I may do for now)

CORRECTION: Sorry, I'm actually using: ConcurrentDictionary<string, SortedList<DateTime, string>> dict_list = new ConcurrentDictionary<string, SortedList<DateTime, string>>(); And I had various counts in various places, sometimes checking items in the list and other times in ConcurrentDicitonary. So the issue applies to ConcurrentDicitonary and I wrote quick test code to confirm this, which takes 350 ms, without using concurrency. Here is the test with ConcurrentDicitonary, showing 350 ms:

public static void CountTest()
{
    //Create test ConcurrentDictionary
    ConcurrentDictionary<int, string> test_dict = new ConcurrentDictionary<int, string>();
    for (int i = 0; i < 50000; i++)
    {
        test_dict[i] = "ABC";
    }

    //Access .Count property 10,000 times
    int tick_count = Environment.TickCount;
    for (int i = 1; i <= 10000; i++)
    {
        int dict_count = test_dict.Count;
    }

    Console.WriteLine(string.Format("Time: {0} ms", Environment.TickCount - tick_count));
    Console.ReadKey();
}
Kon Rad
  • 174
  • 2
  • 10
  • The actual question then is: why do you *check* the property that often? Don´t do it if you think it takes too long. Store it into a variable instead. – MakePeaceGreatAgain Dec 23 '16 at 08:58
  • Couldn't you use SortedList.Any() if you just want to check if the list is populated? – spersson Dec 23 '16 at 08:59
  • 9
    SortedList.Count just returns the value of private variable, it cannot be slow. Are you sure you are not using LINQs Count() method? – Evk Dec 23 '16 at 08:59
  • 1
    Do a quick check to ensure you are getting the Count property (which should be O(1) according to the docs) and not the Count() method from Linq which could be pretty slow. – Stephan Dec 23 '16 at 09:01
  • I never use LINQ and don't want to use it, specifically because I want to have full control of my program and understand everything its doing. – Kon Rad Dec 23 '16 at 09:02
  • 3
    This is the code of `Count` `public virtual int Count { get { return _size; } }` from [Reference Source](https://referencesource.microsoft.com/#mscorlib/system/collections/sortedlist.cs,650e966119d0cc88). – Alessandro D'Andria Dec 23 '16 at 09:02
  • 2
    Well then the reason of your slow perfomance is elsewhere, not in the Count call. – Evk Dec 23 '16 at 09:03
  • I mean I only use .Count, not .Count()... And the slow performance is bugging me because now I'll have to go through various other parts of the system and hunt for fixes. Could this be a bug in.NET? I'm compiling console app under .NET 4.6.1 – Kon Rad Dec 23 '16 at 09:04
  • OK, I'll triple check and post back. – Kon Rad Dec 23 '16 at 09:05
  • 3
    If you provide verifyable example, maybe we can help. – Evk Dec 23 '16 at 09:05
  • Is that a huge object that you are passing in the `SortedList`? – Prashanth Benny Dec 23 '16 at 09:08
  • 3
    The implementation of `SortedList.Count` is literally `return _size;` so therefore you are not timing what you think you are timing. You will have to provide a small repro. – Matthew Watson Dec 23 '16 at 09:11
  • Oops, indeed, tunnel vision. I'm actually using: ConcurrentDictionary> dict_list = new ConcurrentDictionary>(); And I had various counts in various places, sometimes checking items in the list and other times in ConcurrentDicitonary. So the issue applies to ConcurrentDicitonary and I wrote quick test code to confirm this, which takes 350 ms, without using concurrency. Should I delete and and repost the question? – Kon Rad Dec 23 '16 at 09:44
  • 1
    @KonRad All these classes just return a field. You should delete the question and repost a question that actually reproduces the issue, if there is any. Are you sure it isn't your *testing code* you are timing? – Panagiotis Kanavos Dec 23 '16 at 09:55
  • Thanks everyone. I'm getting a message that I can post a new question every 90 minutes. But for now I added sample code above, which reproduces the issue with ConcurrentDictionary, indeed not SorryListed. – Kon Rad Dec 23 '16 at 10:00
  • 1
    @KonRad you'll have to delete all irrelevant text from the question. Anything before your `Correction` simply causes confusion. – Panagiotis Kanavos Dec 23 '16 at 10:11
  • As for [ConcurrentDictionary.Count](https://github.com/dotnet/corefx/blob/master/src/System.Collections.Concurrent/src/System/Collections/Concurrent/ConcurrentDictionary.cs#L970) the [source](https://github.com/dotnet/corefx/blob/master/src/System.Collections.Concurrent/src/System/Collections/Concurrent/ConcurrentDictionary.cs#L970) shows that it isn't a simple field because taking a count requires blocking multithreaded operations while counting. *Why* is this an issue though? Are you calling `Count` that frequently? – Panagiotis Kanavos Dec 23 '16 at 10:11
  • Are you using `ConcurrentDictionary` to emulate a different collection, perhaps a Cache? A `Count` isn't something people care about when working with dictionaries – Panagiotis Kanavos Dec 23 '16 at 10:12

3 Answers3

10

this article recommends calling this instead:

dictionary.Skip(0).Count()

The count could be invalid as soon as the call from the method returns. If you want to write the count to a log for tracing purposes, for example, you can use alternative methods, such as the lock-free enumerator

Omu
  • 69,856
  • 92
  • 277
  • 407
3

Well, ConcurrentDictionary<TKey, TValue> must work properly with many threads at once, so it needs some synchronization overhead.

Source code for Count property: https://referencesource.microsoft.com/#mscorlib/system/Collections/Concurrent/ConcurrentDictionary.cs,40c23c8c36011417

    public int Count
    {
        [SuppressMessage("Microsoft.Concurrency", "CA8001", Justification = "ConcurrencyCop just doesn't know about these locks")]
        get
        {
            int count = 0;

            int acquiredLocks = 0;
            try
            {
                // Acquire all locks
                AcquireAllLocks(ref acquiredLocks);

                // Compute the count, we allow overflow
                for (int i = 0; i < m_tables.m_countPerLock.Length; i++)
                {
                    count += m_tables.m_countPerLock[i];
                }

            }
            finally
            {
                // Release locks that have been acquired earlier
                ReleaseLocks(0, acquiredLocks);
            }

            return count;
        }
    }

Looks like you need to refactor your existing code. Since you didn't provide any code, we can't tell you what you could optimize.

apocalypse
  • 5,764
  • 9
  • 47
  • 95
  • Well, this is a large system where I'm caching data from multiple sources in `ConcurrentDictionary>`, then each provider is checking if the data is already loaded. But overall I'm using `ConcurrentDictionary` and `SortedList` in variety of ways in various places, understanding concurrency implications. But the slow Count threw me off and I'm looking for a simplest replacement... – Kon Rad Dec 23 '16 at 10:16
  • Note I never had issues with populating and accessing them, thus it seems strange that everything performs great, except for the Count, which I added a few weeks ago and then spend days looking for reasons why my system is slowing to a crawl. I expect others may not even realize that simply using Count may have serious performance implications. – Kon Rad Dec 23 '16 at 10:19
  • @KonRad: I still don't get it. Each provider is calling `Count`, and then if `Count > 0` it tries to load some entires? – apocalypse Dec 23 '16 at 11:14
0

For performance sensitive code I would not recommend to use ConcurrentDictionary.Count property as it has a locking implementation. You could use Interlocked.Increment instead to do the count yourself.

Dogu Arslan
  • 3,292
  • 24
  • 43