20

I saw several quotes about this on the Internet but no official documentation? Can anyone tell me where I can get information about this?

CodeCaster
  • 147,647
  • 23
  • 218
  • 272
Nahum
  • 6,959
  • 12
  • 48
  • 69

6 Answers6

34

This isn’t supposed to be documented since it’s an implementation detail.

For instance, there is more than one implementation of SortedDictionary: there’s Microsoft’s and there’s the Mono implementation.

And the Mono implementation does, in fact, use a red-black tree in its current version (2.10.9). So does the current .NET version (you can find that out by decompiling the code, for instance by using Reflector, ildasm.exe or the built-in IL viewer in MonoDevelop).

However, this is probably going to change in the future since there are actually more efficient implementations (as B trees).

So, again: this information isn’t useful, it’s an implementation detail, and it is going to change with high probability.

Konrad Rudolph
  • 530,221
  • 131
  • 937
  • 1,214
  • 2
    This doesn't answer the question. – Greg Graham May 08 '19 at 15:56
  • 6
    @GregGraham How doesn’t it? I’m clearly explaining what the *current* (at the time of answering) implementations are doing, and why this information shouldn’t be relied on in the future. What else are you missing from the answer? – Konrad Rudolph May 08 '19 at 16:54
6

This is the official documentation from MSDN page;

The SortedDictionary generic class is a binary search tree with O(log n) retrieval, where n is the number of elements in the dictionary


Is SortedDictionery a red-black tree?

Well, I'm not too much fimiliar with red-black tree but I just decompiled SortedDictionary class with dotPeek (which is free) but red-black tree's deletion algorithm and SortedDictionary's Remove() method's code doesn't seems similar. So, my money is for No.

SortedDictionary keeps its keys always sorted. It allows you to avoid sorting the keys on your own. Its lookup performance is slower than Dictionary. It has advantages if you require a sorted lookup table in memory.

enter image description here

Dictionary lookup time:       Close to O(1)
SortedDictionary lookup time: O(log n) 

Check out more details from here.

Soner Gönül
  • 97,193
  • 102
  • 206
  • 364
  • 3
    Not sure what you decompiled but `SortedDictionary` internally just uses `TreeSet` which is an internal class and, yes, is implemented in terms of a red-black tree. – Konrad Rudolph Feb 16 '13 at 13:11
  • @KonradRudolph Hmm, I decompiled `Remove()` method which its used `internal virtual bool DoRemove(T item)` method but it doesn't seems any fimiliar thing with `red-black tree`'s (Which is a binary tree search ) [deletion algorithm](http://en.wikipedia.org/wiki/Binary_search_tree#Deletion). Am I missing here what? – Soner Gönül Feb 16 '13 at 13:20
  • I haven’t actually got access to the .NET implementation at the moment (I’m on Mac) but I’ve checked related resources and the Rotor reference implementation (which of course is *not* official) and I am all but certain that `SortedDictionary` *does* use `TreeSet` which is a red-black tree. – Konrad Rudolph Feb 16 '13 at 13:28
  • What’s more, the MSDN article in fact *guarantees* that (the current implementation of) `SortedDictionary` is a binary search tree so your assessment of the `DoRemove` method must be erroneous. Having said that, it’s weird that the MSDN is actually specific on this point because as I’ve written in my answer, the fact that a *binary* search tree (rather than a generalised search tree) is used is an implementation detail, and it’s bound to change so this info has no place in the MSDN. – Konrad Rudolph Feb 16 '13 at 13:37
  • 1
    Reference Source confirmes that it's a Red-Black tree (at least in 4.5.2); http://referencesource.microsoft.com/#System/compmod/system/collections/generic/sortedset.cs – alexn Feb 20 '15 at 14:42
5

Documentation does seem to guarantee O(log n) for retrieval from a BST. If they are reporting "on average" with respective to possible trees then even non-balancing implementations can claim that. Even if it were a worse case guarantee, this together with being a BST isn't enough to say whether it is or isn't implemented as a red-black trees without resorting to decompiling. It could also be an AVL, splay, or some other balancing variety.

I pulled out dot peek. On The 4.0.0.0 System Assembly. OrderedDictionary uses Treeset which subclasses SortedSet. This does appear likely to be a red-black tree. However, it is not the typical example similar to many examples on the web these implement balancing after insertion or deletion. The implementation is primarily iterative, and the insert appears to fix the colors on the way down instead of after the insertion (top-down - there are a couple papers on this sort of approach). Something similar was up with the removal, but don't have time to verify it. Certainly not something directly comparable.

At the very least, my guess is it should have a similar kind of run-time characteristic. By the time it gets to the insert/deletion point there isn't much it does since it was all done on the way down.

ptthiem
  • 51
  • 1
  • 2
2

From its MSDN page:

The SortedDictionary generic class is a binary search tree with O(log n) retrieval, where n is the number of elements in the dictionary

CodeCaster
  • 147,647
  • 23
  • 218
  • 272
1

You can decompile it (for example with Reflector)... BUT since that is an "implementation detail" I would not rely on it, it could be changed anytime with any update.

Not sure how relevant such an implementation detail is but if you really need a RedBlack tree THEN implement it explicitly... anything else would be "technical debt" / "desaster waiting to happen" IMHO.

Yahia
  • 69,653
  • 9
  • 115
  • 144
  • 1
    Why decompile when Microsoft released .Net as open source? http://referencesource.microsoft.com/#System/compmod/system/collections/generic/sorteddictionary.cs – nivs1978 May 11 '16 at 06:40
0

The real question: "Is SortedDictionary implemented as a balanced tree?" If not then when data is inserted in sorted order the search performance will drop to O(n) just like a sequential Linked List. This will happen if the data is written out to a file (in sorted order) and later read back into an unbalanced tree. It's hard to imagine that Microsoft would not implement this as some sort of balanced tree. They may change how they do it but it doesn't matter much as long as the tree remains fairly well balanced.