I'm experiencing a very strange issue. The background is that we have a mapping between Word ContentControl objects
and a custom object we use to store some information related to the content inside each control. We use a SortedList<ContentControl, OurCustomObject>
to maintain this mapping. The SortedList lets us find the next/previous content control and quickly access the object associated with a given content control.
To set this up, we do something like the following:
    var dictOfObjs = Globals.ThisAddIn.Application.ActiveDocument.ContentControls
        .Cast<ContentControl>()
        .ToDictionary(key => key, elem => new OurCustomObject(elem));
    // Order controls by where their range starts in the document.
    var comparer = Comparer<ContentControl>
        .Create((x, y) => x.Range.Start.CompareTo(y.Range.Start));
    var list = new SortedList<ContentControl, OurCustomObject>(dictOfObjs, comparer);
This seemed to work pretty well, but I recently tried it on a document with ~5000 content controls, and it slowed to an absolute crawl (3+ minutes to instantiate the SortedList).
So that's strange enough, but it gets stranger. I added some logging to figure out what was going on, and found that logging the start position of each ContentControl
before building the list sped it up by a factor of ~60. (Yes, ADDING logging sped it up!) Here is the much faster code:
    var dictOfObjs = Globals.ThisAddIn.Application.ActiveDocument.ContentControls
        .Cast<ContentControl>()
        .ToDictionary(key => key, elem => new OurCustomObject(elem));
    // The only difference: touch every Range.Start once before sorting.
    foreach (var pair in dictOfObjs)
    {
        _logger.Debug("Start: " + pair.Key.Range.Start);
    }
    var comparer = Comparer<ContentControl>
        .Create((x, y) => x.Range.Start.CompareTo(y.Range.Start));
    var list = new SortedList<ContentControl, OurCustomObject>(dictOfObjs, comparer);
The constructor for SortedList calls Array.Sort<TKey, TValue>(keys, values, comparer) on the keys and values of the dictionary. I can't figure out why accessing the Range objects in a loop beforehand would speed it up. Maybe it has something to do with the order in which they are accessed? The foreach loop accesses them in the order they appear in the document, while Array.Sort hops around all over.
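One way to look at the comparer's behaviour without Word in the loop is a small sketch like the one below. It is not the add-in code: `FakeControl`, `StartReads`, `SortSketch`, `BuildDirect` and `BuildCached` are hypothetical names invented for illustration, and the counted `Start` property only stands in for `ContentControl.Range.Start`. It shows that a comparer which reads the position on every comparison touches that property far more often than once per control, and that reading each position once and sorting on the cached integers avoids this.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a Word ContentControl. Every read of Start is
// counted so we can see how often a sort touches the property.
public sealed class FakeControl
{
    private readonly int _start;

    // Total number of Start reads across all instances.
    public static int StartReads;

    public FakeControl(int start) { _start = start; }

    public int Start
    {
        get { StartReads++; return _start; }
    }
}

public static class SortSketch
{
    // Mirrors the shape of the code in the question: the comparer reads
    // Start twice (once per operand) on every comparison made by the sort.
    public static SortedList<FakeControl, string> BuildDirect(IList<FakeControl> controls)
    {
        var dict = controls.ToDictionary(c => c, c => "payload");
        var comparer = Comparer<FakeControl>
            .Create((x, y) => x.Start.CompareTo(y.Start));
        return new SortedList<FakeControl, string>(dict, comparer);
    }

    // Alternative: read Start exactly once per control and sort on the
    // cached integers. Assumes start positions are unique, because
    // ToDictionary throws on duplicate keys.
    public static SortedList<int, FakeControl> BuildCached(IList<FakeControl> controls)
    {
        var byStart = controls.ToDictionary(c => c.Start);
        return new SortedList<int, FakeControl>(byStart);
    }
}
```

Resetting `FakeControl.StartReads` to zero and calling each method on the same list shows the difference: `BuildCached` performs one read per control, while `BuildDirect` performs two reads per comparison, and a comparison sort makes on the order of n log n comparisons. In the real add-in each of those reads is a call into Word through COM interop, so the same pattern would multiply a per-call cost. This does not by itself explain why the logging loop helps, but it suggests keying the list on a cached start position (or caching the positions in a dictionary that the comparer consults) as something worth trying.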
Edit: When I say SortedList, I mean System.Collections.Generic.SortedList<TKey, TValue>. Here is the code for the constructor I'm using:
    public SortedList(IDictionary<TKey, TValue> dictionary, IComparer<TKey> comparer)
        : this((dictionary != null ? dictionary.Count : 0), comparer) {
        if (dictionary==null)
            ThrowHelper.ThrowArgumentNullException(ExceptionArgument.dictionary);
        dictionary.Keys.CopyTo(keys, 0);
        dictionary.Values.CopyTo(values, 0);
        Array.Sort<TKey, TValue>(keys, values, comparer);
        _size = dictionary.Count;
    }