This seems like a simple question, but as the other (problematic) answers show, the solution is not really trivial.
Problems with existing answers
As it stands, both proposed solutions result in the creation of some kind of dictionary where each time you enumerate the IEnumerable<Contact> at any given key, the filtered IEnumerable<Contact> is recreated from scratch by enumerating and filtering the original collection. Essentially, what you're storing in the dictionary is the logic to get the desired filtered Contact collections, not the actual collections.
As a result you'll be enumerating the original IEnumerable<Contact> over and over. This is dangerous from a thread-safety standpoint (several threads can end up enumerating the same underlying source at once), and even if it works there is no benefit to doing this, only overhead.
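To make the difference concrete, here is a minimal single-threaded sketch that counts how often the source is walked when the dictionary holds a deferred query versus a materialised array. The Contact class, the sample data, and the counter are hypothetical and exist only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the question's Contact type.
public sealed class Contact
{
    public Contact(string companyId, string name)
    {
        COMPANY_ID = companyId;
        Name = name;
    }

    public string COMPANY_ID { get; }
    public string Name { get; }
}

public static class DeferredVsMaterialised
{
    // Number of times the source sequence has been walked from the start.
    // Plain static field: this demo is single-threaded on purpose.
    public static int SourceEnumerations;

    // Hypothetical source; each fresh enumeration bumps the counter.
    public static IEnumerable<Contact> GetAllContacts()
    {
        SourceEnumerations++;
        yield return new Contact("A", "Ann");
        yield return new Contact("B", "Bob");
        yield return new Contact("A", "Alf");
    }

    // Stores a query in the dictionary, then enumerates the value twice.
    public static int CountSourceWalksDeferred()
    {
        SourceEnumerations = 0;
        IEnumerable<Contact> source = GetAllContacts();
        var dict = new Dictionary<string, IEnumerable<Contact>>
        {
            ["A"] = source.Where(c => c.COMPANY_ID == "A") // nothing runs yet
        };
        dict["A"].Count(); // walks the source
        dict["A"].Count(); // walks the source again
        return SourceEnumerations;
    }

    // Stores an array in the dictionary, then enumerates the value twice.
    public static int CountSourceWalksMaterialised()
    {
        SourceEnumerations = 0;
        IEnumerable<Contact> source = GetAllContacts();
        var dict = new Dictionary<string, Contact[]>
        {
            ["A"] = source.Where(c => c.COMPANY_ID == "A").ToArray() // walks the source here, once
        };
        dict["A"].Count(); // reads the array, source untouched
        dict["A"].Count();
        return SourceEnumerations;
    }
}
```

In this sketch the deferred version walks the source once per enumeration of the stored value, while the materialised version walks it only when the dictionary is built.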
Proposed solution
You are right that the best out-of-the-box thread-safe alternative to Lookup/ILookup<TKey, TElement> appears to be ConcurrentDictionary<TKey, TValue> where TValue implements IEnumerable<Contact>. It offers a superset of lookup functionality and is thread-safe if you build it correctly. There is no ready-to-use extension method for this in the Base Class Library, so you can just roll your own implementation:
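    // Requires: using System; using System.Collections.Concurrent;
    //           using System.Collections.Generic; using System.Linq;
    IEnumerable<Contact> contacts = GetAllContacts();
    ConcurrentDictionary<string, IReadOnlyList<Contact>> dict = new ConcurrentDictionary<string, IReadOnlyList<Contact>>();
    foreach (IGrouping<string, Contact> group in contacts.GroupBy(c => c.COMPANY_ID))
    {
        // ToArray() materialises the group, so the dictionary holds data, not a query.
        if (!dict.TryAdd(group.Key, group.ToArray()))
        {
            throw new InvalidOperationException("Key already added.");
        }
    }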
This looks very similar to what others have offered, but with one important difference: my dictionary's TValue is a materialised collection (specifically Contact[] posing as IReadOnlyList<Contact>). It does not get rebuilt from scratch every time you pull it out of the dictionary and enumerate it.
Oh, and I also enumerate the source IEnumerable<Contact> only once, ever. Not really life-changing, but a nice touch.
You could still use ConcurrentDictionary<string, IEnumerable<Contact>> as your dictionary type (you could substitute the dictionary type in my example above and it will still compile and work as expected) - just be sure that you only add materialised and, preferably, immutable collections to the dictionary as you're building it.
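If you need this in more than one place, the loop above could be wrapped in an extension method. The following is only a sketch: the name ToConcurrentLookup and the class around it are made up for this example and are not part of the Base Class Library.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;

public static class ConcurrentLookupExtensions
{
    // Hypothetical helper: groups the source by key and stores each group
    // as a materialised array exposed through IReadOnlyList<TSource>.
    public static ConcurrentDictionary<TKey, IReadOnlyList<TSource>> ToConcurrentLookup<TSource, TKey>(
        this IEnumerable<TSource> source,
        Func<TSource, TKey> keySelector)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (keySelector == null) throw new ArgumentNullException(nameof(keySelector));

        var dict = new ConcurrentDictionary<TKey, IReadOnlyList<TSource>>();

        // GroupBy walks the source once; its keys are distinct, so TryAdd
        // is only expected to fail if something else is wrong.
        // Note: a null key would make ConcurrentDictionary throw.
        foreach (IGrouping<TKey, TSource> group in source.GroupBy(keySelector))
        {
            if (!dict.TryAdd(group.Key, group.ToArray()))
            {
                throw new InvalidOperationException("Key already added.");
            }
        }

        return dict;
    }
}
```

With that in place, building the lookup for the question's data would read something like GetAllContacts().ToConcurrentLookup(c => c.COMPANY_ID).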
Choosing your TValue type: alternatives to IReadOnlyList<T> (beyond the scope of the original question)
IReadOnlyList<T> is the most generic general-purpose quasi-immutable collection interface I could think of (apart from IReadOnlyCollection<T>, obviously) that conveys to the callers that the collection is materialised and unlikely to change in the future.
If I were using this in my own code I would actually use Contact[] as my dictionary's TValue for any private and internal calls (forgoing the comfort of "read-only" for perf reasons). For any public APIs I would stick with IReadOnlyList<T> or possibly ReadOnlyCollection<T> to emphasise the read-only aspect of the TValue collection.
If taking an external dependency is a viable option, you could also add Microsoft's System.Collections.Immutable NuGet package to your project and use an ImmutableDictionary<string, ImmutableArray<Contact>> to store your lookup. ImmutableDictionary<TKey, TValue> is an immutable thread-safe dictionary. ImmutableArray<T> is a lightweight array wrapper which has strong immutability guarantees and also has solid performance characteristics achieved through a struct enumerator and re-implementations of certain LINQ methods which avoid enumerator allocations altogether.
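A minimal sketch of the immutable variant, assuming the System.Collections.Immutable package is referenced. The Contact class and the ImmutableLookupSketch wrapper are hypothetical; ToImmutableDictionary and ToImmutableArray are the package's own extension methods.

```csharp
using System.Collections.Generic;
using System.Collections.Immutable;
using System.Linq;

// Hypothetical stand-in for the question's Contact type.
public sealed class Contact
{
    public Contact(string companyId, string name)
    {
        COMPANY_ID = companyId;
        Name = name;
    }

    public string COMPANY_ID { get; }
    public string Name { get; }
}

public static class ImmutableLookupSketch
{
    // Groups contacts by company and freezes both the dictionary and
    // each group, so nothing can be changed after construction.
    public static ImmutableDictionary<string, ImmutableArray<Contact>> Build(
        IEnumerable<Contact> contacts)
    {
        return contacts
            .GroupBy(c => c.COMPANY_ID)
            .ToImmutableDictionary(g => g.Key, g => g.ToImmutableArray());
    }
}
```

Because the result can never change, it can be handed to any number of reader threads; the trade-off is that adding contacts later means building a new dictionary.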
List<T> would be a poor choice for TValue due to a) its mutability and b) its tendency to allocate internal buffer arrays whose length is greater than List<T>.Count (unless you explicitly call List<T>.TrimExcess). When you stash stuff in a dictionary, there's a good chance it'll stay alive for a while, so allocating memory you're not really going to use (like List<T> does) is not a great idea.
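The spare capacity is easy to observe for yourself. The sketch below (class and method names are made up for illustration) compares a list's Capacity with its Count and with the length of the array ToArray produces. The exact capacity depends on the runtime's growth policy, so no specific numbers are claimed here.

```csharp
using System;
using System.Collections.Generic;

public static class ListCapacitySketch
{
    // Returns how many unused slots the list's internal buffer holds
    // after adding itemCount elements one at a time.
    public static int SpareSlots(int itemCount)
    {
        var list = new List<int>();
        for (int i = 0; i < itemCount; i++)
        {
            list.Add(i);
        }

        // Capacity is the buffer length; it is never smaller than Count.
        return list.Capacity - list.Count;
    }

    // Returns the length of the array ToArray() produces for the same
    // data: always exactly the number of elements, with no spare slots.
    public static int ArrayLength(int itemCount)
    {
        var list = new List<int>();
        for (int i = 0; i < itemCount; i++)
        {
            list.Add(i);
        }

        return list.ToArray().Length;
    }
}
```

Printing SpareSlots(5) next to ArrayLength(5) shows the gap, if any, on your runtime.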
EDIT
Now, after all this, I have to add: .NET's current Lookup<TKey, TElement> implementation returned by LINQ's ToLookup actually appears to be thread-safe. However, none of the specs that I found make any guarantees about the thread-safety of instance methods on Lookup<TKey, TElement> (MSDN specifically states that they are not guaranteed to be thread-safe), which means that lookup thread-safety is an implementation detail, not a bulletproof guarantee. Therefore all I said above about using ConcurrentDictionary<TKey, TValue> still applies.