I'm trying to find the fastest way to select a subset of items from a list based on a key property and assign that subset (as a list) to a property of an item in another list. Performance matters here because this code will be invoked very often each workday. I measured the elapsed time in ticks to see the relative differences clearly.
I've got two lists (example setup):
List<CategorySetting> catList;
List<Customer> custList;
The CategorySetting entity has a property called SettingsId. The Customer entity has a SettingsId property as well, which is in fact a foreign key from Customers to CategorySetting.
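For reference, here is a minimal sketch of the two entities, reduced to the members used in the code in this question. The property types are assumptions: Customer.SettingsId is taken to be int? because the dictionary further down is keyed by int?, and CategorySetting.SettingsId is assumed to be a plain int. The real entities have more properties.

```csharp
using System.Collections.Generic;

// Simplified, assumed shape of the "main" entity.
public class CategorySetting
{
    // Assumed to be a non-nullable key.
    public int SettingsId { get; set; }

    // Filled with the customers whose SettingsId matches this one.
    public List<Customer> Customers { get; set; }
}

// Simplified, assumed shape of the customer entity.
public class Customer
{
    // Foreign key to CategorySetting; nullable is inferred from the
    // ConcurrentDictionary<int?, ...> used later in the question.
    public int? SettingsId { get; set; }
}
```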
The first piece of code I wrote was the most straightforward:
// normal for each: 13275 ticks
foreach (var catItem in catList)
{
    catItem.Customers = custList.Where(c => c.SettingsId == catItem.SettingsId).ToList();
}
This would take about 13275 ticks.
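For context, here is a minimal sketch of how a tick measurement like this can be taken, assuming System.Diagnostics.Stopwatch. The TimingSketch class, the MeasureForeach helper, the entity shapes and the sample data sizes are all made up for illustration, so the numbers it prints will not match the ones quoted here.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

// Assumed, simplified entity shapes (only the members used below).
public class Customer { public int? SettingsId { get; set; } }

public class CategorySetting
{
    public int SettingsId { get; set; }
    public List<Customer> Customers { get; set; }
}

public static class TimingSketch
{
    // Runs the plain foreach assignment and returns the elapsed Stopwatch ticks.
    public static long MeasureForeach(List<CategorySetting> catList, List<Customer> custList)
    {
        var sw = Stopwatch.StartNew();
        foreach (var catItem in catList)
        {
            catItem.Customers = custList.Where(c => c.SettingsId == catItem.SettingsId).ToList();
        }
        sw.Stop();
        return sw.ElapsedTicks;
    }

    public static void Main()
    {
        // Hypothetical sample data: 100 categories, 10,000 customers.
        var catList = Enumerable.Range(0, 100)
            .Select(i => new CategorySetting { SettingsId = i })
            .ToList();
        var custList = Enumerable.Range(0, 10000)
            .Select(i => new Customer { SettingsId = i % 100 })
            .ToList();

        Console.WriteLine($"foreach: {MeasureForeach(catList, custList)} ticks");
    }
}
```

A single run like this also includes JIT compilation of the lambdas, so a warm-up run before timing, or averaging several runs, would probably give more stable numbers.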
I then thought that using parallelism might make this a lot faster, so I wrote this piece of code:
// parallel for each: 82541 ticks
Parallel.ForEach(catList, catItem =>
{
    catItem.Customers = custList.Where(c => c.SettingsId == catItem.SettingsId).ToList();
});
This took far longer: 82541 ticks. That made no sense to me, since this approach should use multiple threads and so, in theory, be much faster. Then I started wondering what happens when multiple threads try to access the customer list at the same time. Might that result in locks and queuing, and therefore extra time from the overhead? The same question applies to writing to the main list.
I tried another approach. I made a ConcurrentBag for the catList (the main list):
ConcurrentBag<CategorySetting> caBag = new ConcurrentBag<CategorySetting>(catList);
I put the custList into a ConcurrentDictionary, already grouped by SettingsId:
var dict = custList.GroupBy(c => c.SettingsId).ToDictionary(x => x.Key, y => y.ToList());
ConcurrentDictionary<int?, List<Customer>> concDict = new ConcurrentDictionary<int?, List<Customer>>(dict);
The final try was then like this:
// parallel, bag, concurrent dictionary: 40255 ticks
Parallel.ForEach(caBag, ca =>
{
    concDict.TryGetValue(ca.SettingsId, out var selCust);
    ca.Customers = selCust;
});
This would take 40255 ticks. Can anyone explain why this still takes longer? And, more importantly, is there really no other way than 'just' a foreach loop? It feels like I'm missing something here.
Any ideas are greatly appreciated!