
I have a class that returns a List<T> of transactions. I need to query that list with LINQ (or an equivalent) inside a Parallel.ForEach loop, because each iteration needs a subset of those transactions: the transactions of one specific customer.

So the List<T> contains all transactions, and I want to parallelize the processing per customer. Each customer's subset of transactions is then passed as a parameter to a method of another class, which does the processing.

Each iteration only needs to READ from the main List<T> of transactions and always takes a different subset, since the transactions are grouped by customer ID.

The ServiceItems and TransactionsProcessor classes target .NET 3.5; otherwise I could use ConcurrentBag<T> directly in them. The ProcessTransactions method, however, lives in a .NET 4.5 project.

I want to know whether my approach can be considered thread-safe, perhaps simply because the transactions are accessed in read-only mode. If it is not, how can I make it work correctly?

Sample code:

public void ProcessTransactions()
{
    ServiceItems items = new ServiceItems();

    // Call the method that retrieves the transactions for all customers
    items.Initialize(); 

    ConcurrentBag<ItemsGroup<TransactionItem>> itemsBag
        = new ConcurrentBag<ItemsGroup<TransactionItem>>(items.ItemsGroups);

    ConcurrentDictionary<int, string> customers = GetCustomers(
        items.CustomersIds.Select(id => id.ToString()).ToList());

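    // Assumption: `options` was not defined in the original snippet;
    // a default ParallelOptions instance is used here as a placeholder.
    ParallelOptions options = new ParallelOptions();

    Parallel.ForEach(customers, options, (customer, state) =>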
    {
        int customerId = customer.Key;

        // Is this Thread Safe?
        List<ItemsGroup<TransactionItem>> customerItems = itemsBag
            .Where(g => g.CustomerId == customerId).ToList();

        // This is the transactions processor class
        TransactionsProcessor processor = new TransactionsProcessor();

        // This method accepts a List<ItemsGroup<TransactionItem>> object
        // for customer transactions
        processor.StartForCustomer(customerId, customerItems);
    });
}
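
For comparison, here is a minimal sketch of the same read-only pattern without any concurrent collection. It assumes nothing modifies the list while the loop runs; the documentation for List<T> and Dictionary<TKey,TValue> states that multiple concurrent readers are supported under that condition. The names `Group`, `Amount`, `ReadOnlySketch` and `TotalsPerCustomer` are hypothetical stand-ins and are not part of the code above. The sketch also groups the items once before the loop, instead of scanning the whole collection in every iteration.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-in for ItemsGroup<TransactionItem>.
public sealed class Group
{
    public int CustomerId { get; set; }
    public decimal Amount { get; set; }
}

public static class ReadOnlySketch
{
    // Sums the amounts per customer. The shared data is only read inside the loop.
    public static IDictionary<int, decimal> TotalsPerCustomer(
        List<Group> all, IEnumerable<int> customerIds)
    {
        // Group once on the calling thread, before any parallel work starts.
        Dictionary<int, List<Group>> byCustomer = all
            .GroupBy(g => g.CustomerId)
            .ToDictionary(g => g.Key, g => g.ToList());

        // Results are written from several threads, so this one IS a concurrent collection.
        var totals = new ConcurrentDictionary<int, decimal>();

        Parallel.ForEach(customerIds, customerId =>
        {
            // Concurrent reads of the dictionary are fine while nobody modifies it.
            List<Group> customerItems;
            if (!byCustomer.TryGetValue(customerId, out customerItems))
            {
                customerItems = new List<Group>();
            }

            totals[customerId] = customerItems.Sum(g => g.Amount);
        });

        return totals;
    }
}
```

Only the result dictionary is concurrent here, because it is the only object written from more than one thread.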

UPDATE

Thanks to @Evk.

This is a similar question that clarifies the usage of List<T> in a multithreaded environment. See also this one about LINQ on List<T>.

Theodor Zoulias
Cheshire Cat
  • You don't need any concurrent collections here, since you are only reading from them. – Evk Nov 27 '17 at 17:49
  • So you're saying that if I only need to read, I can always use a normal List? Maybe I'm wrong, but I seem to remember having read that this is not so... Anyway, what if I also need to write? – Cheshire Cat Nov 27 '17 at 18:08
  • Yes, reading a collection from multiple threads is fine as long as you never write. If you write (that is, add or remove items), then it's not safe and you need locks or concurrent collections. This applies generally: all immutable (read-only) objects are thread-safe. – Evk Nov 27 '17 at 18:31
  • Just to enforce the immutability, you could use the classes from the [`System.Collections.Immutable`](https://msdn.microsoft.com/en-us/library/system.collections.immutable.aspx) namespace, or at least return your lists through the [`IReadOnlyList`](https://msdn.microsoft.com/en-us/library/hh192385.aspx) interface. – Oliver Nov 28 '17 at 08:17
  • Another case... What if the customer's subset of transactions (`customerItems`) used by the processor method (`StartForCustomer(customerId, customerItems)`) is changed inside the `StartForCustomer` method? For example, in a loop: `foreach (var transaction in transactions) { transaction.Processed = true; transaction.UpdateDate = DateTime.Now; }`. Is there any problem with using `List`? – Cheshire Cat Nov 29 '17 at 09:27
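
The case raised in the last comment can be sketched as follows. This is an illustration under stated assumptions, not a definitive answer: it assumes every transaction belongs to exactly one customer, so each object is written by only one thread, and that no items are added to or removed from any list while the loop runs. The `Transaction` type, `MutationSketch` and `MarkAllProcessed` are hypothetical names; only the `Processed` and `UpdateDate` properties come from the comment.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical transaction type with the two properties mentioned in the comment.
public sealed class Transaction
{
    public int CustomerId { get; set; }
    public bool Processed { get; set; }
    public DateTime UpdateDate { get; set; }
}

public static class MutationSketch
{
    public static void MarkAllProcessed(List<Transaction> all)
    {
        // Partition once on the calling thread; every transaction lands in exactly one partition.
        List<List<Transaction>> partitions = all
            .GroupBy(t => t.CustomerId)
            .Select(g => g.ToList())
            .ToList();

        Parallel.ForEach(partitions, transactions =>
        {
            // Each partition is visited by a single iteration, so no object is
            // written by two threads, and no list is structurally changed.
            foreach (Transaction transaction in transactions)
            {
                transaction.Processed = true;
                transaction.UpdateDate = DateTime.Now;
            }
        });
    }
}
```

If the same transaction object could appear in the subsets of two customers, this assumption would no longer hold and the writes would need synchronization.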
