I have the following PLINQ query:

// Let's get a few customers
List<Customer> customers = CustomerRepository.GetSomeCustomers();

// Let's get all of the items for all of these customers
List<CustomerItem> items = customers
    .AsParallel()
    .SelectMany(x => ItemRepository.GetItemsByCustomer(x))
    .ToList();

I would expect GetItemsByCustomer() to be executed in parallel for each customer, but it runs sequentially.

I have tried to force parallelism but still without luck:

List<CustomerItem> items = customers
    .AsParallel()
    .WithExecutionMode(ParallelExecutionMode.ForceParallelism)
    .SelectMany(x => ItemRepository.GetItemsByCustomer(x))
    .ToList();

The method signature:

private IEnumerable<Item> GetItemsByCustomer(Customer customer)
{
    // Get all items for a customer...
}

According to this article, PLINQ can certainly take the sequential route if it deems fit, but forcing parallelism should still override this.

Note: The example above is purely illustrative. Assume customers is a small list and GetItemsByCustomer is an expensive method.

Dave New
  • Can you give a complete, self-contained example? – nvoigt Oct 06 '14 at 06:26
  • Is your real code the same, or are you using the `SelectMany` overload that takes an index as a parameter? – Sriram Sakthivel Oct 06 '14 at 07:00
  • @SriramSakthivel: My code is structurally identical to the above. – Dave New Oct 06 '14 at 07:05
  • How many cores does your environment have? `Scheduling.DefaultDegreeOfParallelism` defaults to `Math.Min(Environment.ProcessorCount, 512)`. – Caramiriel Oct 06 '14 at 07:07
  • @nvoigt: Sorry if I haven't been clear. What exactly do you need? – Dave New Oct 06 '14 at 07:07
  • @davenewza He's asking for a small but complete program to reproduce the problem. – Sriram Sakthivel Oct 06 '14 at 07:10
  • I would like a small, self-contained example, so I can run it on my system. Your code has so many uncertain parts that it's impossible to give you more than a guess. And that is not what this platform is about. For example, maybe your repository is locking and disabling parallel access? We cannot know. You do. Reduce your example to something we can run, maybe you will find the error along the way. – nvoigt Oct 06 '14 at 08:19
  • In addition, you should specify how you measured execution to conclude that it's not running in parallel. – Faris Zacina Oct 06 '14 at 10:22
  • Don't try to use parallelism to speed up slow data access code! It looks like your code tries to execute queries "in parallel" over an ORM's context, which typically uses a *single* connection. This will force all queries to execute sequentially. Instead of trying to execute multiple queries, use your ORM's mechanisms to batch all requests into a single one, or create a `GetItemsByCustomers` method that accepts a list of all the IDs you want and uses an `IN (...)` condition in the WHERE clause. – Panagiotis Kanavos Oct 06 '14 at 12:55
  • What ORM are you using? Why are you trying to execute multiple queries in parallel? – Panagiotis Kanavos Oct 06 '14 at 12:56
  • Is `ItemRepository` thread-safe? – Gert Arnold Oct 07 '14 at 07:44

1 Answer

There is nothing wrong with AsParallel(). It runs the query in parallel when it can, and your LINQ expression has no sequential dependency that would force it to run sequentially.

A few possible reasons why your code doesn't run in parallel:

  1. Your box/VM has a single CPU, or a .NET setting limits parallelism to one CPU. You can simulate that with this code:

        var customers = new List<Customer>
        {
            new Customer { Name = "Mick", Surname = "Jagger" },
            new Customer { Name = "George", Surname = "Clooney" },
            new Customer { Name = "Kirk", Surname = "Douglas" }
        };

        var items = customers
            .AsParallel()
            .SelectMany(x =>
            {
                Console.WriteLine("Requesting: " + x.Name + " - " + DateTime.Now);
                Thread.Sleep(3000); // simulate an expensive call
                return new List<CustomerItem>();
            })
            .WithDegreeOfParallelism(1) // limit PLINQ to a single concurrent task
            .ToList();
    

    Even if you force parallelism with WithExecutionMode(ParallelExecutionMode.ForceParallelism), the setting has no effect on a single-core/CPU box or when the degree of parallelism is 1, since true parallelism is not possible.

  2. Your repository is locking on a shared resource, so the calls block one another even though they start on separate threads. You can simulate thread locking with the following code:

        var customers = new List<Customer>
        {
            new Customer { Name = "Mick", Surname = "Jagger" },
            new Customer { Name = "George", Surname = "Clooney" },
            new Customer { Name = "Kirk", Surname = "Douglas" }
        };
    
        var locker = new object();
    
        // Let's get all of the items for all of these customers
        var items = customers
            .AsParallel()
            .SelectMany(x =>
            {
                lock (locker)
                {
                    Console.WriteLine("Requesting: " + x.Name + " - " + DateTime.Now);
                    Thread.Sleep(3000);
                    return new List<CustomerItem>();
                }
    
            })
            .ToList();
    
  3. A database setting forces the queries/reads to run sequentially under certain circumstances. That could give you the impression that your C# code is not running in parallel when it actually is.
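
To tell these cases apart, it can help to measure instead of guessing. The following is only a minimal sketch, not taken from your code: `ParallelismProbe` and `Measure` are hypothetical names, and the 500 ms sleep stands in for an expensive repository call. It records which managed threads ran the `SelectMany` delegate and how long the whole query took, once without a lock and once with a shared lock.

```csharp
using System;
using System.Collections.Concurrent;
using System.Diagnostics;
using System.Linq;
using System.Threading;

public static class ParallelismProbe
{
    // Hypothetical helper: runs `work` once per input inside a PLINQ query and
    // reports how many distinct threads executed it and the total elapsed time.
    public static (int Threads, TimeSpan Elapsed) Measure(int inputs, Action work)
    {
        var threadIds = new ConcurrentDictionary<int, bool>();
        var stopwatch = Stopwatch.StartNew();

        Enumerable.Range(0, inputs)
            .AsParallel()
            .WithExecutionMode(ParallelExecutionMode.ForceParallelism)
            .WithDegreeOfParallelism(inputs) // assumption: a small input count (1..512)
            .SelectMany(i =>
            {
                // Remember which thread handled this element.
                threadIds.TryAdd(Thread.CurrentThread.ManagedThreadId, true);
                work();
                return new[] { i };
            })
            .ToList();

        stopwatch.Stop();
        return (threadIds.Count, stopwatch.Elapsed);
    }

    public static void Main()
    {
        var locker = new object();

        // Stand-in for an expensive call that takes no lock.
        var free = Measure(3, () => Thread.Sleep(500));

        // Same work, but serialized by a shared lock (case 2 above).
        var locked = Measure(3, () => { lock (locker) { Thread.Sleep(500); } });

        Console.WriteLine($"No lock:   {free.Threads} thread(s), {free.Elapsed.TotalMilliseconds:F0} ms");
        Console.WriteLine($"With lock: {locked.Threads} thread(s), {locked.Elapsed.TotalMilliseconds:F0} ms");
    }
}
```

If the locked run takes roughly the sum of the individual sleeps while still reporting several threads, the query is being parallelized and something inside the delegate is serializing it. If only one thread ever shows up, look at the CPU count and degree-of-parallelism settings instead. The exact numbers depend on the machine and on how quickly the thread pool provides threads, so treat them as indicative only.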

svick
Faris Zacina
  • So, it came down to a custom method attribute (for caching), which was performing a lock. I completely overlooked the attribute, but your post led me to it. Thanks. – Dave New Oct 08 '14 at 13:22