9

The more I use Parallel.ForEach and PLINQ in my code, the more frowns and code-review pushback I get. So I wonder: is there any reason for me NOT to use PLINQ, taken to the extreme, on every LINQ statement? Could the runtime fail to be smart enough and start spawning so many threads (or consuming so many threads from the thread pool) that app performance would actually degrade instead of improve? The same question applies to the Parallel library.

I do understand the implications for thread safety and the overhead of multithreading. I also realize that not everything is a good fit for parallelizing. What I am wondering is whether I should stop defending my approach and give up on these two fine things because my peers think I would be better off controlling threads myself instead of relying on the .NET facilities.

UPDATE: please assume the hardware is sufficiently good to satisfy prerequisites for use of multithreading.

Schultz9999

3 Answers

4

It all comes down to two things:

  1. Is the extra work required to partition the collection and synchronize the threads greater than the performance gain compared to a regular foreach?

  2. Are all the threads going to use a shared resource that will become a bottleneck?

An example of the second case is doing a Parallel.ForEach over the results of a LINQ to SQL query. In that case, if the results arrive from the database very slowly, each thread may spend more time waiting for data than actually processing it.

See: http://msdn.microsoft.com/en-us/library/dd997392.aspx
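To make the second point concrete, here is a minimal sketch of how a shared resource can creep into a parallel loop and one way to sidestep it. `Work` and the class name are hypothetical stand-ins, not anything from the question. The first method takes one lock on every iteration; the second uses the `Parallel.ForEach` overload with thread-local state, so each worker accumulates privately and merges once at the end. Whether either beats a plain foreach still depends on how expensive `Work` really is.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class SharedStateSketch
{
    // Hypothetical stand-in for the real per-item computation.
    public static long Work(int x) => (long)x * x;

    // Every iteration takes the same lock, so the workers serialize on it.
    public static long SumWithSharedLock(int[] items)
    {
        long total = 0;
        object gate = new object();
        Parallel.ForEach(items, item =>
        {
            long result = Work(item);
            lock (gate) { total += result; }
        });
        return total;
    }

    // Each worker keeps a private subtotal and merges it once when it finishes.
    public static long SumWithThreadLocal(int[] items)
    {
        long total = 0;
        Parallel.ForEach(
            items,
            () => 0L,                                    // per-worker initial subtotal
            (item, state, local) => local + Work(item),  // no shared state touched here
            local => Interlocked.Add(ref total, local)); // one atomic merge per worker
        return total;
    }
}
```

Both methods return the same sum; the difference is only in how often the workers have to coordinate.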

Diego
  • Say I have 1000 consumers to which I need to apply some computational action (no db calls, no waiting for other resources), the result of which I'd store in a synchronized collection (that could be a penalty, though). So I'd write Parallel.ForEach(consumers, c => ... ) in the hope that it will be faster than a simple foreach. I understand that without actual performance profiling it's hard to say whether my hopes are justified. But from naive reasoning this approach seems right. – Schultz9999 Mar 25 '12 at 04:49
  • It seems right. But again, you would have to be doing quite a bit of computational work for each consumer. A simple arithmetic calculation, for example, would not be enough to justify parallelism. – Diego Mar 25 '12 at 16:51
3

To cap the number of tasks a PLINQ query runs concurrently, you can use .WithDegreeOfParallelism(N)

For example:

var query = from item in source.AsParallel().WithDegreeOfParallelism(2)
            where Compute(item) > 42
            select item;

See http://msdn.microsoft.com/en-us/library/dd997425.aspx
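Since the question also asks about the Parallel library: the counterpart there is `ParallelOptions.MaxDegreeOfParallelism`. Below is a rough sketch, with a hypothetical class name and `Thread.SpinWait` standing in for real work, that caps a `Parallel.ForEach` and records the highest number of loop bodies it ever saw running at once.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class BoundedParallelSketch
{
    // Runs a capped Parallel.ForEach and returns the peak number of
    // loop bodies observed executing at the same time.
    public static int PeakConcurrency(int[] items, int maxParallelism)
    {
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallelism };
        int current = 0;
        int peak = 0;

        Parallel.ForEach(items, options, item =>
        {
            int now = Interlocked.Increment(ref current);

            // Lock-free "peak = max(peak, now)".
            int observed;
            do
            {
                observed = Volatile.Read(ref peak);
                if (now <= observed) break;
            } while (Interlocked.CompareExchange(ref peak, now, observed) != observed);

            Thread.SpinWait(1000); // placeholder for real per-item work

            Interlocked.Decrement(ref current);
        });

        return peak;
    }
}
```

Calling `BoundedParallelSketch.PeakConcurrency(Enumerable.Range(0, 200).ToArray(), 2)` should never report more than 2, although it may report less if the scheduler decides one worker is enough.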

undefined
2

When you dig into performance questions this deep, I think the best thing to do is... measure, measure and measure. Even if somebody answered that PLINQ is great and will boost the performance of your application, would you trust that without verifying it with profiling? Although general answers may exist, you cannot spare yourself the effort of measuring the performance in your exact case. The overall performance depends on so many things that PLINQ may well help in one case but not in another.
My personal experience with PLINQ is that after switching every LINQ query to PLINQ, the response times are way better when the load is small, and there is no difference when the load is near its maximum. But I can imagine a case where PLINQ hurts the overall performance under a huge load. You have to check it for your own particular case.
Well... and if you want to convince other people that you are on the right path, what could be better than measurement results?
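As a starting point for that kind of measurement, here is a rough sketch that times the same query with LINQ and with PLINQ using `Stopwatch`. `Compute` is a made-up CPU-bound function and the class name is hypothetical; swap in your real query. A single `Stopwatch` run is noisy, so treat the numbers as a first hint and repeat the measurement (or use a proper benchmarking harness) before drawing conclusions.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

public static class MeasureSketch
{
    // Hypothetical CPU-bound work; replace with the real per-item computation.
    public static int Compute(int x)
    {
        int acc = x;
        for (int i = 0; i < 1000; i++)
        {
            acc = unchecked(acc * 31 + i) & 0xFF;
        }
        return acc;
    }

    public static int CountSequential(int[] source) =>
        source.Count(x => Compute(x) > 42);

    public static int CountParallel(int[] source) =>
        source.AsParallel().Count(x => Compute(x) > 42);

    // Times one call of the given query after a warm-up run,
    // so JIT compilation is not charged to the measurement.
    public static TimeSpan Time(Func<int> query)
    {
        query();
        var watch = Stopwatch.StartNew();
        query();
        watch.Stop();
        return watch.Elapsed;
    }

    public static void Report(int[] source)
    {
        Console.WriteLine("LINQ:  " + Time(() => CountSequential(source)));
        Console.WriteLine("PLINQ: " + Time(() => CountParallel(source)));
    }
}
```

Running `MeasureSketch.Report(Enumerable.Range(0, 100000).ToArray())` with a few different input sizes, and under the load your application actually sees, gives you numbers you can bring to the code review.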

Hari