I was wondering what the most efficient way is to remove from a list `a` every element that also appears in a list `b`.

Should I use

a.RemoveAll(x => b.AsParallel().Any(y => y == x));

or

a.AsParallel().Except(b.AsParallel());

or something else?

Can anyone explain what the underlying difference is? It seems to me, from measuring, that the second line is slower. What is the reason for this?

Igor Ševo
  • Why are you asking us to [race your horses](http://ericlippert.com/2012/12/17/performance-rant/)? (Test them out, find out which is faster, and if it still is not fast enough for your use case, *then* come back here with your example code and ask "this is too slow, can I make it faster?") – Scott Chamberlain Feb 27 '14 at 23:00
    Is it still slower when you run in release mode with the debugger not attached? – Scott Chamberlain Feb 27 '14 at 23:07
  • The two queries are not equivalent. The first one operates on two lists, the second one on three (because you need a third list to store the results). Even if the result is assigned to `a`, there will be a temporary list in memory. – Gert Arnold Feb 27 '14 at 23:11

1 Answer

Using the second option, with two ParallelQuery<T> operations, will perform the entire operation in parallel:

var results = a.AsParallel().Except(b.AsParallel());

The first option runs RemoveAll sequentially over a and must build a new ParallelQuery<T> over b for every element it checks, so the parallelization overhead is paid on each iteration. This will likely be far slower.
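If in-place removal is what you want, one alternative worth measuring is to build a lookup from b once and let RemoveAll test against it. This is only a sketch and not part of either option in the question: `RemovalSketch` and `RemoveAllIn` are hypothetical names, and it assumes the element type's default equality is the comparison you need. It also shows that the two approaches do not give the same result: Except is a set operation that returns distinct elements and leaves a untouched, while RemoveAll mutates a and keeps duplicates.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RemovalSketch
{
    // Hypothetical helper: removes from 'a' every element that also occurs in 'b'
    // and returns the number of elements removed.
    public static int RemoveAllIn<T>(List<T> a, IEnumerable<T> b)
    {
        // Built once, so no query is constructed per element of 'a'.
        var lookup = new HashSet<T>(b);

        // One hash lookup per element of 'a'.
        return a.RemoveAll(lookup.Contains);
    }

    public static void Main()
    {
        var a = new List<int> { 1, 2, 2, 3, 4, 4 };
        var b = new List<int> { 3, 4 };

        // Except produces a new sequence of distinct elements; 'a' is unchanged.
        var viaExcept = a.Except(b).ToList();

        // RemoveAll edits 'a' in place and keeps the duplicate 2.
        RemoveAllIn(a, b);

        Console.WriteLine(string.Join(",", viaExcept)); // prints 1,2
        Console.WriteLine(string.Join(",", a));         // prints 1,2,2
    }
}
```

Whether this beats Except for your data is something only a measurement can tell you.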

Depending on the number of elements, however, it may actually be faster to run this without AsParallel:

var results = a.Except(b);

In many cases, the overhead of parallelizing for smaller collections outweighs the gains. The only way to know, in this case, would be to profile and measure the options involved.
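A rough way to compare the two Except variants is a Stopwatch harness like the one below. Treat it as a sketch under stated assumptions: `ExceptTiming` and `TimeMs` are hypothetical names, the collection size and the `int` element type are arbitrary choices, and a single timed run is only a coarse indication. Note that LINQ and PLINQ queries are deferred, so each query is forced with Count(); timing the bare Except call would measure only the construction of the query.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public static class ExceptTiming
{
    // Hypothetical helper: runs 'action' once and returns the elapsed milliseconds.
    public static double TimeMs(Action action)
    {
        var sw = Stopwatch.StartNew();
        action();
        sw.Stop();
        return sw.Elapsed.TotalMilliseconds;
    }

    public static void Main()
    {
        const int n = 1000000; // assumed size; use sizes typical of your data
        List<int> a = Enumerable.Range(0, n).ToList();
        List<int> b = Enumerable.Range(n / 2, n).ToList();

        // Warm-up runs so JIT compilation is not charged to the timed runs.
        a.Except(b).Count();
        a.AsParallel().Except(b.AsParallel()).Count();

        // Count() enumerates the query, which is what actually executes it.
        double sequentialMs = TimeMs(() => a.Except(b).Count());
        double parallelMs = TimeMs(() => a.AsParallel().Except(b.AsParallel()).Count());

        Console.WriteLine($"Except:            {sequentialMs:F1} ms");
        Console.WriteLine($"AsParallel.Except: {parallelMs:F1} ms");
    }
}
```

The RemoveAll variant from the question is left out on purpose: at this size its per-element query over b would take far too long to finish, so try it with much smaller inputs.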

> It seems to me, from measuring, that the second line is slower. What is the reason for this?

Several factors could cause this. First, make sure you're measuring a release build run outside of the VS host, with no debugger attached (this is a common issue). Otherwise, it may come down to the size of the collections and the data types involved.

Reed Copsey