
I've written a parallel algorithm in C# to partition an array into two lists: one containing the elements that satisfy a given predicate, the other containing the elements that fail it. The algorithm is order-preserving.

I have written it as follows, but I want to know how to make the most of the available hardware concurrency.

    static void TestPLinqPartition(int cnt = 1000000)
    {
        Console.WriteLine("PLINQ Partition");
        var a = RandomSequenceOfValuesLessThan100(cnt).ToArray();
        var sw = new Stopwatch();
        sw.Start();
        var ap = a.AsParallel();
        List<int> partA = null;
        List<int> partB = null;
        Action actionA = () => { partA = (from x in ap where x < 25 select x).ToList(); };
        Action actionB = () => { partB = (from x in ap where !(x < 25) select x).ToList(); };
        Parallel.Invoke(actionA, actionB);
        sw.Stop();

        Console.WriteLine("Partition sizes = {0} and {1}", partA.Count, partB.Count);
        Console.WriteLine("Time elapsed = {0} msec", sw.ElapsedMilliseconds);
    }
Chris Gerken
cdiggins
  • You're better off asking here: http://codereview.stackexchange.com/ – asawyer Apr 06 '12 at 21:51
  • 3
    I hope that Beta dies in a fiery car crash. – cdiggins Apr 06 '12 at 22:20
  • Turned it into more of a question, so it doesn't sound like a code review. – cdiggins Apr 06 '12 at 22:27
  • Hi, your algorithm basically consists of copying data and computing the predicate. Parallelizing the computation only makes sense if it is heavy enough to compensate for the synchronization overhead incurred when bringing back together the data that was split up for parallel processing. Is the predicate in a real program more complex and time-consuming than just `< 25`? Otherwise the most efficient way to do this would be to scan and enumerate the elements of the initial array. Do you really need two lists at the end, or would an enumerable be enough? – George Mamaladze Apr 07 '12 at 20:01

2 Answers


If your lists are very long you will not get much parallelism out of this approach (at most 2x). Instead, I'd recommend using a Parallel.For with a thread-local Tuple<List<int>, List<int>> as the parallel loop state; the Parallel.For API lets you do this easily. You can merge the individual sublists at the end.

This version is embarrassingly parallel and causes almost no coherency traffic on the CPU bus because there is no synchronization.

Edit: I want to emphasize that you cannot just use two List instances shared by all threads, because that would cause enormous synchronization overhead. You need thread-local lists. Not even a ConcurrentQueue is suitable for this scenario, because it uses Interlocked operations, which generate coherency traffic on the CPU bus, a limited resource.
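A minimal sketch of this suggestion, reusing the `< 25` predicate and a random array like the one in the question (names such as `matching`/`rest` are illustrative). Each worker thread accumulates into its own pair of lists via the `localInit`/`localFinally` overload of `Parallel.For`, and a lock is taken only once per thread at merge time. Note that concatenating thread-local lists in thread-completion order does not by itself preserve the original element order:

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

class ThreadLocalPartitionDemo
{
    static void Main()
    {
        var rng = new Random(42);
        var a = new int[1000000];
        for (int i = 0; i < a.Length; i++) a[i] = rng.Next(100);

        var matching = new List<int>();
        var rest = new List<int>();
        var gate = new object();

        Parallel.For(0, a.Length,
            // localInit: every worker thread starts with its own pair of lists,
            // so the hot loop runs with no synchronization at all.
            () => (Match: new List<int>(), Rest: new List<int>()),
            (i, state, local) =>
            {
                if (a[i] < 25) local.Match.Add(a[i]);
                else local.Rest.Add(a[i]);
                return local;
            },
            // localFinally: one brief lock per worker thread to merge its sublists.
            local =>
            {
                lock (gate)
                {
                    matching.AddRange(local.Match);
                    rest.AddRange(local.Rest);
                }
            });

        Console.WriteLine("{0} + {1} = {2}", matching.Count, rest.Count, a.Length);
    }
}
```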

usr
  • What about x.AsParallel().Where(), which is embedded in the Parallel LINQ query. Isn't it parallelized to some degree? – cdiggins Apr 06 '12 at 22:19
  • It is fully parallel but you need to gather *two* lists. One for the matching and one for the not-matching items. – usr Apr 06 '12 at 22:20
  • Good answer, but I ended up using the other solution. – cdiggins Apr 08 '12 at 16:21

I'd partition the data into small segments (e.g. using the Partitioner class) and assign each partition an index reflecting its position. For each numbered partition I'd create a Task that splits the partition into two groups, one that matches the predicate and one that doesn't, and returns the two groups, together with the index of the partition they came from, as the Task's result. I'd then wait for all the tasks to complete, .Concat() the matching groups according to their index (to avoid wasting time actually merging the data), and do the same for the non-matching groups. This way you should be able to achieve an arbitrary degree of parallelism while preserving relative item order.
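A sketch of this scheme under stated assumptions: the chunk size of 65,536 and the `< 25` predicate are illustrative choices, and the contiguous index ranges produced by Partitioner.Create stand in for the explicit partition index, since each range's start offset already encodes its position. Each Task partitions its range locally, and the per-range groups are stitched together lazily with Concat in range order:

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

class OrderedPartitionDemo
{
    static void Main()
    {
        var rng = new Random(42);
        var a = new int[1000000];
        for (int i = 0; i < a.Length; i++) a[i] = rng.Next(100);

        // Fixed-size contiguous ranges; a range's start offset doubles as its order index.
        var ranges = Partitioner.Create(0, a.Length, 65536).GetDynamicPartitions();

        var tasks = ranges.Select(r => Task.Run(() =>
        {
            var match = new List<int>();
            var rest = new List<int>();
            for (int i = r.Item1; i < r.Item2; i++)
                if (a[i] < 25) match.Add(a[i]); else rest.Add(a[i]);
            return (Start: r.Item1, Match: match, Rest: rest);
        })).ToArray();

        Task.WaitAll(tasks);

        // Concat the per-range groups by original position. Concat is lazy, so no
        // element is copied until the result is actually enumerated.
        var ordered = tasks.Select(t => t.Result).OrderBy(p => p.Start).ToArray();
        IEnumerable<int> matching = ordered
            .Select(p => (IEnumerable<int>)p.Match)
            .Aggregate(Enumerable.Concat);
        IEnumerable<int> notMatching = ordered
            .Select(p => (IEnumerable<int>)p.Rest)
            .Aggregate(Enumerable.Concat);

        Console.WriteLine("{0} / {1}", matching.Count(), notMatching.Count());
    }
}
```

Because each range is processed in index order within its Task and the ranges are concatenated by start offset, relative item order is preserved in both output sequences.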

Allon Guralnek