I've written a parallel algorithm in C# that partitions an array into two lists: one containing the elements that satisfy a given predicate, and the other containing the elements that fail it. The algorithm is meant to be order-preserving, so each list should keep its elements in their original relative order.
I have written it as follows, but I would like to know how to make the most of the available hardware concurrency.
// Requires: using System; using System.Collections.Generic; using System.Diagnostics;
//           using System.Linq; using System.Threading.Tasks;
static void TestPLinqPartition(int cnt = 1000000)
{
    Console.WriteLine("PLINQ Partition");
    var a = RandomSequenceOfValuesLessThan100(cnt).ToArray();
    var sw = Stopwatch.StartNew();
    // PLINQ does not preserve source order by default; AsOrdered() asks for it.
    var ap = a.AsParallel().AsOrdered();
    List<int> partA = null;
    List<int> partB = null;
    // Each action scans the whole array once, so the data is traversed twice in total.
    Action actionA = () => { partA = (from x in ap where x < 25 select x).ToList(); };
    Action actionB = () => { partB = (from x in ap where !(x < 25) select x).ToList(); };
    Parallel.Invoke(actionA, actionB);
    sw.Stop();
    Console.WriteLine("Partition sizes = {0} and {1}", partA.Count, partB.Count);
    Console.WriteLine("Time elapsed = {0} msec", sw.ElapsedMilliseconds);
}
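
For comparison, here is a minimal sequential sketch of the result the parallel version is supposed to produce: a single pass that keeps the original order in both lists. The names `PartitionBaseline` and `Partition` are hypothetical and only for illustration; they are not part of my test code, and the tiny input array stands in for the random sequence.

```csharp
using System;
using System.Collections.Generic;

static class PartitionBaseline
{
    // Hypothetical helper: single-pass, order-preserving partition.
    // Elements satisfying the predicate go to the first list, the rest to the second.
    public static (List<T> Matching, List<T> Rest) Partition<T>(
        IEnumerable<T> source, Func<T, bool> predicate)
    {
        var matching = new List<T>();
        var rest = new List<T>();
        foreach (var x in source)
        {
            if (predicate(x))
                matching.Add(x);
            else
                rest.Add(x);
        }
        return (matching, rest);
    }

    static void Main()
    {
        var data = new[] { 30, 10, 99, 24, 25, 3 };
        var (partA, partB) = Partition(data, x => x < 25);
        Console.WriteLine(string.Join(",", partA)); // 10,24,3
        Console.WriteLine(string.Join(",", partB)); // 30,99,25
    }
}
```

Timing this baseline with the same `Stopwatch` pattern would show whether the PLINQ version actually gains anything for a predicate this cheap.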