
I am trying to benchmark a plain `for` loop against `Parallel.For` for copying a list of points into two separate lists, one per coordinate.

Here each point is a struct with two `int` members, `X` and `Y`.

Below is the benchmark code:

var points = addPoints();
int nbPoints = points.Count;


Measure("Normal Forloop", () =>
{
    List<int> x = new List<int>(nbPoints);
    List<int> y = new List<int>(nbPoints);
    for (int i = 0; i < nbPoints; i++)
    {
        x.Add(points[i].X);
        y.Add(points[i].Y);
    }
});

Measure("Parallel Forloop", () =>
{
    ConcurrentBag<int> x = new ConcurrentBag<int>();
    ConcurrentBag<int> y = new ConcurrentBag<int>();
    Parallel.For(0, nbPoints, i =>
    {
        x.Add(points[i].X);
        y.Add(points[i].Y);
    });
});

Now, for a list size of NumberOfPoints = 1000000:

Performance of the normal for loop: 24 ms

Performance of the Parallel.For loop: 367 ms

Why did Parallel.For lose so badly? Is it because of the ConcurrentBags?

Ami Hollander
Keshav Raghav
  • @MichaelRandall What is my mistake? The sample size is the same for both. – Keshav Raghav Feb 05 '20 at 06:38
  • To get the best parallel performance, threads should not share resources, even if those resources are thread-safe (like `ConcurrentBag`). You'll get better perf if each thread adds to its own private `List<>` and then you merge the lists at the very end - but that would defeat the point of this benchmark. – Dai Feb 05 '20 at 06:38
  • You have measured List vs ConcurrentBag much more than for vs ForEach – H H Feb 05 '20 at 06:39
  • @dai That is true, but how do you explain such a vast difference in performance? – Keshav Raghav Feb 05 '20 at 06:39
  • @KeshavRaghav Because `List` and `ConcurrentBag` work on entirely different principles. – Dai Feb 05 '20 at 06:40
  • @HenkHolterman What other choice do I have when using `Parallel.For`? I cannot use a `List` inside a parallel loop. – Keshav Raghav Feb 05 '20 at 06:40
  • @KeshavRaghav Threads should not mutate (i.e. modify, write-to, etc) shared resources because when threads share resources it means they're reading and writing to the same memory locations, which means the CPU's cache system has to work inefficiently (remember each CPU core is its own full CPU) to ensure each core's state is synchronised. When multiple threads don't read/write to the same memory then the CPU's caches don't need to worry about keeping everything in-sync. – Dai Feb 05 '20 at 06:42
  • @KeshavRaghav 'No choice' doesn't make it a valid comparison. – H H Feb 05 '20 at 06:42
  • @HenkHolterman So are you suggesting that I cannot parallelise the above operation? – Keshav Raghav Feb 05 '20 at 06:44
  • You are only adding anyway, so why don't you use a pre-sized array? Still, this is not the best way to benchmark; you should be using a benchmarking library like Benchmark.net. There are just so many factors that can trip you up when trying to determine which horse is faster: debug vs release mode, whether the debugger is attached, sample size, comparing apples to apples. – TheGeneral Feb 05 '20 at 06:53
  • @MichaelRandall How can I use an array inside `Parallel.For`? Is it thread safe? Or do you suggest I not use the TPL for this operation? My list will have about 50k points, so I want to parallelise it. – Keshav Raghav Feb 05 '20 at 06:56
  • If it is a pre-sized array, and you are only walking through the indexes and not accessing any particular element from multiple threads, then it is thread safe. – TheGeneral Feb 05 '20 at 06:58
  • @MichaelRandall Even then, parallel lost. Normal Forloop average time: 26.000 ms. Parallel Forloop average time: 72.000 ms. – Keshav Raghav Feb 05 '20 at 07:02
  • Now try with 1000000 items – TheGeneral Feb 05 '20 at 07:28
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/207245/discussion-between-keshav-raghav-and-michael-randall). – Keshav Raghav Feb 05 '20 at 07:47
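
For reference, the two suggestions made in the comments (writing into pre-sized arrays by index, and giving each thread its own private `List<>` that is merged at the end) could be sketched roughly as below. This is only a sketch under assumptions: the `Point` struct, the `SplitSketch` class and both method names are made up for illustration, since the original `addPoints()` and `Measure(...)` helpers are not shown in the question. It makes no claim about which version is faster; that still has to be measured, ideally in release mode with a benchmarking library.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical stand-in for the asker's struct (two int members, X and Y).
public struct Point
{
    public int X;
    public int Y;
    public Point(int x, int y) { X = x; Y = y; }
}

public static class SplitSketch
{
    // Pre-sized arrays: iteration i writes only to element i, so no two
    // threads write to the same element and no locking is needed.
    // The source list is only read, never modified, during the loop.
    public static (int[] X, int[] Y) SplitIntoArrays(List<Point> points)
    {
        int n = points.Count;
        var x = new int[n];
        var y = new int[n];
        Parallel.For(0, n, i =>
        {
            x[i] = points[i].X;
            y[i] = points[i].Y;
        });
        return (x, y);
    }

    // Thread-local lists: each worker fills its own private pair of lists,
    // and the pairs are merged under a lock once per worker at the end.
    // Note: the merged order is NOT guaranteed to match the input order.
    public static (List<int> X, List<int> Y) SplitWithLocalLists(List<Point> points)
    {
        var x = new List<int>(points.Count);
        var y = new List<int>(points.Count);
        var gate = new object();

        Parallel.For<(List<int> Xs, List<int> Ys)>(0, points.Count,
            () => (new List<int>(), new List<int>()),   // per-worker state
            (i, loopState, local) =>
            {
                local.Xs.Add(points[i].X);
                local.Ys.Add(points[i].Y);
                return local;
            },
            local =>
            {
                lock (gate)                              // one merge per worker
                {
                    x.AddRange(local.Xs);
                    y.AddRange(local.Ys);
                }
            });

        return (x, y);
    }
}

public static class Program
{
    public static void Main()
    {
        // Small made-up input; the question used 1000000 points.
        var points = new List<Point>();
        for (int i = 0; i < 1000; i++) points.Add(new Point(i, -i));

        var arrays = SplitSketch.SplitIntoArrays(points);
        var lists = SplitSketch.SplitWithLocalLists(points);
        Console.WriteLine(arrays.X.Length + " " + lists.X.Count);
    }
}
```

The array version keeps the input order and has no shared mutable state beyond disjoint array slots. The thread-local version keeps `List<>` semantics but loses ordering, so it only fits cases where the order of the copied values does not matter.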
