I need to process a single file in parallel by passing skip/take ranges such as 1-1000, 1001-2000, 2001-3000, and so on.

Code for the parallel processing:

var line = File.ReadAllLines("D:\\OUTPUT.CSV").Length;
Parallel.For(1, line, new ParallelOptions { MaxDegreeOfParallelism = 10 }, x =>
{
    DoSomething(skip, take);
});

Function

public static void DoSomething(int skip, int take)
{
     //code here
}

How can I pass the skip and take counts to each parallel call, as per my requirement?

Mahendran

1 Answer

You can do this rather easily with PLINQ. If you want batches of 1000, you can do:

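const int BatchSize = 1000;

// lines is the total number of lines in the file (line in the question);
// source is the sequence of lines being processed.
var pageAmount = (int) Math.Ceiling((double) lines / BatchSize);

// ForAll runs the action for every page in parallel. Select would not
// compile here because DoSomething returns void.
Enumerable.Range(0, pageAmount)
          .AsParallel()
          .ForAll(page => DoSomething(page));

public void DoSomething(int page)
{
    // Page 0 covers lines 1-1000, page 1 covers lines 1001-2000, and so on.
    var currentLines = source.Skip(page * BatchSize).Take(BatchSize);
    // do something with the selected lines
}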
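
If you would rather keep `Parallel.For` and pass explicit skip and take values, as in the question, the same page arithmetic applies. The following is only a minimal sketch under some assumptions: `BatchDemo`, `ProcessInBatches`, and the three-argument `DoSomething` are hypothetical names, the worker just counts the lines in its batch, and an in-memory array stands in for the lines of `D:\OUTPUT.CSV`.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class BatchDemo
{
    // Hypothetical worker: receives how many lines to skip and how many to take.
    public static int DoSomething(string[] source, int skip, int take)
    {
        // Placeholder work: count the lines that fall in this batch.
        return source.Skip(skip).Take(take).Count();
    }

    // Splits source into pages of batchSize lines and processes the pages in parallel.
    // Returns the number of lines handled by each page, indexed by page number.
    public static int[] ProcessInBatches(string[] source, int batchSize)
    {
        // Integer ceiling: a partially filled last page still counts as a page.
        int pageAmount = (source.Length + batchSize - 1) / batchSize;
        var counts = new int[pageAmount];

        // The upper bound of Parallel.For is exclusive, so page runs from 0 to pageAmount - 1.
        Parallel.For(0, pageAmount, new ParallelOptions { MaxDegreeOfParallelism = 10 }, page =>
        {
            int skip = page * batchSize;
            int take = Math.Min(batchSize, source.Length - skip);
            counts[page] = DoSomething(source, skip, take); // each page writes only its own slot
        });

        return counts;
    }

    public static void Main()
    {
        // Stand-in for File.ReadAllLines("D:\\OUTPUT.CSV") (assumption for the demo).
        var source = Enumerable.Range(1, 4500).Select(i => "line " + i).ToArray();

        var counts = ProcessInBatches(source, 1000);
        Console.WriteLine(string.Join(",", counts)); // 1000,1000,1000,1000,500
    }
}
```

Reading the file once into an array and indexing pages from 0 means the first batch skips nothing, and the last batch simply takes whatever lines remain.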
Yuval Itzchakov
  • Thanks, but `Enumerable.Range(1, pageAmount)` should be `Enumerable.Range(0, pageAmount-1)`, right? – Mahendran Jul 07 '15 at 09:20
  • In the method `DoSomething`, we get page = 1 the first time, right? So `source.Skip(page * BatchSize).Take(BatchSize)` becomes `source.Skip(1000).Take(1000)`, and the very first call skips 1000 lines. – Mahendran Jul 07 '15 at 09:24
  • Thanks, I think the end of the range needs to change, i.e. `pageAmount` to `pageAmount-1`. – Mahendran Jul 07 '15 at 09:31
  • We are skipping 4000 and taking 1000, which means we take the last batch, i.e. 4001-5000, so it should be `pageAmount-1`. – Mahendran Jul 07 '15 at 12:39
  • I don't understand your point. If the number of lines is 5000, then pageAmount = 5. Iterating from 0 to 5-1 gives 5 iterations in total, i.e. 0, 1, 2, 3, 4, so the lines processed in each iteration are 1-1000, 1001-2000, 2001-3000, 3001-4000, and 4001-5000, and all lines are processed. Correct me if I am wrong. – Mahendran Jul 07 '15 at 12:46
  • @Mahendran If you do `Enumerable.Range(0, 4)` you'll get `0,1,2,3`. Then, you'll be missing a batch. The second parameter is `Count`, meaning, how many sequential integers to generate. – Yuval Itzchakov Jul 07 '15 at 12:49
  • Thanks for clarification – Mahendran Jul 07 '15 at 13:38
  • Isn't pageAmount rounded down due to integer division? You might miss out on the lines that do not fill a full page. – Emond Jul 07 '15 at 15:22
  • @Erno Addressed that issue using `Math.Ceiling` – Yuval Itzchakov Jul 07 '15 at 15:43
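
The two points settled in the comments above (the second argument of `Enumerable.Range` is a count, and the page count has to be rounded up) can be checked with a small sketch. `PageMath` and `PageCount` are hypothetical names used only for illustration, and 5500 is an arbitrary line count chosen so that the last page is only partly filled.

```csharp
using System;
using System.Linq;

public static class PageMath
{
    // Number of pages needed for totalLines, rounding up so that a
    // partially filled last page is still counted.
    public static int PageCount(int totalLines, int batchSize)
    {
        return (int)Math.Ceiling((double)totalLines / batchSize);
    }

    public static void Main()
    {
        // The second argument of Enumerable.Range is a count, not an end index.
        Console.WriteLine(string.Join(",", Enumerable.Range(0, 4))); // 0,1,2,3
        Console.WriteLine(string.Join(",", Enumerable.Range(0, 5))); // 0,1,2,3,4

        // Plain integer division truncates and would drop the last 500 lines.
        Console.WriteLine(5500 / 1000);           // 5
        Console.WriteLine(PageCount(5500, 1000)); // 6
        Console.WriteLine(PageCount(5000, 1000)); // 5
    }
}
```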