Imagine I have the following class for a student with lots of properties; let's simplify it to this:

public class Student{
    public Int64 id { get; set; }
    public String name { get; set; }
    public Int64 Age { get; set; }
}

Then on the main thread I have the following list:

List<Student> allStudents = new List<Student>();

Say I have 500 students in an Excel file and I want to gather them and insert them into the list. I can do something like this:

for(int i = startRow; i < endRow; i++){
    Student s = new Student();

    //Perform a lot of actions to gather all the information standing on the row//

    allStudents.Add(s);
}

Gathering the information from Excel is very slow, because numerous actions have to be performed for each row. So I want to use Parallel.For, and I imagine it would look like this:

Parallel.For(startRow, endRow, i => {
    Student s = new Student();

    //Perform a lot of actions to gather all the information standing on the row//

    //Here comes my problem. I want to add it to the collection on the main-thread.
    allStudents.Add(s);
});

What is the correct way to implement Parallel.For in the manner described above? Should I lock the list before adding, and how exactly?
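For reference, a minimal sketch of what "locking the list before adding" could look like. `ReadStudent` and the `gate` lock object are hypothetical names introduced only for this illustration; `ReadStudent` merely stands in for the slow per-row Excel work, which is not shown in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public class Student
{
    public Int64 id { get; set; }
    public String name { get; set; }
    public Int64 Age { get; set; }
}

public static class LockedAddDemo
{
    // Hypothetical stand-in for gathering one row's data from Excel.
    static Student ReadStudent(int row)
    {
        return new Student { id = row, name = "Student " + row, Age = 18 + row % 10 };
    }

    public static List<Student> Load(int startRow, int endRow)
    {
        var allStudents = new List<Student>();
        var gate = new object(); // private object used only to guard the list

        Parallel.For(startRow, endRow, i =>
        {
            // The slow work runs outside the lock, so it can proceed in parallel.
            Student s = ReadStudent(i);

            // Only the Add itself is serialized.
            lock (gate)
            {
                allStudents.Add(s);
            }
        });

        // Items arrive in completion order, not necessarily in row order.
        return allStudents;
    }

    public static void Main()
    {
        List<Student> students = Load(2, 502);
        Console.WriteLine(students.Count); // 500
    }
}
```

Keeping the slow work outside the `lock` matters: if the whole loop body were inside it, the rows would be processed one at a time again.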

@Edit 15:52 09-07-2015

The results of the answer below are as follows (524 records):

  • 2:09 minutes - Normal loop
  • 0:19 minutes - AsParallel loop
Revils

1 Answer

I'd rather use PLINQ than add to a List<T> from several threads (List<T> is not thread-safe):

List<Student> allStudents = Enumerable
  .Range(startRow, endRow - startRow)
  .AsParallel()
  .Select(i => new Student(...))
  .ToList();
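A self-contained sketch of the same query, assuming the per-row work is wrapped in a helper. `ReadStudent` is a hypothetical name standing in for the elided `new Student(...)` construction, and the row numbers are made up for the demo.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Student
{
    public Int64 id { get; set; }
    public String name { get; set; }
    public Int64 Age { get; set; }
}

public static class PlinqDemo
{
    // Hypothetical stand-in for gathering one row's data from Excel.
    static Student ReadStudent(int row)
    {
        return new Student { id = row, name = "Student " + row, Age = 18 + row % 10 };
    }

    public static List<Student> Load(int startRow, int endRow)
    {
        return Enumerable
            .Range(startRow, endRow - startRow) // (start, count), not (start, end)
            .AsParallel()                       // later operators may run on several threads
            .Select(i => ReadStudent(i))        // one Student per row
            .ToList();                          // PLINQ merges the results into one list
    }

    public static void Main()
    {
        List<Student> allStudents = Load(2, 502);
        Console.WriteLine(allStudents.Count); // 500
    }
}
```

No shared list is mutated inside the loop body here, so no explicit lock is needed.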
Dmitry Bychenko
  • Can you give a bit more detail on how this works and what the advantages are? (What does AsParallel mean?) – Revils Jul 09 '15 at 13:06
  • In a few words, `AsParallel` (see also `AsOrdered` and `AsSequential`) lets *PLINQ* run the methods that follow it (i.e. `Select`) in parallel: say, run `Select` with `i == 5` without waiting for `Select` with `i == 4` to complete. https://msdn.microsoft.com/en-us/library/dd460688(v=vs.100).aspx – Dmitry Bychenko Jul 09 '15 at 13:16
  • The parameters for `Enumerable.Range()` indicate (start, count) instead of (start, end) as in `Parallel.For()` (weird C# design decision), so you need `Enumerable.Range(startRow, endRow - startRow)`. Other than that, excellent answer. – Dennis_E Jul 09 '15 at 13:22
  • @Dennis_E: Thank you! You're right: it must be `endRow - startRow` – Dmitry Bychenko Jul 09 '15 at 13:24
  • oops i didnt see the last line in question. – M.kazem Akhgary Jul 09 '15 at 13:28
  • Just for the record, the normal loop and your suggested parallel loop took 2:09 and 0:19 minutes respectively (for 524 records)! So I have to say that it works like a charm, thanks! @Dennis_E Thank you, I was indeed getting too many rows, because I was passing the end row and not the row count. – Revils Jul 09 '15 at 13:52
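Two points from the comments can be illustrated with a small sketch: `Enumerable.Range` takes (start, count), and `AsOrdered` keeps results in source order. The class name, method names and row numbers below are made up for the demo.

```csharp
using System;
using System.Linq;

public static class RangeDemo
{
    // Doubles each row index in parallel while preserving source order.
    public static int[] OrderedDoubles(int startRow, int endRow)
    {
        return Enumerable
            .Range(startRow, endRow - startRow) // count = endRow - startRow
            .AsParallel()
            .AsOrdered()                        // results come back in source order
            .Select(i => i * 2)
            .ToArray();
    }

    public static void Main()
    {
        int startRow = 10, endRow = 15;

        // Passing endRow as the second argument treats it as a count: 15 items, 10..24.
        int[] tooMany = Enumerable.Range(startRow, endRow).ToArray();
        Console.WriteLine(tooMany.Length); // 15

        // Passing the count gives the intended rows 10..14.
        int[] correct = Enumerable.Range(startRow, endRow - startRow).ToArray();
        Console.WriteLine(correct.Length); // 5

        Console.WriteLine(string.Join(",", OrderedDoubles(startRow, endRow))); // 20,22,24,26,28
    }
}
```

Without `AsOrdered`, PLINQ is free to return the elements in a different order than the source rows.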