I am filling a DataTable in a DataSet by iterating through a collection, and then sorting it:

foreach (var item in items)
{
    DataRow newRow = dataTable.NewRow();
    newRow["column1"] = item.Value1;
    newRow["column2"] = item.Value2;
    dataTable.Rows.Add(newRow);
}
dataTable.DefaultView.Sort = "column1 asc";

Everything works fine.

But when I try to fill the DataTable in parallel:

Parallel.ForEach(items, (item) =>
{
    DataRow newRow = dataTable.NewRow();
    newRow["column1"] = item.Value1;
    newRow["column2"] = item.Value2;
    lock (sync)
    {
        dataTable.Rows.Add(newRow);
    }
});
dataTable.DefaultView.Sort = "column1 asc";

sorting fails with

ArgumentException: An item with the same key has already been added.

Any idea why this happens?

Pavel S.
  • how big is this datatable? *why* is it a datatable? there may be better sorting options available, but very few of them will apply to data-tables – Marc Gravell Mar 26 '18 at 15:29
  • The DataTable has about 30 columns and up to 50,000 rows, depending on the data required, and it grows by another thousand or two rows every month. The program reads the data from numerous files, performs some calculations and exports the results to Excel. A long time ago a DataSet seemed a comfortable way to do it, but now gathering the required data takes too long. – Pavel S. Mar 26 '18 at 19:41
  • I hate to say it, but I think the first thing you need to do is: stop using data table. Once you've done that, there are *lots* of things you can do to make sorting incredibly fast - I've blogged about this extensively recently, for example – Marc Gravell Mar 27 '18 at 08:23

2 Answers

You need to lock the entire section where the DataTable is altered, including NewRow() and the column assignments, because a DataTable is not thread-safe for writing. That, however, defeats the purpose of using Parallel.ForEach in the first place.

lock (sync)
{
    DataRow newRow = dataTable.NewRow();
    newRow["column1"] = item.Value1;
    newRow["column2"] = item.Value2;
    dataTable.Rows.Add(newRow);
}

More info about thread safety for DataTable can be found under "Thread safety for DataTable".
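One way to keep some benefit from parallelism is to do the per-item work in parallel without touching the DataTable, and then add the rows on a single thread. The following is only a sketch under assumptions: the Item class, the column types and the ParallelFillSketch name are hypothetical stand-ins for the question's types, and it helps only if producing the values is the expensive part.

```csharp
using System.Data;
using System.Linq;

public static class ParallelFillSketch
{
    // Hypothetical item type standing in for the question's `item`.
    public sealed class Item
    {
        public int Value1 { get; set; }
        public string Value2 { get; set; }
    }

    public static DataTable Build(Item[] items)
    {
        var dataTable = new DataTable();
        dataTable.Columns.Add("column1", typeof(int));    // assumed column type
        dataTable.Columns.Add("column2", typeof(string)); // assumed column type

        // Parallel phase: only plain arrays are produced; the DataTable is not touched.
        object[][] rows = items
            .AsParallel()
            .Select(item => new object[] { item.Value1, item.Value2 })
            .ToArray();

        // Sequential phase: every write to the DataTable happens on one thread.
        foreach (object[] values in rows)
        {
            dataTable.Rows.Add(values);
        }

        dataTable.DefaultView.Sort = "column1 asc";
        return dataTable;
    }
}
```

Because the table is written by one thread only, no lock is needed, and the row order produced by PLINQ does not matter since the view is sorted afterwards.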

Vidmantas Blazevicius

The problem is that a DataRow, like a DataTable, is thread-safe only for read operations; write operations have to be synchronized. Assigning column values, whether through the indexer or the ItemArray, is a write operation and is not thread-safe. You're likely getting two rows that have the same values for the table's primary key, hence the error.
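As a rough illustration of syncing the writes, the sketch below keeps Parallel.ForEach but moves every DataRow and DataTable write inside the lock, leaving only the per-item computation outside it. The Item class, the Compute method and the column types are hypothetical and not from the question; Compute stands in for whatever work can safely run concurrently.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Threading.Tasks;

public static class LockedWriteSketch
{
    // Hypothetical item type standing in for the question's `item`.
    public sealed class Item
    {
        public int Value1 { get; set; }
        public string Value2 { get; set; }
    }

    // Hypothetical per-item work that touches no shared state.
    private static int Compute(Item item) => item.Value1 * 2;

    public static DataTable Build(IEnumerable<Item> items)
    {
        var dataTable = new DataTable();
        dataTable.Columns.Add("column1", typeof(int));    // assumed column type
        dataTable.Columns.Add("column2", typeof(string)); // assumed column type
        object sync = new object();

        Parallel.ForEach(items, item =>
        {
            // Runs concurrently: reads only the item, never the table.
            int computed = Compute(item);

            // NewRow, the column assignments and Rows.Add are all writes,
            // so they all sit inside the same lock.
            lock (sync)
            {
                DataRow newRow = dataTable.NewRow();
                newRow["column1"] = computed;
                newRow["column2"] = item.Value2;
                dataTable.Rows.Add(newRow);
            }
        });

        dataTable.DefaultView.Sort = "column1 asc";
        return dataTable;
    }
}
```

If Compute is cheap, the threads spend most of their time waiting on the lock and a plain foreach is likely to be just as fast.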

CDove