-1

Why does the following code at random times write null values ​​to the DataRow object?

ConcurrentDictionary<long, DataRow> dRowDict = new ConcurrentDictionary<long, DataRow>();
foreach (long klucz in data.Keys)
{
    DataRow row = table.NewRow();
    dRowDict.TryAdd(klucz, row);
}
Parallel.ForEach(data.Keys, klucz =>
{
    Dane prz = data[klucz] as Dane;
    DataRow dr = dRowDict[klucz];
    foreach (PropertyDescriptor prop in llProp)
    {
        dr[prop.Name] = prop.GetValue(prz) ?? DBNull.Value;
    }
});
foreach (DataRow dRow in dRowDict.Values)
{
    table.Rows.Add(dRow);
    if (dRow["FILED"] == DBNull.Value)
    {
        MessageBox.Show("ERROR...NULL VALUE"); //why this happen?
    }
}

A plain for loop does not cause this problem - why, where is my bug?

My question is - can I modify the properties of any object inside a parallel loop?

Parallel.ForEach(myData.AsEnumerable()..., value =>
{
   Object x = new Object(); // or x = dict[key];
   x.A = ...;
   x.B = ...; //Can I do this and is it safe?
}
Red
  • 3,030
  • 3
  • 22
  • 39
pl.lepko
  • 1
  • 5
  • 2
    DataTable isn't thread-safe. The problem isn't `Parallel.ForEach`. Besides, `Parallel.ForEach` is meant for data parallelism, ie crunching lots of data. It does that by partitioning the input and having a dedicated worker process a partition. What your code does is nothing like it. It just modifies some DataRows. What are you trying to do, why would you need parallel processing? – Panagiotis Kanavos Sep 16 '20 at 11:59
  • 2
    [DataRow Class - Thread Safety](https://learn.microsoft.com/en-us/dotnet/api/system.data.datarow?view=netcore-3.1#thread-safety) – Damien_The_Unbeliever Sep 16 '20 at 11:59
  • `Parallel.Foreach` treats the list as unordered, so there is no any order it operates. – Roman Ryzhiy Sep 16 '20 at 12:00
  • 2
    @RomanRyzhiy that's not entirely correct - the data in a partition is ordered, just not intentionally. – Panagiotis Kanavos Sep 16 '20 at 12:02
  • 1
    What is this code trying to solve? Why create the rows in advance, instead of filling them as they get created? – Panagiotis Kanavos Sep 16 '20 at 12:04
  • How long does the code take to run without using `Parallel`? – mjwills Sep 16 '20 at 12:14
  • llProp - this list was a problem. – pl.lepko Sep 16 '20 at 14:02

1 Answers1

0

Modifying the properties of an object inside a parallel loop is only allowed if the object's class is thread-safe. In other word if the class was specifically designed to allow access to its members from multiple threads concurrently. Most classes are not designed this way, and instead they assume that will be accessed by one thread at a time.

From my experience thread-"unsafety" is considered the default, so if you are searching in the documentation of a specific class for a Thread Safety section, and you can't find it, you should assume that the class is not thread safe.

Accessing a non-thread-safe object from multiple threads concurrently results to "undefined behavior". This is a nice and formal synonym of "all kind of nastiness". It includes but it's not limited to random exceptions, corrupted state, lost updates, torn values, compromised security etc. These phenomena are non-deterministic in nature and often non-reproducible. So it is a strongly inadvisable practice.

The ADO.NET classes are thread-safe only for multithreaded read operations. Meaning that it is safe for multiple threads to read their properties concurrently. But when a thread is modifying an ADO.NET object, no other thread should access this object, either for reading or modifying it.

Theodor Zoulias
  • 34,835
  • 7
  • 69
  • 104
  • I thought that the object that is created inside the Parallel.ForEach loop can only be modified by one thread. – pl.lepko Sep 17 '20 at 05:30
  • How many threads can modify a local object created inside a Paralllel.ForEach loop? – pl.lepko Sep 17 '20 at 05:32
  • Is thread safety needed for local variables that are formed inside a parallel loop? – pl.lepko Sep 17 '20 at 05:43
  • @pl.lepko if an object is created inside the lambda of the `Parallel.ForEach`, and it is not assigned to a variable in the outer scope (or stored in an external collection), then it will be accessed by one thread only, so it doesn't need to be thread-safe. – Theodor Zoulias Sep 17 '20 at 06:03
  • @pl.lepko in your case the problematic object is probably the `DataTable`. Although the rows created by the `table.NewRow()` are not attached to it according to the public API, they are probably still connected to it internally. So by updating the rows in parallel, the internal state of the `DataTable` is probably modified by multiple threads, hence the problem. – Theodor Zoulias Sep 17 '20 at 06:13
  • DataRow dr = dRowDict[klucz]; - this DataRow variable is created locally inside the parallel loop. Why do some people point out to me that the DataRow type is not thread safe. – pl.lepko Sep 17 '20 at 06:17
  • @pl.lepko nope, the line `DataRow dr = dRowDict[klucz];` does not create a new `DataRow`. It retrieves an already existing object, stored in the `dRowDict` dictionary. – Theodor Zoulias Sep 17 '20 at 06:19
  • DataRow dr = dRowDict[klucz]; - dRowDict is ConcurrentDictionary, and "DataRow dr" is local variable. – pl.lepko Sep 17 '20 at 06:24
  • thank you for explaining this issue - thanks to your comments, I felt safer. – pl.lepko Sep 17 '20 at 06:26
  • @pl.lepko you can't get away with thread-safety by creating local variables of already existing objects. It doesn't work this way unfortunately. If it was that easy, then multithreading would be a piece of cake, and not the tough nut it actually is. – Theodor Zoulias Sep 17 '20 at 06:29