
I have a self-referencing table like the one below.

key   parent   Description
A     NULL
B     A
C     B
D     C

Initially I have a single key value. Using this key, I need to find all of its child nodes recursively and update their description field.

This table contains around 27 thousand records with multiple levels of hierarchy.

I'm trying to update the description with the code below.

Private Sub ChangeDescriptionChildRows(ByRef row As DataRow,
                                       ByRef table As DataTable)
    UpdateDescription(row) 'Already having Root node

    'String keys must be quoted inside the filter expression
    For Each childRow As DataRow In table.Select("parent='" & row("key").ToString() & "'")
        ChangeDescriptionChildRows(childRow, table) 'Recursion
    Next
End Sub

Private Sub UpdateDescription(ByRef row As System.Data.DataRow)
    Dim description As String = ""
    If row("key") = "A" Then
        description = "This is Root Node"
    ElseIf row("Key") = "B" Then
        description = "Something"
    End If
    row("description") = description
End Sub

It's taking a lot of time to update the table.

Is there a better way to update the table, for example using PLINQ or Parallel.ForEach?

Yashas
  • OT, that code wouldn't compile because you've declared `table` as type `DataSet` rather than `DataTable`. There's also no reason to declare those parameters `ByRef`. You could also do away with the `table` parameter altogether, given that a `DataRow` has a `Table` property. Getting that property value every time might impact performance though. Not sure whether it would be significant or not. You could test that. – jmcilhinney Nov 15 '22 at 11:28
  • `Parallel.ForEach` should work in theory but, if you were to do that recursively with a large table, I'm not sure that you wouldn't saturate the thread pool. You could certainly try it and see. Converting a `For Each` loop to a `Parallel.ForEach` call is simple enough. – jmcilhinney Nov 15 '22 at 11:31

1 Answer


Your code is not parallelizable, because all it does is read and update a DataTable, and a DataTable is not thread-safe for concurrent read and write operations. It is only safe for concurrent reads. So neither PLINQ nor Parallel.ForEach is a valid solution for this problem. My suggestion is to optimize the single-threaded execution of your code instead, for example by using an indexed DataView, or by creating a lookup with the ToLookup LINQ operator. There is much greater potential for speeding up your code algorithmically than by throwing more CPU cores at the problem. Parallelization is hard, and might get you a 3x or 4x boost if you are lucky, while algorithmic optimization of an unoptimized piece of code can easily get you a 100x or 1000x boost.
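As a minimal sketch of the ToLookup idea, the code below builds a parent-to-children index once and then walks the subtree with an explicit stack, so no per-node filtering of the table is needed. The names `ChangeDescriptionOfSubtree` and `childrenByParent` are hypothetical, `UpdateDescription` is only a stand-in for the method in the question, and the sketch assumes that `key` and `parent` are string columns and that the hierarchy contains no cycles.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Data
Imports System.Linq

Module HierarchyUpdate

    ' Hypothetical helper: updates startRow and all of its descendants.
    Sub ChangeDescriptionOfSubtree(startRow As DataRow, table As DataTable)
        ' Build the parent -> children index in a single pass over the table.
        ' Rows with a NULL parent are skipped because they are nobody's child.
        Dim childrenByParent As ILookup(Of String, DataRow) =
            table.Rows.Cast(Of DataRow)().
                Where(Function(r) Not r.IsNull("parent")).
                ToLookup(Function(r) CStr(r("parent")))

        ' Walk the subtree with an explicit stack instead of recursion.
        ' Assumption: the data contains no cycles, otherwise this never ends.
        Dim pending As New Stack(Of DataRow)()
        pending.Push(startRow)

        While pending.Count > 0
            Dim current As DataRow = pending.Pop()
            UpdateDescription(current)

            ' A lookup returns an empty sequence for a key it does not contain,
            ' so leaf nodes need no special handling.
            For Each child As DataRow In childrenByParent(CStr(current("key")))
                pending.Push(child)
            Next
        End While
    End Sub

    ' Stand-in for the UpdateDescription method from the question.
    Sub UpdateDescription(row As DataRow)
        Select Case CStr(row("key"))
            Case "A"
                row("description") = "This is Root Node"
            Case "B"
                row("description") = "Something"
            Case Else
                row("description") = ""
        End Select
    End Sub

    Sub Main()
        ' Sample data mirroring the table in the question.
        Dim table As New DataTable()
        table.Columns.Add("key", GetType(String))
        table.Columns.Add("parent", GetType(String))
        table.Columns.Add("description", GetType(String))
        table.Rows.Add("A", DBNull.Value, DBNull.Value)
        table.Rows.Add("B", "A", DBNull.Value)
        table.Rows.Add("C", "B", DBNull.Value)
        table.Rows.Add("D", "C", DBNull.Value)

        ChangeDescriptionOfSubtree(table.Rows(0), table)

        For Each r As DataRow In table.Rows
            Console.WriteLine("{0}: {1}", r("key"), r("description"))
        Next
    End Sub

End Module
```

The index is built only once, so each node afterwards costs one lookup plus a visit to its direct children. If the table's rows are added or removed between calls, the lookup would need to be rebuilt.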

Theodor Zoulias