I'm trying to get a distinct list of node names from an XML file. I've been successful with a recursive CTE similar to https://stackoverflow.com/a/2274091/1735928, but only with my files below about 1m characters. Above that, the query never seems to return. Some of my files are in the neighborhood of 100m characters.
I've since switched to trying PowerShell. For this samle XML:
<?xml version="1.0" encoding="UTF_8"?>
<root>
<childA>
<descendant1>
<descendant1_1>
<descendant1_1_1>1111111111</descendant1_1_1>
</descendant1_1>
</descendant1>
<descendant2>0</descendant2>
</childA>
<childA>
<descendant1>
<descendant1_1>
<descendant1_1_1>2222222222</descendant1_1_1>
</descendant1_1>
</descendant1>
<descendant2>2</descendant2>
</childA>
<childB>
<descendant1>
<descendant1_1>
<descendant1_1_1>2222222222</descendant1_1_1>
</descendant1_1>
</descendant1>
<descendant3>0</descendant3>
</childB>
<childC>
<descendant4>0</descendant4>
</childC>
<childC>
<descendant4>6</descendant4>
</childC>
</root>
I've gotten as far as:
$xml.childnodes[1].childnodes | select -uniq | foreach { $xml.childnodes[1].($_.name).childnodes.name | select -uniq }
which gives me:
descendant1
descendant2
descendant1
descendant3
descendant4
but that doesn't include the further descendants. Ultimately, I'm trying to get a table back to SQL that looks like this:
root | childA | descendant1
root | childA | descendant1_1
root | childA | descendant1_1_1
root | childA | descendant2
root | childB | descendant1
root | childB | descendant1_1
root | childB | descendant1_1_1
root | childB | descendant3
root | childC | descendant4