3

I'm trying to build a script to convert a XML file to CSV. I started following this guide https://learn.microsoft.com/en-us/answers/questions/542481/parse-xml-to-csv-help.html but I'm stuck with the following situation now:

The first part of my code works fine and it's writing to the CSV file the first column, but as it gets through the following iterations, it doesn't write the values to the file even though it outputs them to the console. This is my code:

[xml]$x = Get-Content C:\Users\Desktop\Policy.xml

$x.Profiles |
    ForEach-Object{
            $_.Profile |
                ForEach-Object{
                    $_.Ruleset |
                        ForEach-Object{
                            Write-Host "Policy ID = " $_.ID
                            [PSCustomObject]@{
                                policyName = $_.ID
                            }
                            $_.Conditions |
                                ForEach-Object{
                                    $_.condition |
                                        ForEach-Object{
                                            if($_.name -eq "Resource") {
                                                Write-Host "Resource = " $_.'#text'
                                                [PSCustomObject]@{
                                                    resource = $_.'#text'
                                                }
                                            }
                                            if($_.name -eq "Action") {
                                                Write-Host "Action = " $_.'#text'
                                                [PSCustomObject]@{
                                                    action = $_.'#text'
                                                }
                                            }
                                            if($_.name -eq "actor") {
                                                Write-Host "actor = " $_.'#text'
                                                [PSCustomObject]@{
                                                    actor = $_.'#text'
                                                }
                                            }
                                            if($_.name -eq "child") {
                                                Write-Host "child = " $_.'#text'
                                                [PSCustomObject]@{
                                                    child = $_.'#text'
                                                }
                                            }
                                            if($_.name -eq "number") {
                                                Write-Host "number = " $_.'#text'
                                                [PSCustomObject]@{
                                                    number = $_.'#text'
                                                }
                                            }
                                        }
                                }
                        }
                }
    } | Export-Csv C:\Users\Desktop\policy.csv -NoTypeInformation

So, until the first [PSCustomObject] (line 10) it works fine and the policyName column is written to the CSV value with its corresponding value. But in the second [PSCustomObject] (line 19) where it should write the Resource/Action/actor/child/number, it does not write to the file anymore.

What's the right way to add those values to the already existing [PSCustomObject]?

For reference:

XML snippet

<?xml version="1.0" encoding="UTF-8"?>
<Profiles xmlns:pcrf="nothing interesting here">
    <Profile Desc="some description" ID="someid" Prio="0">
        <Ruleset Class="class1" Desc="" ID="policyid1" Prio="10" active="true">
            <Conditions>
                <condition name="Resource" type="matchExact">resource1</condition>
                <condition name="Action" type="matchExact">action1</condition>
                <condition name="actor" type="matchExact">actor1</condition>
            </Conditions>
        </Ruleset>
        <Ruleset Class="classX" Desc="" ID="policyidX" Prio="10" active="true">
            <Conditions>
                <condition name="Resource" type="matchExact">resource4</condition>
                <condition name="Action" type="matchExact">action4</condition>
                <condition name="child" type="matchExact">child1,child2</condition>
            </Conditions>
        </Ruleset>
    </Profile>
    <Profile Desc="some description" ID="someid" Prio="0">
        <Ruleset Class="classY" Desc="" ID="policyidY" Prio="10" active="true">
            <Conditions>
                <condition name="Resource" type="matchExact">resource99</condition>
                <condition name="Action" type="matchExact">action00</condition>
                <condition name="child" type="matchExact">child5</condition>
                <condition name="number" type="matchExact">number1</condition>
            </Conditions>
        </Ruleset>
    </Profile>
</Profiles>

I'm getting this CSV as result:

"policyName"
"policyid1"



This is the powershell output:

PS C:\Users\Desktop> .\xmltocsv.ps1
Policy ID =  policyid1
Resource =  resource1
Action =  action1
actor =  actor1
Resource =  resource4
Action =  action4
child =  child1,child2
Resource =  resource99
Action =  action00
child = child5
number =  number1

This is what I expect to get as a CSV file:

"policyName","Resource","Action","actor","child","number"
"policyid1","resource1","action1","actor1","",""
"policyidX","resource4","action4","","child1,child2",""
"policyidY","resource99","action0","","child5","number1"
Adami
  • 335
  • 5
  • 15
  • hi, interesting might be of interest https://community.spiceworks.com/topic/1567146-adding-rows-to-a-psobject – jspcal Nov 10 '22 at 20:03

3 Answers3

2

The primary problem with your code is that you need to create one output object per <condition> element, with property values gleaned from the parent element and its child elements, whereas you're creating a separate object ([pscustomobject]) for what should be a property of each output object; here's a greatly streamlined solution:

$x.Profiles.Profile.Ruleset.Conditions |
  ForEach-Object {
    $outputObj = [pscustomobject] @{
      # Get the policy ID from the parent node.
      policyId = $_.ParentNode.ID
      # The following properties are filled in below.
      Resource = ''
      Action = ''
      actor = ''
      child = ''
      number = ''
    }
    # Loop over all child elements and fill in the property
    # that corresponds to the "name" attribute with the element's inner text.
    # Note that the "name" *attribute* actually shadows
    # the element's type-native .Name *property*.
    foreach ($el in $_.ChildNodes) {
      $outputObj.($el.name) = $el.InnerText
    }
    # Output the resulting object
    $outputObj
  } | 
  ConvertTo-Csv  # To export to a *file, use Export-Csv

This yields the desired output as shown in the question.

Note:

  • This creates one [pscustomobject] per <Conditions> element, with the property values gleaned from the parent element as well as a fixed set of properties, which are filled if present among the child elements based on a matching name attribute value, using a matching child element's inner text as the property value.

  • Note that property access .Profiles.Profile.Ruleset.Conditions implicitly loops over multiple <Profile> and <Ruleset> elements and returns them all, courtesy of a feature called member-access enumeration.

mklement0
  • 382,024
  • 64
  • 607
  • 775
2

Instead of creating one [PSCustomObject] for each column, you have to create a single [PSCustomObject] for the entire row. You can do so by first creating an ordered hashtable [ordered] @{} and convert it to [PSCustomObject] when you have finished assigning all column values.

To differentiate from other helpful answers, this one uses Select-Xml for streamlined code. Select-Xml outputs SelectXmlInfo objects, whose Node member contains the DOM elements, similar to what you get from reading into [xml].

Select-Xml -Path Policy.xml -XPath '/Profiles/Profile/Ruleset/Conditions' | ForEach-Object {
    
    # Create initial ordered hashtable
    $row = [ordered] @{
        policyName = $_.Node.ParentNode.ID
    }

    foreach( $name in 'Resource', 'Action', 'actor', 'child', 'number' ) {
        # Add a property to the hashtable.
        # The value is a SelectXmlInfo object, which Export-Csv conveniently converts to string.
        $row.$name = $_.Node | Select-Xml -XPath "condition[@name='$name']/text()"
    }

    # Convert to PSCustomObject and output
    [PSCustomObject] $row
       
} | Export-Csv policy.csv -NoTypeInformation

This outputs the following CSV for your example XML:

"policyName","Resource","Action","actor","child","number"
"policyid1","resource1","action1","actor1","",""
"policyidX","resource4","action4","","child1,child2",""  
"policyidY","resource99","action00","","child5","number1"
zett42
  • 25,437
  • 3
  • 35
  • 72
  • Same as the other answer. That only didn't work because of a mistake I made. The XML have different ```Conditions``` blocks, one block may have ```Resource```, ```Action``` and ```actor``` while other block may have ```Resource```, ```Action``` and ```child```. I've updated the expected output in the original post. The solution proposed is creating a header with only the information from the first ```Conditions``` block and suppressing different values for the next blocks. – Adami Nov 10 '22 at 21:30
  • 1
    @Adami I have given it another try with completely new code, please have a look. – zett42 Nov 10 '22 at 21:57
2

Put into table and then save table to CSV

using assembly System 
using assembly System.Xml.Linq 

$Filename = "c:\temp\test.xml"
$xDoc = [System.Xml.Linq.XDocument]::Load($Filename)

$ruleSets =  $xDoc.Descendants("Ruleset").Foreach([System.Xml.Linq.XElement])
$table = [System.Collections.ArrayList]::new()
foreach($ruleSet in $ruleSets)
{
   $newRow = New-Object -TypeName psobject
   $id = $ruleSet.Attribute("ID").Value
   $newRow | Add-Member -NotePropertyName policyName -NotePropertyValue $id
   $conditions = $ruleSet.Descendants("condition").Foreach([System.Xml.Linq.XElement])
   foreach($condition in $conditions)
   {
      $name = $condition.Attribute("name").Value
      $value = $condition.Value 
      $newRow | Add-Member -NotePropertyName $name -NotePropertyValue $value
   }
   $table.Add($newRow) | Out-Null
}
$table | format-Table
jdweng
  • 33,250
  • 2
  • 15
  • 20
  • As an aside: `using assembly System` is never needed in PowerShell - all types in the `System` namespaces are accessible by default. Conversely, `using assembly` only _loads_ the specified assembly, without also allowing name-only access to the types in its namespace; for the latter, you (also) need `using namespace`; that is, adding `using namespace System.Xml.Linq` would allow you to simplify `[System.Xml.Linq.XDocument]` and `[System.Xml.Linq.XElement]` to `[XDocument]` and `[XElement]`, respectively. – mklement0 Nov 11 '22 at 03:15
  • 1
    Naming a variable that contains a `[System.Collections.ArrayList]` instance `$table` could lead to confusion. In general, there's rarely a need to _iteratively_ build an array-like (list) structure explicitly, given that you can simply use entire statements _as expressions_, with PowerShell collecting all outputs _automatically_ in an array; e.g. `$arr = foreach ($i in 1..2) { 1 + $i }` – mklement0 Nov 11 '22 at 03:19
  • @mklement0 : What happens if you do not know the size of the array. Many apps you read a file and parse and do not know the size until parsing is completed. – jdweng Nov 11 '22 at 06:11
  • PowerShell takes care of that behind the scenes. I think it uses an `ArrayList` instance behind the scenes to efficiently grow the collection of output objects as they're being produced, and then converts it to an `[object[]]` array. – mklement0 Nov 11 '22 at 11:01
  • You can't declare a fixed object if you do not know the size. My code uses a list which can grow, you code cannot grow. – jdweng Nov 11 '22 at 11:49
  • 1
    Remember the example: `$arr = foreach ($i in 1..2) { 1 + $i }` Nothing needs to grow: PowerShell has done the growing and collecting for you, and you simply get the result.The same technique can be used to simplify (and speed up) your code.`$table = foreach (...) { <# ... #> $newRow <# <- output that is collected #> }` – mklement0 Nov 11 '22 at 12:45
  • You do not know the size when you declare the array before the loop. The size depends on the data inside the loop that you read. Your method cannot work. – jdweng Nov 11 '22 at 14:22
  • There is no array to declare. Please try the example, and if you find that it doesn't work, we can discuss further. You can also have a look at https://stackoverflow.com/a/59278604/45375 – mklement0 Nov 11 '22 at 14:26
  • My code works. Been using it for a long time. – jdweng Nov 11 '22 at 15:00
  • My point isn't about whether the code _works_ (it does, and I never claimed otherwise). I was trying to show you - and potentially future readers - a simpler (and more efficient) alternative that is more PowerShell-idiomatic. – mklement0 Nov 11 '22 at 15:02