
I am having trouble reading a CDATA section from an XML file when the path to the element is stored in a variable. (Note: this is based on "How to read CDATA in XML file with PowerShell?")

In the $xmlsource file:

<list>
  <topic>
    <SubTopic>
        <topicTitle>Test</topicTitle>
        <HtmlHead><![CDATA[<br>randomHTMLhere</br>]]></HtmlHead>
    </SubTopic>
    <SubTopic2>
        <topicTitle>Test2</topicTitle>
        <HtmlHead><![CDATA[<br>randomHTMLhere2</br>]]></HtmlHead>
    </SubTopic2>
  </topic>
</list>

In PowerShell

[String]$xmlsource = "C:\PowerShell_scripts\xmlsource.xml"
[xml]$XmlContent = get-content $xmlsource    

#These methods work but the Paths are HARD-CODED
Write-host "`r`nUsing HARD-CODED Paths"
$XmlContent.list.topic.SubTopic.HtmlHead.'#cdata-section'
$XmlContent.list.topic.SubTopic.HtmlHead.InnerText
$XmlContent.list.topic.SubTopic2.HtmlHead.InnerText

#But if the path is given in a variable, I get nothing.
Write-host "`r`nUsing `$pathToElement (returns blank line)"
[String]$pathToElement = 'list.topic.SubTopic.HtmlHead'
$XmlContent.$pathToElement.InnerText        #This returns a blank line


#Insult to injury
#This kinda works but to parse the path to fit in the 'GetElementsByTagName' method would be clunky, inflexible and would still return the CDATA from *both* 'HtmlHead' elements.
Write-host "`r`nwith GetElementsByTagName(`$var)"
[String]$ElementName= 'HtmlHead'
$XmlContent.GetElementsByTagName($ElementName).'#cdata-section'
Write-host "`r`nwith GetElementsByTagName()"
$XmlContent.GetElementsByTagName('HtmlHead').'#cdata-section'

Does $pathToElement need to be cast as a special datatype?

NOTE: XPath is a query language for XML, so I corrected the question above.

Mr. Annoyed

1 Answer

$XmlContent.list.topic.SubTopic.HtmlHead 

is looking up a property called list, then from that return value it looks up 'topic', then from that return value ... etc.

$XmlContent.$pathToElement

is trying to look up one single property named list.topic.SubTopic.HtmlHead and not finding it.
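If the dotted form really has to live in a variable, one way to get the chained behaviour is to split the string and look up one property per segment. This is only a minimal sketch, assuming the sample XML from the question is loaded from a here-string; `$node`, `$name` and `$result` are names made up for illustration.

```powershell
# Sample document copied from the question.
[xml]$XmlContent = @'
<list>
  <topic>
    <SubTopic>
        <topicTitle>Test</topicTitle>
        <HtmlHead><![CDATA[<br>randomHTMLhere</br>]]></HtmlHead>
    </SubTopic>
    <SubTopic2>
        <topicTitle>Test2</topicTitle>
        <HtmlHead><![CDATA[<br>randomHTMLhere2</br>]]></HtmlHead>
    </SubTopic2>
  </topic>
</list>
'@

[string]$pathToElement = 'list.topic.SubTopic.HtmlHead'

# Start at the document and look up ONE property per path segment,
# which is what the hard-coded dotted expression does implicitly.
$node = $XmlContent
foreach ($name in $pathToElement.Split('.')) {
    $node = $node.$name
}

$result = $node.InnerText
$result
```

This keeps the property-style path, but it only works while every segment resolves to a single element; an XPath query is the more general tool.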

I don't think 'list.topic.SubTopic.HtmlHead' is the right form for an XPath expression. You could do:

$node = Select-Xml -xml $XmlContent -XPath '/list/topic/SubTopic/HtmlHead' | select -expand node
$node.InnerText

Edit: and do

Select-Xml -xml $XmlContent -XPath '/list/topic//HtmlHead'

to get both HtmlHeads for SubTopic and SubTopic2.
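To tie this back to the variable in the question: a dotted string can be turned into an XPath by swapping the dots for slashes. The following is a rough sketch under the assumption that the element names contain no dots and that the sample XML from the question is used; `$xpath`, `$single` and `$all` are illustrative names, not anything from the original code.

```powershell
# Sample document copied from the question.
[xml]$XmlContent = @'
<list>
  <topic>
    <SubTopic>
        <topicTitle>Test</topicTitle>
        <HtmlHead><![CDATA[<br>randomHTMLhere</br>]]></HtmlHead>
    </SubTopic>
    <SubTopic2>
        <topicTitle>Test2</topicTitle>
        <HtmlHead><![CDATA[<br>randomHTMLhere2</br>]]></HtmlHead>
    </SubTopic2>
  </topic>
</list>
'@

[string]$pathToElement = 'list.topic.SubTopic.HtmlHead'

# Convert the dotted form into an absolute XPath: /list/topic/SubTopic/HtmlHead
$xpath = '/' + ($pathToElement -replace '\.', '/')

# SelectSingleNode returns the first node that matches the expression.
$single = $XmlContent.SelectSingleNode($xpath).InnerText

# The // step matches HtmlHead under both SubTopic and SubTopic2.
$all = @(Select-Xml -Xml $XmlContent -XPath '/list/topic//HtmlHead' |
    ForEach-Object { $_.Node.InnerText })

$single
$all
```

With the sample file, `$single` holds the CDATA text of the first HtmlHead and `$all` holds the text of both.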


Auto-generated PS help links from my codeblock (if available):

  • Select-Xml (in module Microsoft.PowerShell.Utility)
  • select is an alias for Select-Object (in module Microsoft.PowerShell.Utility)
TessellatingHeckler
  • Thank you. Looks like I did not use the right XML terminology. I'm trying to use a variable to address the CDATA in the XML. Since "XPath" is a query language, is using XPath the only way to do this even when I know EXACTLY where the CDATA is in the XML? – Mr. Annoyed Nov 02 '16 at 21:53
  • @Mr.Annoyed That's a bit like saying "*do I have to open all these nested folders to get to my file?*" and I say "*No, just write its path -> c:\users\annoyed\desktop\thing.txt*" and you say "*but I don't want to write a path to it, I know where it is, can Explorer press enter on the folders for me?*". XML is a nested tree, like a filesystem and XPath is the path you want to walk. The `property.property` syntax is a convenience syntax in some programming languages, not anything from XML. `$XmlContent.SelectNodes('/xpath/to/node')` or `$XmlContent.GetElementsByTagName('HtmlHead')` are related. – TessellatingHeckler Nov 03 '16 at 00:33