6

I need to select the text in a node, but not the text of any child nodes. The XML looks like this:

<a>
  apples  
  <b><c/></b>
  pears
</a>

If I select a/text(), all I get is "apples". How would I retrieve "apples pears" while omitting <b><c/></b>?

Andre Lombaard
Dave

3 Answers

4

The path a/text() selects all text child nodes of the a element, so the path itself is correct. However, if you use it in XSLT 1.0 with <xsl:value-of select="a/text()"/>, only the string value of the first selected node is output. In XPath 2.0 and XQuery 1.0, string-join(a/text()/normalize-space(), ' ') yields the string "apples pears", so maybe that helps with your problem. If not, please explain the context in which you use XPath or XQuery such that a/text() returns only the (string?) value of the first selected node.
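
As a cross-check outside XSLT, here is a minimal Python sketch (standard library only) of the same idea: take the direct text-node children of a, normalize each one, and join them with a space. The helper names child_text_nodes and joined_child_text are made up for illustration, and str.split() is only an approximation of normalize-space().

```python
from xml.dom import minidom

XML = """<a>
  apples  
  <b><c/></b>
  pears
</a>"""


def child_text_nodes(element):
    """Return the direct text-node children of element, like the XPath step text()."""
    return [n for n in element.childNodes if n.nodeType == n.TEXT_NODE]


def joined_child_text(element, sep=" "):
    """Roughly mimic string-join(text()/normalize-space(), ' ') for one element."""
    # str.split() with no arguments trims and collapses runs of whitespace.
    parts = [" ".join(n.data.split()) for n in child_text_nodes(element)]
    return sep.join(parts)


root = minidom.parseString(XML).documentElement
print(len(child_text_nodes(root)))  # number of text nodes directly under <a>
print(joined_child_text(root))
```

The node count shows that both text nodes are selected; getting only "apples" is a consequence of how the result is converted to a string, not of the path.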

Martin Honnen
  • I'm using it in a PHP script and my version of XSLT is definitely 1.0. – Dave Mar 03 '11 at 15:50
  • +1 Correct explanation. XPath `string()` function also would cast the node set to a singleton. –  Mar 03 '11 at 16:32
  • Dave, even with XPath 1.0 the path expression `a/text()` yields a node-set with two text nodes. If you only get one result then you might want to show us the PHP code you have that executes the XPath query so that we can try to identify why you get only one result. If you use http://www.php.net/manual/en/domxpath.query.php then you should get two nodes in the returned node list. – Martin Honnen Mar 03 '11 at 16:41
0

To retrieve all the descendants, I advise using the // notation. This returns all text descendants below an element. Below is an XQuery snippet that gets all the descendant text nodes and formats them as Martin indicated.

xquery version "1.0";
let $a := 
<a>
  apples  
  <b><c/></b>
  pears
</a>
return normalize-space(string-join($a//text(), " "))

Or, if you have your own formatting requirements, you could start by looping through each text node, as in the following XQuery.

xquery version "1.0";
let $a := 
<a>
  apples  
  <b><c/></b>
  pears
</a>
for $txt in $a//text()
return $txt
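
With the question's document, child text and descendant text happen to coincide because b and c contain no text. The Python sketch below uses a hypothetical variant of the document (the text "kiwis" inside c is an assumption added for illustration) to show where the two selections would diverge. The helper names are also illustrative.

```python
import xml.etree.ElementTree as ET

# Hypothetical variant of the question's document: <c> now contains text.
XML = "<a>\n  apples\n  <b><c>kiwis</c></b>\n  pears\n</a>"


def descendant_text(element):
    """All text below element in document order (roughly element//text())."""
    return list(element.itertext())


def child_text(element):
    """Only text directly inside element (roughly element/text())."""
    # ElementTree keeps an element's leading text in .text and the text that
    # follows each child element in that child's .tail.
    pieces = [element.text] + [child.tail for child in element]
    return [p for p in pieces if p]


def squeeze(pieces):
    """Concatenate the pieces, then trim and collapse whitespace."""
    return " ".join("".join(pieces).split())


root = ET.fromstring(XML)
print(squeeze(descendant_text(root)))  # includes the text inside <c>
print(squeeze(child_text(root)))       # only the text directly under <a>
```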
Scott
    You wrote _"To retrieve all the descendants [...]"_. That was not asked. –  Mar 03 '11 at 16:28
0

If I select a/text(), all I get is "apples". How would I retrieve "apples pears"

Just use:

normalize-space(/)

Explanation:

The string value of the root node (/) of the document is the concatenation of all its text-node descendants. Because that string contains leading, trailing, and repeated white space, normalize-space() is used to trim it and collapse each run of white space into a single space.

Here is a small demonstration of how this solution works and what it produces:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output method="text"/>

 <xsl:template match="/">
  '<xsl:value-of select="normalize-space()"/>'
 </xsl:template>
</xsl:stylesheet>

When this transformation is applied to the provided XML document:

<a>
 apples
    <b><c/></b>
 pears
</a>

the wanted, correct result is produced:

  'apples pears'
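
For readers who want to see the two steps separately, here is a rough Python analogue (standard library only): first build the string value by concatenating every descendant text node, then normalize the white space. The functions string_value and normalize_space are illustrative stand-ins for the XPath functions, not a full implementation of them.

```python
import re
import xml.etree.ElementTree as ET

XML = "<a>\n apples\n    <b><c/></b>\n pears\n</a>"


def string_value(element):
    """Concatenate every descendant text node, like the XPath string value."""
    return "".join(element.itertext())


def normalize_space(value):
    """Approximate XPath normalize-space(): collapse whitespace runs, then trim."""
    # XPath treats only space, tab, carriage return and line feed as whitespace.
    return re.sub(r"[ \t\r\n]+", " ", value).strip(" ")


root = ET.fromstring(XML)
raw = string_value(root)
print(repr(raw))                   # string value before normalization
print(repr(normalize_space(raw)))  # string value after normalization
```

Note that this approach returns all descendant text, so it matches the wanted result only because b and c are empty in this document.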
Dimitre Novatchev