I have the following XML document:

<?xml version="1.0" encoding="UTF-8"?>
<root>
<data>
<child1>&#160;Well, some  spaces and nbsps  &#160;</child1>
<child2>&#160; some more                  &#160;  or whatever          </child2>
<child3>         a nice text</child3>
<child4>how                              to get rid of all the nasty spaces&#160;          ?                                  </child4>
</data>
</root>

I have to remove all non-breaking spaces, concatenate the text, and normalize it.

My XPath query (the concatenation and normalization work fine; I inserted the replacement with 'x' only for testing):

normalize-space(replace(string-join(//data/*,' '),'&#160;','x'))

My problem: the replace never matches the "&#160;" whitespace, so nothing gets replaced.

Looking forward to your answers,

user1800825

1 Answer

The string value of an element node is defined to be the concatenation of all its descendant text nodes, so in an XSLT transformation

normalize-space(translate(//data, '&#160;', ''))

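would do what you require, assuming your document contains only one data element. If there is more than one, XPath 1.0 converts the node-set to a string using only the first data element in document order, so the expression extracts and normalizes only that element's text (XPath 2.0 without backwards-compatibility mode reports a type error instead).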
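To illustrate the first-element behaviour, here is a minimal Java sketch, assuming the JDK's built-in javax.xml.xpath (an XPath 1.0 engine). The class and method names are hypothetical, and \u00A0 is Java's string escape for the non-breaking space character. It contrasts the single expression with one possible workaround: selecting every data element and cleaning each one separately.

```java
import java.io.StringReader;
import java.util.StringJoiner;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper; names are illustrative and not part of the answer.
class DataTextJoiner {
    private static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // The answer's expression: XPath 1.0 uses only the first data element.
    static String cleanFirstData(String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(
                    "normalize-space(translate(//data, '\u00A0', ''))", parse(xml));
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    // Workaround sketch: clean each data element on its own, then join.
    static String cleanAllData(String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList dataNodes = (NodeList) xpath.evaluate(
                    "//data", parse(xml), XPathConstants.NODESET);
            StringJoiner joined = new StringJoiner(" ");
            for (int i = 0; i < dataNodes.getLength(); i++) {
                String text = xpath.evaluate(
                        "normalize-space(translate(., '\u00A0', ''))", dataNodes.item(i));
                if (!text.isEmpty()) {
                    joined.add(text);
                }
            }
            return joined.toString();
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<root><data><a>&#160;one </a></data>"
                + "<data><b> two&#160;</b></data></root>";
        System.out.println(cleanFirstData(xml)); // prints "one"
        System.out.println(cleanAllData(xml));   // prints "one two"
    }
}
```

With a single data element, as in the question, both methods return the same string.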

If you are using the XPath expression somewhere other than in an XSLT file then you will need to represent the non-break space character differently. The above example works because the XML parser converts the &#160; character reference into a non-break space character when reading the .xsl file, so the XPath expression parser sees the character, not the reference. In Java, for example, I could say

XPathFactory.newInstance().newXPath().evaluate("normalize-space(translate(//data, '\u00A0', ''))", contextNode)

because \u00A0 is the way to represent the nbsp character in a Java string literal. If you are using another language you need to find the right way to represent this character in that language, or if you're using XPath 2.0 you could use the codepoints-to-string function:

normalize-space(translate(//data, codepoints-to-string(160), ''))
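The difference between the character reference and the character itself can be checked with a small, self-contained Java sketch. It assumes the JDK's built-in XPath engine, which implements XPath 1.0 only, so codepoints-to-string is not available there and would need an XPath 2.0 processor. The class name, method name, and the two-child excerpt of the question's document are illustrative choices.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; not taken from the answer.
class NbspXPathDemo {
    // Parses the XML text and evaluates the XPath 1.0 expression to a string.
    static String eval(String xml, String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return XPathFactory.newInstance().newXPath().evaluate(expression, doc);
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        // Excerpt of the question's document (child1 and child3 only).
        // The XML parser resolves the character references inside it.
        String xml = "<root><data>\n"
                + "<child1>&#160;Well, some  spaces and nbsps  &#160;</child1>\n"
                + "<child3>         a nice text</child3>\n"
                + "</data></root>";

        // The Java escape puts the real character into the expression.
        String cleaned = eval(xml, "normalize-space(translate(//data, '\u00A0', ''))");
        System.out.println("[" + cleaned + "]");
        // prints "[Well, some spaces and nbsps a nice text]"

        // Outside an XML file nothing resolves the reference, so translate
        // receives the six literal characters & # 1 6 0 ; and the
        // non-breaking spaces stay in the result.
        String unchanged = eval(xml, "normalize-space(translate(//data, '&#160;', ''))");
        System.out.println(unchanged.indexOf('\u00A0') >= 0); // prints "true"
    }
}
```

Note that normalize-space on its own does not help here: it collapses only space, tab, carriage return, and line feed, which is why the non-breaking space has to be removed explicitly.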
Ian Roberts
  • thanks for your answer, in this particular example I have only one data node: – user1800825 Nov 06 '12 at 15:18
  • Thanks for your answer. In this example I have only one data node, so normalize-space(translate(//data, '&#160;', '')) should be sufficient. I am writing a simple XPath 2 query, and as a tool I use XMLSpy. I tried `'&#160;'`, `'\u00A0'`, and `'$nbsps'`, but none of these remove the spaces. – user1800825 Nov 06 '12 at 15:30
  • @user1800825 I've edited my answer with an example of using the `codepoints-to-string` XPath 2.0 function. – Ian Roberts Nov 06 '12 at 15:48
  • 1
    Thanks. `translate` and `'\u00A0'` work for me in Google Chrome browser. – hungndv Oct 18 '17 at 02:51