I think you have misdiagnosed the situation, and the reason for the misdiagnosis (to stretch an analogy much too far) is that you've looked at the symptoms of about 7 patients rather than going to medical school and learning about anatomy.
The "anatomy" here is the XDM data model which underpins the semantics of XPath. Note in particular that
(a) when you have a structure like this
<title>Water</title>
there is an element node, whose string value is "Water", and which is the parent of a single text node, whose string value is also "Water".
(b) when you have a structure like this
<title>H<sub>2</sub>O</title>
there is an element node, whose string value is "H2O", which is the parent of three children: a text node with string value "H", an element node with string value "2" (which itself is the parent of another text node...), and a second text node with string value "O".
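If it helps to see the two tree shapes concretely, here is a minimal sketch using Python's standard xml.dom.minidom. It is only an illustration: DOM is not XDM, although for these two examples the node structure matches. The helper names string_value and describe_children are mine, not part of any XPath API.

```python
from xml.dom import minidom
from xml.dom.minidom import Node


def string_value(node):
    """Approximate the XDM string value: the concatenation of all
    descendant text nodes, in document order."""
    if node.nodeType in (Node.TEXT_NODE, Node.CDATA_SECTION_NODE):
        return node.data
    return "".join(string_value(child) for child in node.childNodes)


def describe_children(xml):
    """List (kind, string value) for each child of the document element."""
    root = minidom.parseString(xml).documentElement
    return [
        ("text" if child.nodeType == Node.TEXT_NODE else "element",
         string_value(child))
        for child in root.childNodes
    ]


# Case (a): a single text node child.
water = describe_children("<title>Water</title>")

# Case (b): text node, element node, text node.
h2o = describe_children("<title>H<sub>2</sub>O</title>")

# The string value of the <title> element itself in case (b).
h2o_value = string_value(
    minidom.parseString("<title>H<sub>2</sub>O</title>").documentElement
)

print(water)      # [('text', 'Water')]
print(h2o)        # [('text', 'H'), ('element', '2'), ('text', 'O')]
print(h2o_value)  # H2O
```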
In case (a) nearly all operations produce the same result whether applied to the element node or the text node. For example contains($x, "ate") will be true whether $x is the element node or the text node. So adding /text() to the path is generally redundant: it does no harm, but it's unnecessary. We often advise against doing it, because it makes your code more fragile if the structure of the data later changes, quite apart from just adding unnecessary verbosity.
In case (b) adding /text() to your path causes you to select the two text nodes "H" and "O" instead of selecting the element node. In XPath 1.0, many operations (such as contains()) when applied to a set of two text nodes ignore all but the first, so contains(x/y/title/text(), "O") will return false; in XPath 2.0 it will throw an error saying that the argument to contains() must be a singleton. If you simply want to know whether the title contains the letter "O", then it's much better to leave out the /text() and apply the operation to the string value of the element, which is the concatenation of all its descendant text nodes.
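The difference in case (b) can be sketched as follows. Note that this does not run a real XPath engine: it emulates the rules described above in plain Python over a minidom tree, and the functions text_children, contains_xpath1 and contains_xpath2 are hypothetical helpers written for this illustration.

```python
from xml.dom import minidom
from xml.dom.minidom import Node


def string_value(node):
    """Concatenation of all descendant text nodes, in document order."""
    if node.nodeType in (Node.TEXT_NODE, Node.CDATA_SECTION_NODE):
        return node.data
    return "".join(string_value(child) for child in node.childNodes)


def text_children(element):
    """Emulate element/text(): select only the text-node children."""
    return [c for c in element.childNodes if c.nodeType == Node.TEXT_NODE]


def contains_xpath1(nodes, needle):
    """Emulate XPath 1.0: a node-set is converted to a string by taking
    the string value of its first node; the rest are ignored."""
    haystack = string_value(nodes[0]) if nodes else ""
    return needle in haystack


def contains_xpath2(nodes, needle):
    """Emulate XPath 2.0: more than one item is a type error."""
    if len(nodes) > 1:
        raise TypeError("argument to contains() must be a singleton")
    return contains_xpath1(nodes, needle)


title = minidom.parseString("<title>H<sub>2</sub>O</title>").documentElement

# contains(title/text(), "O"): only the first text node, "H", is examined.
with_text = contains_xpath1(text_children(title), "O")

# contains(title, "O"): the element's string value "H2O" is examined.
without_text = contains_xpath1([title], "O")

print(with_text)     # False
print(without_text)  # True
```

Calling contains_xpath2(text_children(title), "O") raises TypeError in this sketch, mirroring the singleton error described for XPath 2.0.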
The only time you need to use "/text()" is if you want to probe more deeply into the internal structure of the title element.
It is of course possible that there are differences between XPath implementations - not all of them have 100% conformance to the standard. But the mainstream implementations are pretty compatible, and if you find a difference, please tell us about it: be explicit about the source document, the path expression, and the different results obtained in different implementations.