Running Python 3.7.4 on Windows, I notice that XPath evaluation in lxml gives different results from online XPath evaluators, such as here or here.
The online evaluators accept a relative expression and evaluate it against the entire document. With lxml, however, I get no matches on the element tree unless I turn the expression into an absolute one by prepending a slash.
Python 3.7.4 (tags/v3.7.4:e09359112e, Jul 8 2019, 20:34:20) [MSC v.1916 64 bit (AMD64)] on win32
>>> import lxml.etree
>>> root = lxml.etree.fromstring('''
... <TestRootNode>
... <person personID="person1">
... <name>James</name>
... </person>
... <person personID="person2">
... <name>Cathy</name>
... </person>
... </TestRootNode>''')
>>> tree = root.getroottree()
>>> tree.xpath('/TestRootNode/person')
[<Element person at 0x2ceee1f4e88>, <Element person at 0x2ceee1ff048>]
>>> tree.xpath('string(/TestRootNode/person[1])')
'\n James\n '
>>> tree.xpath('TestRootNode/person')
[]
>>> tree.xpath('string(TestRootNode/person[1])')
''
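For anyone who wants to reproduce this outside the interactive prompt, here is the same session as a small script. It is only a sketch: the variable names are mine, and the last check (evaluating 'person' on its own) rests on my assumption that lxml uses the root element, rather than the document node, as the context node for relative expressions on the tree.

```python
import lxml.etree

XML = """<TestRootNode>
  <person personID="person1">
    <name>James</name>
  </person>
  <person personID="person2">
    <name>Cathy</name>
  </person>
</TestRootNode>"""

root = lxml.etree.fromstring(XML)
tree = root.getroottree()

# Absolute expression: matches both <person> elements.
absolute = tree.xpath('/TestRootNode/person')
name_abs = tree.xpath('string(/TestRootNode/person[1])').strip()

# Same expressions without the leading slash: nothing matches.
relative = tree.xpath('TestRootNode/person')
name_rel = tree.xpath('string(TestRootNode/person[1])')

# Assumption: the context node is the root element <TestRootNode>,
# so a path written relative to that element should match.
from_root = tree.xpath('person')

print(len(absolute), repr(name_abs))
print(len(relative), repr(name_rel))
print(len(from_root))
```

If that assumption holds, the relative expressions are not being ignored; they are simply evaluated one level further down than the online evaluators evaluate them.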
I have two questions:
Who is right, the online evaluators or lxml? Is it allowed to apply a relative expression in the context of the whole document?
In case the online evaluators are right: Is there a simple way to make lxml behave in the same way? Simply putting a slash at the beginning of the string won't work, as you can see from my example with the string() function.