I have the following XML:
<test1>
    <test2>
        <text>This is a question on xpath</text>
    </test2>
    <test3>
        <test2>
            <text>Do not extract this</text>
        </test2>
    </test3>
</test1>
I need to extract the text within test2/text, but not if test2 comes inside test3. How can this be done in XPath? I tried with findall, with something like:
for p in lxml_tree.xpath('.//test2', namespaces={'w': w}):
    for q in p.iterancestors():
        if q.tag == "test3":
            break
        else:
            text += ''.join(t.text for t in p.xpath('.//text'))
but this doesn't work. I guess XPath has a better way to exclude it in a single expression.
Expected output:
text = "This is a question on xpath"
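For reference, here is a minimal sketch of the kind of single expression I have in mind. It uses a `not(ancestor::test3)` predicate and assumes the elements carry no namespace, as in the sample above. The `xml` string and the `.strip()` call (to drop surrounding whitespace in the text nodes) are my own additions for illustration:

```python
from lxml import etree

# Sample document from the question (assumed to have no namespaces).
xml = """<test1>
<test2>
<text>This is a question on xpath
</text>
</test2>
<test3>
<test2>
<text>Do not extract this
</text>
</test2>
</test3>
</test1>"""

lxml_tree = etree.fromstring(xml)

# Select <text> children of every <test2> that has no <test3> ancestor.
nodes = lxml_tree.xpath('.//test2[not(ancestor::test3)]/text')

# Join the text content, trimming whitespace around each value.
text = ''.join((t.text or '').strip() for t in nodes)
print(text)
```

If the real document uses namespaces, the element names in the expression would presumably need the matching prefix (for example `w:test2`) together with the `namespaces=` argument.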