Not Condition In xpath

Question

I have the following XML:

<test1>
    <test2>
       <text>This is a question on xpath
       </text>
    </test2>
    <test3>
        <test2>
            <text>Do not extract this
             </text>
        </test2>
    </test3>
</test1>

I need to extract the text within test2/text, but not when test2 is nested inside test3. How can this be done in XPath? I tried something like:

for p in lxml_tree.xpath('.//test2',namespaces={'w':w}):
    for q in p.iterancestors():
        if q.tag=="test3":
           break
        else:
            text+= ''.join(t.text for t in p.xpath('.//text'))

but this doesn't work. I guess XPath has a better way to exclude it in a single expression.

Expected output:

text = "This is a question on xpath"

score 3 · Accepted Answer · answered Dec 13 '14 at 09:24

Assuming that by "comes inside" you mean an ancestor at any level, you can combine not() with the ancestor axis to select only the nodes that do not have a specific ancestor:

//test2[not(ancestor::test3)]/text
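
As a rough illustration of how this might be run from Python, here is a minimal sketch using lxml on the XML from the question. The variable names (xml, root, nodes) are my own, and the strip() call is an assumption about how the surrounding whitespace should be handled.

```python
from lxml import etree

# XML from the question, inlined so the sketch runs on its own
xml = """<test1>
    <test2>
       <text>This is a question on xpath
       </text>
    </test2>
    <test3>
        <test2>
            <text>Do not extract this
             </text>
        </test2>
    </test3>
</test1>"""

root = etree.fromstring(xml)

# Keep only <text> children of test2 elements that have no test3 ancestor
nodes = root.xpath('//test2[not(ancestor::test3)]/text')

# The element text ends with a newline and indentation, so strip it
text = ''.join(node.text.strip() for node in nodes)
print(text)  # This is a question on xpath
```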

If, however, you meant that only the immediate parent should not be test3, then replace the ancestor axis with the parent axis:

//test2[not(parent::test3)]/text
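
The two expressions give the same result on the XML in the question, so the difference only shows up with deeper nesting. The sketch below assumes a made-up document in which a hypothetical wrapper element sits between test3 and one of the test2 elements; the helper name texts is also my own.

```python
from lxml import etree

# Hypothetical document: <wrapper> is an invented element that places
# one test2 two levels below test3 instead of directly inside it.
xml = """<test1>
    <test2><text>top level</text></test2>
    <test3>
        <test2><text>direct child of test3</text></test2>
        <wrapper>
            <test2><text>grandchild of test3</text></test2>
        </wrapper>
    </test3>
</test1>"""

root = etree.fromstring(xml)


def texts(expr):
    """Return the text of every element matched by the XPath expression."""
    return [node.text for node in root.xpath(expr)]


# ancestor:: rejects test2 at any depth below test3
by_ancestor = texts('//test2[not(ancestor::test3)]/text')
# parent:: rejects only test2 elements whose direct parent is test3
by_parent = texts('//test2[not(parent::test3)]/text')

print(by_ancestor)  # ['top level']
print(by_parent)    # ['top level', 'grandchild of test3']
```

With the parent axis, the test2 under wrapper is still selected, because its immediate parent is wrapper rather than test3.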

StuartLC

I'm no pythonista, but the result is a `nodeset`, and lxml seems a robust library, so I would imagine this can be used as `for p in lxml_tree.xpath('.//test2[not(ancestor::test3)]/text')` – StuartLC Dec 13 '14 at 09:34
