How do I perform lxml xpath searches through the results of a previous xpath search?

Question

If you run the following Python code, it prints every <a> element in the entire document, when it should print only the one inside the first article.

How can I use XPath to first search for article tags, and then search for links within each of them?

from lxml import html

source = '''
<body>
    <a href='www.google.com'>outside 1</a>

    <article class='art'>
        <a href='www.google.com'>inside 1</a>
    </article>

    <article class='art'>
        <a href='www.google.com'>inside 2</a>
    </article>

    <a href='www.google.com'>outside 2</a>
</body>
'''

tree_html = html.fromstring(source)
articles = tree_html.xpath('//article')
first_articles_a_text = articles[0].xpath('//a')

print(first_articles_a_text)

Output:

[<Element a at 0x47b05e8>, <Element a at 0x47b0598>, <Element a at 0x47b07c8>, <Element a at 0x47b0818>]

Note: I could not find a similar answer anywhere on SO or online. Forgive me if I missed one.

score 1 · Accepted Answer · edited May 23 '17 at 11:57

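Start your XPath expression with a dot. An expression beginning with // is absolute, so it is evaluated from the document root no matter which element you call xpath() on. The leading dot makes it relative, so the search is scoped to that element and its descendants: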

first_articles_a_text = articles[0].xpath('.//a')
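To see the difference side by side, here is a minimal sketch that reuses the markup from the question and runs both expressions on the first article. The variable names `absolute` and `relative` are only illustrative, and it assumes lxml is installed:

```python
from lxml import html

# Same markup as in the question.
source = '''
<body>
    <a href='www.google.com'>outside 1</a>

    <article class='art'>
        <a href='www.google.com'>inside 1</a>
    </article>

    <article class='art'>
        <a href='www.google.com'>inside 2</a>
    </article>

    <a href='www.google.com'>outside 2</a>
</body>
'''

tree_html = html.fromstring(source)
articles = tree_html.xpath('//article')

# '//a' is absolute: it starts at the document root, even though
# xpath() is called on the first <article> element.
absolute = [a.text for a in articles[0].xpath('//a')]

# './/a' is relative: it only looks at descendants of the first <article>.
relative = [a.text for a in articles[0].xpath('.//a')]

print(absolute)  # all four links in the document
print(relative)  # only the link inside the first article
```

The same relative expression works in a loop, for example `for article in articles: article.xpath('.//a')`, if you want the links grouped per article.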

See also:

Python: Using xpath locally / on a specific element

answered Aug 27 '14 at 01:03

alecxe

So each of the elements in the list [articles] really references the entire tree? (And thanks so much, this was a couple hour nightmare for me) – Josh.F Aug 27 '14 at 02:22
