lxml: get element with a particular child element?

Question

Working in lxml, I want to get the href attribute of all links with an img child that has title="Go to next page".

So in the following snippet:

<a class="noborder" href="StdResults.aspx">
<img src="arrowr.gif" title="Go to next page"></img>
</a>

I'd like to get StdResults.aspx back.

I've got this far:

next_link = doc.xpath("//a/img[@title='Go to next page']") 
print next_link[0].attrib['href']

But next_link is the img, not the a tag - how can I get the a tag?

Thanks.

possible duplicate of [XPath : Get nodes where child node contains an attribute](http://stackoverflow.com/questions/1457638/xpath-get-nodes-where-child-node-contains-an-attribute) — Katriel, Jul 31 '11 at 20:59

score 2 · Accepted Answer · answered Jul 31 '11 at 20:57

Just change a/img[...] to a[img[...]]. Square brackets introduce a predicate, which reads roughly as "such that", so this selects every a that has a matching img child:

import lxml.html as lh

content = '''<a class="noborder" href="StdResults.aspx">
<img src="arrowr.gif" title="Go to next page"></img>
</a>'''

doc = lh.fromstring(content)
# The predicate filters the a elements; the result is the a, not the img.
for elt in doc.xpath("//a[img[@title='Go to next page']]"):
    print(elt.attrib['href'])

# StdResults.aspx

Or you could go a step further and use

"//a[img[@title='Go to next page']]/@href"

to retrieve the values of the href attributes directly, as a list of strings.
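
As a rough sketch of that attribute-selecting variant, reusing the snippet from the question: the names hrefs and next_href are only illustrative, and the empty-list guard is an assumption about how you might want to handle a page with no "next" link.

```python
import lxml.html as lh

content = '''<a class="noborder" href="StdResults.aspx">
<img src="arrowr.gif" title="Go to next page"></img>
</a>'''

doc = lh.fromstring(content)

# Selecting the attribute node yields strings rather than elements.
hrefs = doc.xpath("//a[img[@title='Go to next page']]/@href")

# xpath() returns an empty list when nothing matches, so guard the index.
next_href = hrefs[0] if hrefs else None
print(next_href)
```

Checking the list before indexing avoids an IndexError on the last page of results, where the link is presumably absent.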

answered Jul 31 '11 at 20:57

unutbu

Thanks, I always thought a[@..] could only test attributes. Is there a good reference or set of samples for lxml that covers this kind of confusion? – Walty Yeung May 26 '12 at 13:04

score 0 · Answer 2 · answered Jul 31 '11 at 21:02

You can also start from the img and walk back up the tree: //a/img[@title='Go to next page']/parent::a selects the parent node, and //a/img[@title='Go to next page']/ancestor::a selects any enclosing a at an arbitrary depth.
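
A small sketch of the difference between the two axes, assuming the snippet from the question plus a second, made-up snippet (the Wrapped.aspx link and the span wrapper are hypothetical) in which the img is not a direct child of the a. It also shows lxml's getparent() method as a non-XPath way to step up from the img.

```python
import lxml.html as lh

content = '''<a class="noborder" href="StdResults.aspx">
<img src="arrowr.gif" title="Go to next page"></img>
</a>'''
doc = lh.fromstring(content)

# Both axes find the same a here, because the img is a direct child.
via_parent = doc.xpath("//a/img[@title='Go to next page']/parent::a")
via_ancestor = doc.xpath("//a/img[@title='Go to next page']/ancestor::a")
print(via_parent[0].get('href'), via_ancestor[0].get('href'))

# Outside XPath, an element's getparent() returns its parent element.
img = doc.xpath("//img[@title='Go to next page']")[0]
print(img.getparent().get('href'))

# Hypothetical markup: the img sits inside a span inside the link.
wrapped = ('<a href="Wrapped.aspx"><span>'
           '<img src="arrowr.gif" title="Go to next page"></span></a>')
doc2 = lh.fromstring(wrapped)

# parent::a matches nothing (the parent is the span); ancestor::a still works.
wrapped_parent = doc2.xpath("//img[@title='Go to next page']/parent::a")
wrapped_ancestor = doc2.xpath("//img[@title='Go to next page']/ancestor::a")
print(len(wrapped_parent), wrapped_ancestor[0].get('href'))
```

If the markup might wrap the image in extra elements, the ancestor axis (or the predicate form a[.//img[...]]) is the more forgiving choice.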
