Assume we have the following HTML:
<html>
<body>
<a href="/1234.html">TEXT A</a>
<a href="/3243.html">TEXT B</a>
<a href="/7445.html">TEXT C</a>
</body>
</html>
How do I make lxml find the "a" element that contains "TEXT A"?
So far I've got:
import lxml.html
root = lxml.html.document_fromstring(the_html_above)
e = root.find('.//a')  # returns only the first "a", regardless of its text
I've tried:
e = root.find('.//a[@text="TEXT A"]')
but that didn't work, because "@text" looks for an attribute named "text" and the "a" tags have none; the text is the element's content, not an attribute.
Is there any way I can solve this in a similar fashion to what I've tried?
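For reference, here is a minimal sketch of the kind of lookup I'm after. It assumes that switching from find() to lxml's xpath() method is acceptable, since xpath() takes full XPath 1.0 expressions, including the text() node test. The variable name "matches" is only illustrative.

```python
import lxml.html

# The sample document from above, with the closing </body> tag in place.
the_html_above = """<html>
<body>
<a href="/1234.html">TEXT A</a>
<a href="/3243.html">TEXT B</a>
<a href="/7445.html">TEXT C</a>
</body>
</html>"""

root = lxml.html.document_fromstring(the_html_above)

# xpath() returns a list of matching elements (empty if nothing matches).
# text() compares against the element's text nodes, not an attribute.
matches = root.xpath('.//a[text()="TEXT A"]')
e = matches[0] if matches else None

if e is not None:
    print(e.get('href'))  # prints /1234.html
```

Note that this is an exact match on the text. If the link text might carry surrounding whitespace, a predicate such as normalize-space(text())="TEXT A" or contains(text(), "TEXT A") may be closer to what is needed.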