
This is such a basic question that I actually can't find it in the docs :-/

In the following:

img = house_tree.xpath('//img[@id="mainphoto"]')[0]

How do I get the HTML of the <img/> tag?

I've tried adding html_content() but get AttributeError: 'lxml.etree._Element' object has no attribute 'html_content'.

Also, if it were a tag with some content inside (e.g. <p>text</p>), how would I get the content (e.g. text)?

Many thanks!

AP257

1 Answer


I suppose it will be as simple as:

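from lxml.etree import tostring
# method="html" avoids the XML-style <img/>; encoding="unicode" returns str, not bytes;
# with_tail=False drops any text that follows the tag
outer_html = tostring(img, method="html", encoding="unicode", with_tail=False)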
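To see what the tostring() arguments change, here is a small sketch. The page markup is made up for illustration, and it assumes the tree was built with lxml.html.fromstring, which the question doesn't actually show:

```python
from lxml import etree, html

# Hypothetical markup standing in for the asker's page.
page = (
    '<html><body><div>'
    '<img id="mainphoto" src="house.jpg"> caption'
    '</div></body></html>'
)
house_tree = html.fromstring(page)
img = house_tree.xpath('//img[@id="mainphoto"]')[0]

# Default call: returns bytes, serializes as XML, and keeps the tail text
# (" caption") that follows the element.
raw = etree.tostring(img)

# HTML serialization as a str, without the tail text.
img_html = etree.tostring(img, method="html", encoding="unicode", with_tail=False)

print(raw)
print(img_html)
```

The default call is fine if you only need bytes for writing to a file; the keyword arguments matter mostly when you want a clean string for just that one tag.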

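As for getting the content from inside a <p>, say some selected element el (note that text_content() only exists on elements parsed with lxml.html; on a plain lxml.etree element, use el.text or ''.join(el.itertext()) instead):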

content = el.text_content()
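A quick sketch of the difference between the two kinds of element, since the AttributeError in the question mentions lxml.etree._Element. The snippet string is invented for the example:

```python
from lxml import etree, html

# Hypothetical fragment used only for illustration.
snippet = "<p>Some <b>bold</b> text</p>"

# Parsed with lxml.html: elements are HtmlElement and have text_content().
el = html.fromstring(snippet)
content = el.text_content()

# Parsed with lxml.etree: plain elements have no text_content().
el_plain = etree.fromstring(snippet)
has_method = hasattr(el_plain, "text_content")

# .text is only the text before the first child element...
leading_text = el_plain.text
# ...while itertext() walks all nested text in document order.
all_text = "".join(el_plain.itertext())

print(content)
print(has_method, repr(leading_text), all_text)
```

So if the tree comes from lxml.etree rather than lxml.html, the itertext() join is the closest equivalent to text_content().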
Ninjakannon
vonPetrushev