I am using lxml to parse a sample HTML document, like this:

__dom = lxml.html.fromstring("<html><body><div id='mydiv'></div></body></html>")

I am trying to get, by its id, an element that I added to the HTML programmatically, like this:

mydiv = __dom.get_element_by_id('mydiv')
mydiv.text = "<p id='myInner'>this is the inner inner text</p>"
myInner= __dom.get_element_by_id("myInner")

The P does get added, but when I try to get it back with get_element_by_id I get a KeyError on myInner.

I am guessing that since I added the P as text, it is not parsed as an HTML element, and therefore I cannot look it up.
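A quick check along these lines seems to support that guess. This is only a sketch: the variable names (doc, div, serialized) are made up for illustration, and it passes a default of None to get_element_by_id so that a missing id does not raise KeyError.

```python
import lxml.html

doc = lxml.html.fromstring("<html><body><div id='mydiv'></div></body></html>")
div = doc.get_element_by_id("mydiv")

# Assigning to .text stores a plain string; lxml does not parse it as markup.
div.text = "<p id='myInner'>this is the inner inner text</p>"

# len() on an element counts its child elements, so this stays at 0.
child_count = len(div)

# With a default supplied, a missing id returns the default instead of raising.
found = doc.get_element_by_id("myInner", None)

# On serialization the angle brackets come out escaped as entities.
serialized = lxml.html.tostring(doc, encoding="unicode")

print(child_count)
print(found)
print(serialized)
```

If the div reports no children and the serialized output shows escaped brackets, the string was stored as text rather than as a P element.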

So my question is really: how do I add to or modify the innerHTML of an element using lxml?

Thanks

1 Answer

As you said, you are passing a string to the text attribute of the div. I assume what you're trying to do is add a new P element as a child of the div element. You can parse your string into an lxml element and then insert it into the existing HTML as part of the tree:

import lxml.html

__dom = lxml.html.fromstring("<html><body><div id='mydiv'></div></body></html>")

mydiv = __dom.get_element_by_id('mydiv')
myhtml = lxml.html.fromstring("<p id='myInner'>this is the inner inner text</p>")
mydiv.insert(0, myhtml)
# encoding="unicode" makes tostring return str instead of bytes on Python 3
print(lxml.html.tostring(__dom, encoding="unicode"))

OUTPUT

<html><body><div id="mydiv"><p id="myInner">this is the inner inner text</p></div></body></html>
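If the goal is closer to assigning innerHTML, meaning replacing everything inside the element with markup that may contain several sibling elements, something like the following might work. It is a rough sketch rather than a built-in lxml feature: set_inner_html is a hypothetical helper name, and it relies on lxml.html.fragments_fromstring, which returns a list of elements and may put leading text in the first position as a plain string.

```python
import lxml.html


def set_inner_html(element, markup):
    """Hypothetical helper: replace an element's content with parsed markup."""
    # Drop the existing children and text; attributes are left untouched.
    for child in list(element):
        element.remove(child)
    element.text = None

    # Parse the markup into a list of top-level fragments.
    fragments = lxml.html.fragments_fromstring(markup)

    # Leading text, if any, is returned as a plain string in first position.
    if fragments and isinstance(fragments[0], str):
        element.text = fragments.pop(0)

    # Append the remaining parsed elements as children.
    element.extend(fragments)


doc = lxml.html.fromstring(
    "<html><body><div id='mydiv'>old<b>content</b></div></body></html>"
)
div = doc.get_element_by_id("mydiv")
set_inner_html(div, "<p id='myInner'>this is the inner inner text</p><span>more</span>")

# The new paragraph is a real element now, so the id lookup succeeds.
inner = doc.get_element_by_id("myInner")
print(inner.text)
print(lxml.html.tostring(div, encoding="unicode"))
```

Because the new content is parsed before it is attached, get_element_by_id can find myInner afterwards, unlike when the markup is assigned to text.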
Chris Doyle