
I'm new to Stack Overflow, so "hi" to everybody, and I hope someone can help me with my question.

Recently I started playing around with lxml.objectify and stumbled over the following behavior, which I find strange. If I parse a small XML string and assign values like this:

from lxml import objectify 

objroot = objectify.fromstring("<root>somerootvalue<child1/><child2/><child3/><child4><subchild1/><subchild2/></child4><child5><subchild1/><subchild2/></child5></root>")
objroot.child4 = True
objroot.child4.subchild1 = "Foo"
objroot.child4.subchild2 = "Bar"
print(objroot.child4.subchild1,objroot.child4.subchild2,objroot.child4)

The output is just: Foo Bar true

If I then change the value/text of the elements like this:

from lxml import objectify 

objroot = objectify.fromstring("<root>somerootvalue<child1/><child2/><child3/><child4><subchild1/><subchild2/></child4><child5><subchild1/><subchild2/></child5></root>")
objroot.child4 = True
objroot.child4.subchild1 = "Foo"
objroot.child4.subchild2 = "Bar"
print(objroot.child4.subchild1,objroot.child4.subchild2,objroot.child4)
objroot.child4 = False
objroot.child4.subchild1 = "Foo"
objroot.child4.subchild2 = "Bar"
print(objroot.child4.subchild1,objroot.child4.subchild2,objroot.child4)

The output is as expected: Foo Bar true, then Foo Bar false

But if I change only the value of objroot.child4 and then call print, I get the following error:

from lxml import objectify 

objroot = objectify.fromstring("<root>somerootvalue<child1/><child2/><child3/><child4><subchild1/><subchild2/></child4><child5><subchild1/><subchild2/></child5></root>")
objroot.child4 = True
objroot.child4.subchild1 = "Foo"
objroot.child4.subchild2 = "Bar"
print(objroot.child4.subchild1,objroot.child4.subchild2,objroot.child4)
objroot.child4 = False
objroot.child4.subchild1 = "Foo"
objroot.child4.subchild2 = "Bar"
print(objroot.child4.subchild1,objroot.child4.subchild2,objroot.child4)
objroot.child4 = True
print(objroot.child4.subchild1,objroot.child4.subchild2,objroot.child4)
File "src\lxml\lxml.objectify.pyx", line 450, in lxml.objectify._lookupChildOrRaise (src\lxml\lxml.objectify.c:6586)
AttributeError: no such child: subchild1

I expected the last output to be "Foo Bar true", but I get the "no such child" error instead. So it seems that the part of the tree below child4 has been cut off. Is that the intended behavior, and if so, how can I change the text of an element in the middle of the tree without cutting off the rest?

Thank you for your help!

Steven Rumbalski

1 Answer

The behavior is not that strange once we dig into it a bit. Attribute access such as root.element.subelement does not work quite the way you might assume. We can use etree to print the state of the XML tree and check its structure.

from lxml import objectify, etree

objroot = objectify.fromstring("<root>somerootvalue<child1/><child2/><child3/><child4><subchild1/><subchild2/></child4><child5><subchild1/><subchild2/></child5></root>")

print(etree.tostring(objroot, pretty_print=True).decode())

#output:
<root>somerootvalue
    <child1/>
    <child2/>
    <child3/>
    <child4><subchild1/><subchild2/></child4>
    <child5><subchild1/><subchild2/></child5>
</root>

This looks correct. So what happens when we call objroot.child4 = True? The API allows you to do this, but it does not just add the text. Rather, it replaces everything that was under child4 with the text. So the subelements get dropped. We can check using:

objroot.child4 = True
print(etree.tostring(objroot, pretty_print=True).decode())

#output:
<root>somerootvalue
    <child1/>
    <child2/>
    <child3/>
    <child4 xmlns:py="..." py:pytype="bool">true</child4>
    <child5><subchild1/><subchild2/></child5>
</root>
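
The same effect can be checked without reading the serialized XML. The following is a minimal sketch on a shortened document (the smaller XML string and the variable names are assumptions made for illustration); it counts the direct children of child4 before and after the assignment with countchildren():

```python
from lxml import objectify

# Shortened version of the document from the question (illustrative only).
root = objectify.fromstring(
    "<root><child4><subchild1/><subchild2/></child4></root>"
)

# Number of sub-elements that came from the parsed XML.
children_before = root.child4.countchildren()

# Assigning a plain value replaces <child4> with a new element
# that holds only the value as text.
root.child4 = True
children_after = root.child4.countchildren()

print(children_before, children_after)
```

If the replacement works as described above, the second count drops to zero.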

So it has set the value of child4 to True, but it has dropped the subelements. After that, when you set the values of the subelements using:

objroot.child4.subchild1 = "Foo"
objroot.child4.subchild2 = "Bar"

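It actually creates each missing subelement under child4 on the fly and then sets its value. That is why the final objroot.child4 = True in the question, which is not followed by new assignments to the subelements, leaves child4 without a subchild1 and raises the AttributeError.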
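
If the subelements need to survive such an assignment, one possible workaround is to keep references to them and move them back under the new element afterwards. This is only a sketch under assumptions, not something the lxml documentation prescribes for this case: the shortened XML string and the name saved are made up for illustration, and getchildren() is used because iterating an objectify element directly walks same-named siblings rather than its children.

```python
from lxml import objectify

# Shortened version of the document from the question (illustrative only).
root = objectify.fromstring(
    "<root><child4>"
    "<subchild1>Foo</subchild1><subchild2>Bar</subchild2>"
    "</child4></root>"
)

# Keep references to the sub-elements before the assignment drops them.
saved = root.child4.getchildren()

# This replaces <child4> with a new element that has no children.
root.child4 = True

# Move the saved sub-elements under the new <child4>.
for child in saved:
    root.child4.append(child)

print(root.child4.subchild1, root.child4.subchild2, root.child4.text)
```

Note that once child4 has children again, objectify treats it as a structural element rather than a typed value, so the sketch reads the value back through .text.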

James