0

I'm trying to add an attribute value to a tag in an XML string when the attribute is missing, and print out the modified string. Code is patchwork that I've roughly gleaned from various sources, so apologies if this isn't best practice.

Here's what the code looks like-

import xml.etree.ElementTree as ET
import time

xml_string = """
<message>
    <info xmlns="urn:xmpp:info" />
</message>
"""

root = ET.fromstring(xml_string)
ns = {'ns': 'urn:xmpp:info'}
for info in root.findall(".//ns:info", ns):
    sent_time = info.attrib.get("sent_time_millis")
    if not sent_time:
        sent_time = str(int(time.time() * 1000))
        info.attrib["sent_time_millis"] = sent_time
    print(ET.tostring(root, encoding='unicode'))
    break

This results in something like this when run-

<message xmlns:ns0="urn:xmpp:info">
    <ns0:info sent_time_millis="1675533312576" />
</message>

When the desired output should be something like this-

<message>
    <info xmlns="urn:xmpp:info" sent_time_millis="1675533086777" />
</message>

I'm missing something basic, I'm sure. How should I go about this modification?

Thanks.

mzjn
  • 48,958
  • 13
  • 128
  • 248
Athena
  • 155
  • 7
  • 1
    Those two XML documents are 100% equivalent. They both result in the `info` element in namespace `urn:xmpp:info` having the `sent_time_millis` attribute. – larsks Feb 04 '23 at 18:19
  • I'm trying to keeep it a 100% one to one char for char- our current setup is really finnicky, and even slight deviations from expected values throw the entire system for a toss, and I can't really test how this looks like on our front-end. Apologies for how stupid that sounds. – Athena Feb 04 '23 at 18:33
  • 1
    If you use lxml instead of ElementTree, you will get the wanted output. Related question: https://stackoverflow.com/q/45990761/407651 – mzjn Feb 04 '23 at 19:28
  • This works for my use case- do you want to answer the question so I can mark it as such @mzjn? – Athena Feb 05 '23 at 06:03

1 Answers1

1

ElementTree fiddles with namespaces in a way that can be annoying (even though the semantics does not change). lxml does not do this, so a solution is to use that library instead.

Install lxml and change import xml.etree.ElementTree as ET to from lxml import etree as ET in the code.

mzjn
  • 48,958
  • 13
  • 128
  • 248