My problem is this. I started building the XML with the dictionary structure used by the xmltodict Python package, so that I could use its unparse method to generate the XML. But I think I have reached a point where xmltodict can't help me. I have actions in this dictionary format, each highly nested, something like this but much more complex:

action = {
   "@id": 1,
   "some-nested-stuff":
       {"@attr1": "string value", "child": True}
}

Now I need to group some actions similar to this:

<action id="1">...</action>
<action-group groupId="1">
  <action id="2">...</action>
  <action id="3">...</action>
</action-group>
<action id="4">...</action>

And yes, the first action needs to go before the action group and the fourth action after it. This seems impossible with xmltodict alone. My idea is to build each action's XML tree as an lxml object from these dictionaries, and then merge those objects into one XML document. I don't think it would be a big task, but there might be a ready-made package for it. Is there one?

The alternative, which I would like to avoid if possible, is to rewrite the project from scratch using only lxml. Or is there a way to create that XML using only xmltodict, without the xml/lxml packages?

  • Something is unclear to me: are you starting with an xml, converting it to dictionary and now trying to convert it back to a modified xml? – Jack Fleeting Oct 20 '22 at 15:15
  • No. I extracted an XML Schema of 600 action types to JSON, and the frontend gets its data based on that. From that data I create an XML. I just started out with xmltodict, and I'm not sure whether rewriting all the finished code and its unit tests makes sense. The code would probably be simpler if I rewrote it without xmltodict, though. – Arpad Horvath -- Слава Україні Oct 21 '22 at 06:20

1 Answer

It seems there is no such package. So far I have this solution. It doesn't handle #text keys, and there can be problems with namespaces.

"""
Converts the dictionary format that the xmltodict package uses to represent
XML into lxml elements.
"""
from typing import Dict, Any

from lxml import etree

XmlDictType = Dict[str, Any]
element = etree.Element("for-creating-types")
ElementType = type(element)
ElementTreeType = type(etree.ElementTree(element))


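def convert(xml_dict: XmlDictType) -> ElementType:
    """Convert a dictionary with a single root key into an lxml element."""
    # The only top-level key is the name of the root element.
    root_name = next(iter(xml_dict))
    inside_dict = xml_dict[root_name]
    attrs, children = split_attrs_and_children(inside_dict)
    root = etree.Element(root_name, **attrs)
    convert_children(root, children)
    return root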


def split_attrs_and_children(xml_dict: XmlDictType) -> tuple:
    """Split the dictionary into (attrs, children) and fix the value types."""
    def fix_types(v):
        # bool must be checked first: bool is a subclass of int,
        # so True would otherwise become "True" instead of "true".
        if isinstance(v, bool):
            return "true" if v else "false"
        elif isinstance(v, (int, float)):
            return str(v)
        else:
            return v

    # "@name" keys are attributes; "#..." keys (such as "#text") are skipped.
    attrs = {k[1:]: fix_types(v) for k, v in xml_dict.items() if k.startswith("@")}
    children = {k: fix_types(v) for k, v in xml_dict.items() if not k.startswith(("@", "#"))}
    return attrs, children


def convert_children(parent: ElementType, children: XmlDictType) -> ElementType:
    """Append one subelement per child (one per item for lists) and return parent."""
    for child_name, value in children.items():
        # A list means repeated sibling elements with the same tag.
        items = value if isinstance(value, list) else [value]
        for item in items:
            if isinstance(item, dict):
                attrs, grandchildren = split_attrs_and_children(item)
                child = etree.SubElement(parent, child_name, **attrs)
                convert_children(child, grandchildren)
            else:
                # Scalar values become the text content of the element.
                etree.SubElement(parent, child_name).text = item
    return parent
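Once every action is an element, the grouping from the question is ordinary tree manipulation. The following is only a minimal sketch of that step, not part of the solution above. It uses the standard library's xml.etree.ElementTree so that it runs without extra packages. The Element, SubElement, append and tostring calls it relies on also exist in lxml.etree, so swapping the import should be enough. The names make_action and build_document, the "actions" wrapper element and the layout list are all hypothetical and only serve the illustration.

```python
import xml.etree.ElementTree as ET  # with lxml: from lxml import etree as ET


def make_action(action_id: int) -> ET.Element:
    """Hypothetical stand-in for a converted action; only the id is set."""
    return ET.Element("action", {"id": str(action_id)})


def build_document(layout) -> ET.Element:
    """Build the document from a layout of action ids.

    A plain id becomes a top-level <action>; a nested list of ids becomes
    an <action-group> holding those actions. Document order follows the
    order of the layout.
    """
    root = ET.Element("actions")  # assumed wrapper element, not from the question
    group_id = 0
    for entry in layout:
        if isinstance(entry, list):
            group_id += 1
            group = ET.SubElement(root, "action-group", {"groupId": str(group_id)})
            for action_id in entry:
                group.append(make_action(action_id))
        else:
            root.append(make_action(entry))
    return root


# Action 1, then a group with actions 2 and 3, then action 4.
document = build_document([1, [2, 3], 4])
print(ET.tostring(document, encoding="unicode"))
```

In a real project make_action would be replaced by whatever produces the element for one action, for example the convert function above.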

As an example, convert() accepts a dictionary such as this:

xml_dict = {
    "mydocument": {
        "@has": "an attribute",
        "and": {
            "many": [
                "elements",
                "more elements"
            ]
        },
        "plus": {
            "@a": "complex",
            "#text": "element as well"
        }
    }
}

Note that the #text entry is not included in the result yet.
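One way the #text case could be covered is sketched below. This is a separate, self-contained variant rather than a patch of the code above, and it assumes that "#text" should simply become the text content of the element that carries it. The names to_text, fill and dict_to_element are made up for this sketch. It uses the standard library's xml.etree.ElementTree; the calls it makes (Element, SubElement, set, text) are also available in lxml.etree. Namespaces are still not handled.

```python
import xml.etree.ElementTree as ET  # with lxml: from lxml import etree as ET
from typing import Any, Dict


def to_text(value: Any) -> str:
    """Render a scalar as XML text; bools become "true"/"false"."""
    if isinstance(value, bool):  # checked before the generic str() fallback
        return "true" if value else "false"
    return str(value)


def fill(element: ET.Element, content: Any) -> None:
    """Populate element from an xmltodict-style value."""
    if content is None:
        return  # empty element
    if not isinstance(content, dict):
        element.text = to_text(content)
        return
    for key, value in content.items():
        if key.startswith("@"):
            element.set(key[1:], to_text(value))
        elif key == "#text":
            # Assumption: "#text" is the element's own text content.
            element.text = to_text(value)
        else:
            # A list means repeated sibling elements with the same tag.
            items = value if isinstance(value, list) else [value]
            for item in items:
                fill(ET.SubElement(element, key), item)


def dict_to_element(xml_dict: Dict[str, Any]) -> ET.Element:
    """Convert a dictionary with exactly one root key into an element."""
    root_name, content = next(iter(xml_dict.items()))
    root = ET.Element(root_name)
    fill(root, content)
    return root


xml_dict = {
    "mydocument": {
        "@has": "an attribute",
        "and": {"many": ["elements", "more elements"]},
        "plus": {"@a": "complex", "#text": "element as well"},
    }
}
root = dict_to_element(xml_dict)
print(ET.tostring(root, encoding="unicode"))
```

Handling attributes, text and children in a single loop keeps the recursion in one place, at the cost of not separating type fixing from tree building as the code above does.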