Firstly, I'm new to Stack Overflow, so please be kind and offer constructive criticism on how to improve this question if there is room to do so.

The problem: I want to preserve the structure of the XML file created by the code below. I want it to look like this:

<?xml version="1.0" encoding="UTF-8" ?>
<save>
    <header version="2" />
    <version major="3" minor="6" revision="2" build="0" />
    <region id="ModuleSettings">
        <node id="root">
            <children>
                <node id="ModOrder">
                    <children>
                        <node id="Module">
                            <attribute id="UUID" value="627f624f-e2b8-4b37-977e-03044e500fec" type="22" />
                        </node>
                        <node id="Module">
                            <attribute id="UUID" value="7e737d2f-31d2-4751-963f-be6ccc59cd0c" type="22" />
                        </node>
                    </children>
                </node>
                <node id="Mods">
                    <children>
                        <node id="ModuleShortDesc">
                            <attribute id="Folder" value="somestuff1" type="30" />
                            <attribute id="MD5" value="somestuff1" type="23" />
                            <attribute id="Name" value="somestuff1" type="22" />
                            <attribute id="UUID" value="627f624f-e2b8-4b37-977e-03044e500fec" type="22" />
                            <attribute id="Version" value="1" type="4" />
                        </node>
                        <node id="ModuleShortDesc">
                            <attribute id="Folder" value="somestuff2" type="30" />
                            <attribute id="MD5" value="" type="23" />
                            <attribute id="Name" value="somestuff2" type="22" />
                            <attribute id="UUID" value="7e737d2f-31d2-4751-963f-be6ccc59cd0c" type="22" />
                            <attribute id="Version" value="2" type="4" />
                        </node>
                    </children>
                </node>
            </children>
        </node>
    </region>
</save>

but instead I get this:

<?xml version="1.0" encoding="UTF-8" ?>
<save>
    <header version="2" />
    <version major="3" minor="6" revision="2" build="0" />
    <region id="ModuleSettings">
        <node id="root">
            <children>
                <node id="ModOrder">
                    <children>
                    <node id="Module"><attribute id="UUID" value="627f624f-e2b8-4b37-977e-03044e500fec" type="22" /></node><node id="Module"><attribute id="UUID" value="7e737d2f-31d2-4751-963f-be6ccc59cd0c" type="22" /></node></children>
                </node>
                <node id="Mods">
                    <children>
                    <node id="ModuleShortDesc"><attribute id="Folder" value="somestuff1" type="30" /><attribute id="MD5" value="somestuff1" type="23" /><attribute id="Name" value="somestuff1" type="22" /><attribute id="UUID" value="627f624f-e2b8-4b37-977e-03044e500fec" type="22" /><attribute id="Version" value="1" type="4" /></node><node id="ModuleShortDesc"><attribute id="Folder" value="somestuff2" type="30" /><attribute id="MD5" value="" type="23" /><attribute id="Name" value="somestuff2" type="22" /><attribute id="UUID" value="7e737d2f-31d2-4751-963f-be6ccc59cd0c" type="22" /><attribute id="Version" value="2" type="4" /></node></children></node>
            </children>
        </node>
    </region>
</save>

Focusing only on the ModOrder node, here is my current code:

# Create a Module element as object:
def new_module(uuid, ModOrder):

    ''' Example Module:
        <node id="Module">
            <attribute id="UUID" value="627f624f-e2b8-4b37-977e-03044e500fec" type="22" />
        </node>
    '''

    uuid = str(uuid)

    module = et.SubElement(ModOrder, "node")
    module.set("id", "Module")

    attribute_uuid = et.SubElement(module, "attribute")
    attribute_uuid.set("id", "UUID")
    attribute_uuid.set("value", uuid)
    attribute_uuid.set("type", "22")

    return module

def generator2():

    # mods_dictionary returns 2 lists of dictionaries:
    info = mods_dictionary(a1)

    # info[0] contains a list of dictionaries.
    # Each dictionary contains information of each mod pulled from meta.lsx file inside each pak
    data_list = info[0]
    # error_list = info[1] # Not needed

    # ModOrderTree = element tree object @ <node id="ModOrder">
    ModOrderTree = tree.xpath('//node[@id="ModOrder"]')[0]

    # ModOrder = element tree object @ <children>   
    ModOrder = ModOrderTree.find('children')

    # For each dictionary inside data_list
    for mods in data_list:
        order = new_module(mods["UUID"],ModOrder)
        desc = new_moduleshortdesc(mods["Name"], mods["Author"], mods["Version"], mods["UUID"], mods["Folder"])

    # Then write to file:    
    tree.write('testwrite.xml')

generator2()

Question: Is there a way to achieve what I want?

Please bear in mind that I'm new to programming and very much still learning, so I'm sure there are more Pythonic and efficient ways to write this code. Feel free to make suggestions if I've done anything noobish that bothers you :p

Things tried:

    t1 = et.tostring(tree, encoding="unicode", method="xml", pretty_print=True)
    with open(test_file, 'w') as f:
        f.write(t1)

    tree.write('testwrite.xml', pretty_print=True)
Rykari
  • To be more specific, you want to preserve the line breaks and indentation? That would be a question about how to serialize your XML document into a pretty format, right? Find out if your XML tools support xml:space attribute but it'd probably be more efficient to not have any extra space there, wouldn't it? – Miro Lehtonen Sep 23 '19 at 12:25
  • `tree.write('testwrite.xml', pretty_print=True)` didn't work unfortunately :( Yes that's right Miro, so the technical name for what I'm asking about is Serialization? Thanks that really helps, I'll look into it and xml:space. I'm using lxml currently. Yes it would be more efficient but I want to preserve it all for readability. This function will be adding, removing and modifying positions for potentially hundreds of nodes. Removing that white space makes any errors impossible to see. – Rykari Sep 23 '19 at 12:35
  • I think your problem is similar to these: https://stackoverflow.com/q/7903759/407651, https://stackoverflow.com/q/5086922/407651. – mzjn Sep 23 '19 at 13:23
  • I could kiss you :3 – Rykari Sep 23 '19 at 14:01
  • Possible duplicate of [Pretty print in lxml is failing when I add tags to a parsed tree](https://stackoverflow.com/questions/7903759/pretty-print-in-lxml-is-failing-when-i-add-tags-to-a-parsed-tree) – mzjn Sep 23 '19 at 14:06
  • I've no idea how I missed those, I googled the shizzle out of this problem for hours. Thanks for the help – Rykari Sep 23 '19 at 14:09

1 Answer

Solution (thanks to woodm1979, "Python pretty XML printer with lxml"):

Strip the whitespace-only text between elements from the whole document by re-parsing it with remove_blank_text=True, then let lxml re-indent everything on write with pretty_print=True:

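from lxml import etree as et

def reformat(file):
    # generator2() adds the new nodes and writes the tree to disk;
    # `file` is assumed to be the same path it wrote to.
    generator2()

    # Re-parse, discarding whitespace-only text nodes so pretty_print can re-indent
    parser = et.XMLParser(remove_blank_text=True)
    tree = et.parse(file, parser)
    tree.write(file, encoding='utf-8', pretty_print=True, xml_declaration=True)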
Daniel Haley
Rykari