0

when I launch this code he generated xml file which contain xml version <\?xml version="1.0" ?>, I tried exclude this line using xml_declaration=False, but error appears:

TypeError: prettify() got an unexpected keyword argument 'encoding'

How I can cut this string from my xml file ?

from xml.etree import ElementTree
from xml.dom import minidom
from lxml.etree import Element, SubElement



def prettify(templateXml):
    rough_string = ElementTree.tostring(templateXml)
    reparsed = minidom.parseString(rough_string)
    return reparsed.toprettyxml(indent="\t")


top = Element('Options')
element = SubElement(top, 'Some ID')
element.text = ' '
element = SubElement(top, 'Test0')
element.text = 'Some text'
SubElement(top, 'Test1', {'enabled': 'true', 'Values': 'true'})
SubElement(top, 'Test2', {'enabled': 'true', 'Values': 'true'})
SubElement(top, 'Test3', {'enabled': 'true', 'Values': 'true'})
SubElement(top, 'Test4', {'enabled': 'true', 'Test5': 'true', 'Zero': 'true'})
SubElement(top, 'Test6', {'enabled': 'true', 'Values': 'true', 'Zero': 'true'})
SubElement(top, 'Test7', {'enabled': 'true', 'Values': 'true', 'Zero': 'true'})
SubElement(top, 'Test8', {'enabled': 'true', 'Values': 'true', 'Zero': 'true'})
with open("output/some_xml_file.xml", 'w') as f:
    f.write(prettify(top))
Oleg
  • 11
  • 1
  • You need to tell the write or prettify method what encoding you want to use I suspect. Look at the arguments to these methods I bet there is a argument encoding and I would set it to "UTF-8" – Rob Oct 03 '17 at 14:54
  • 1
    return reparsed.toprettyxml(indent="\t", encoding='UTF-8') the same only added – Oleg Oct 03 '17 at 14:58

3 Answers3

0

Issue solved following changes:

from xml.etree import ElementTree
from xml.dom import minidom
from xml.etree.ElementTree import Element, SubElement


def prettify(elem):
    xml = ElementTree.tostring(elem)
    reparsed = minidom.parseString(xml)
    return reparsed.toprettyxml(indent="\t")


def strip_prologue(xml):
    if xml.startswith("<?xml"):
        return xml[xml.index(">") + 1:].lstrip()
    else:
        return xml


def generate_xml():
    top = Element('Heards')
    element = SubElement(top, 'SomeID')
    element.text = ' '
    element = SubElement(top, 'Test0')
    element.text = 'Some text'
    SubElement(top, 'Test1', {'enabled': 'true', 'Values': 'true'})
    SubElement(top, 'Test2', {'enabled': 'true', 'Values': 'true'})
    SubElement(top, 'Test3', {'enabled': 'true', 'Values': 'true'})
    SubElement(top, 'Test4', {'enabled': 'true', 'Test5': 'true', 'Zero': 'true'})
    SubElement(top, 'Test6', {'enabled': 'true', 'Values': 'true', 'Zero': 'true'})
    SubElement(top, 'Test7', {'enabled': 'true', 'Values': 'true', 'Zero': 'true'})
    SubElement(top, 'Test8', {'enabled': 'true', 'Values': 'true', 'Zero': 'true'})
    with open("output/some_xml_file.xml", 'w') as f:
        f.write(strip_prologue(prettify(top)))
Oleg
  • 11
  • 1
0

Simply use lxml's prettyprint argument in tostring(). No need for minidom or even xml.etree. Python's lxml can serve as your full XML handler. And be sure to remove the space in Some ID for valid XML names for well-formedness.

import lxml.etree as et
from lxml.etree import Element, SubElement

top = Element('Options')
element = SubElement(top, 'SomeID')
element.text = ' '
element = SubElement(top, 'Test0')
element.text = 'Some text'
SubElement(top, 'Test1', {'enabled': 'true', 'Values': 'true'})
SubElement(top, 'Test2', {'enabled': 'true', 'Values': 'true'})
SubElement(top, 'Test3', {'enabled': 'true', 'Values': 'true'})
SubElement(top, 'Test4', {'enabled': 'true', 'Test5': 'true', 'Zero': 'true'})
SubElement(top, 'Test6', {'enabled': 'true', 'Values': 'true', 'Zero': 'true'})
SubElement(top, 'Test7', {'enabled': 'true', 'Values': 'true', 'Zero': 'true'})
SubElement(top, 'Test8', {'enabled': 'true', 'Values': 'true', 'Zero': 'true'})

with open("output/some_xml_file.xml", 'wb') as f:
    f.write(et.tostring(top, xml_declaration=True, pretty_print=True, encoding="utf-8"))

Output

<?xml version='1.0' encoding='utf-8'?>
<Options>
  <SomeID> </SomeID>
  <Test0>Some text</Test0>
  <Test1 Values="true" enabled="true"/>
  <Test2 Values="true" enabled="true"/>
  <Test3 Values="true" enabled="true"/>
  <Test4 Test5="true" Zero="true" enabled="true"/>
  <Test6 Values="true" Zero="true" enabled="true"/>
  <Test7 Values="true" Zero="true" enabled="true"/>
  <Test8 Values="true" Zero="true" enabled="true"/>
</Options>
Parfait
  • 104,375
  • 17
  • 94
  • 125
0

I was stuck with a similar error and until I could find a good solution to do this, I used a not-so-great method that parsed my un-indented data and used the tostring method without using minidom.

from lxml import etree

tree = lxml.etree.parse("yourfile.xml")
pretty = lxml.etree.tostring(tree, encoding="unicode", pretty_print=True)

print(pretty) 

This worked for me when I got the same error so just putting it out there.