19

I’ve created an XML document using xml.etree.ElementTree.Element and wanted to write it out with the ElementTree.write() function, but the declaration that comes out is

<?xml version='1.0' encoding='UTF-8'?>

whereas I need it to be in double quotes. Is there a way to change that?

Bg1987
  • 1,129
  • 1
  • 11
  • 25
  • Uh...why do you need double quotes? – Tim Pietzcker May 06 '12 at 14:45
  • why does this even matter? it's equally valid xml with single or with double quotes. – mata May 06 '12 at 14:46
  • 3
    because it's an assignment and for some reason the teacher does a diff on the XML instead of comparing the elements. – Bg1987 May 06 '12 at 15:04
  • 3
    the single quotes are hardcoded in the `write` method, so it's not possible to change them. all you can do is alter them afterwards. – mata May 06 '12 at 17:01
  • 2
    @TimPietzcker here's an example from the wild: [Moodle checks for a raw declaration string](https://github.com/moodle/moodle/blob/master/backup/util/helper/convert_helper.class.php#L151) (including double-quotes) and will treat a backup file as invalid if it doesn't find it in the right place. – supervacuo Jan 11 '15 at 22:34

5 Answers

5

I had the same problem, so I looked at the code in ElementTree.py and saw the following.

For the XML declaration (single quotes):

        if method == "xml":
            write("<?xml version='1.0' encoding='%s'?>\n" % encoding)

And for the attributes (double quotes):

write(" %s=\"%s\"" % (qnames[k], v))

It's hardcoded that way...

I changed it (locally) to:

"<?xml version=\"1.0\" encoding=\"%s\"?>\n"

So every attribute is double quoted now.
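
An alternative that avoids editing the installed library is to write the declaration yourself and ask `write` to skip its own. The following is only a minimal sketch, assuming UTF-8 output; the helper name `write_double_quoted` is made up for illustration, while `xml_declaration=False` is a regular argument of `ElementTree.write`.

```python
import xml.etree.ElementTree as ET

def write_double_quoted(tree, path):
    """Write `tree` to `path` as UTF-8 with a double-quoted XML declaration.

    Hypothetical helper, not part of ElementTree.
    """
    with open(path, "wb") as f:
        # Emit the declaration ourselves, with the quoting we want.
        f.write(b'<?xml version="1.0" encoding="UTF-8"?>\n')
        # Ask ElementTree to skip its own (single-quoted) declaration.
        tree.write(f, encoding="UTF-8", xml_declaration=False)
```

Called as `write_double_quoted(ET.ElementTree(root), "out.xml")`, this leaves the element serialization itself untouched, so attribute values and text are written exactly as `write` would normally produce them.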

Stephan Schielke
  • 2,744
  • 7
  • 34
  • 40
  • How did you do that? – Braian Oct 12 '22 at 13:38
  • @Braian It is super hacky and not recommended, but I went into my virtual environment and changed the package locally (where it was downloaded, built/installed). Any future update of the package would overwrite your local changes. It is also not part of your CI. For a real change, I would raise a change request with the package maintainers. – Stephan Schielke Aug 10 '23 at 11:56
4

Eventually I used the tostring function, appended the resulting XML to the correct declaration tag, and wrote everything out with Python's file.write function. It's ugly (and I'm lying about the actual encoding of the file), but it works.

Bg1987
  • 1,129
  • 1
  • 11
  • 25
3

I did the same as Bg1987. Here is the function I wrote, in case it is useful to somebody:

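import xml.etree.ElementTree

def wrTmp(treeObject, filepath):
    # tostring() returns bytes here, so the declaration must be bytes as well
    xml_str = (b'<?xml version="1.0" encoding="UTF-8"?>' + b'\n' + xml.etree.ElementTree.tostring(treeObject.getroot(), encoding='utf-8', method='xml'))
    with open(filepath, 'wb') as xml_file:
        xml_file.write(xml_str)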
Josep Bosch
  • 848
  • 1
  • 12
  • 29
0

I had to do pretty much the same thing, except the other way around, due to hacks in various $workplace tools that demand single quotes where python's ElementTree.write puts in double quotes. (A bit of code looks for the literal string status='ok' and does not recognize status="ok". Yes, that code is broken—in several ways, actually—but I just have to work around it.)

Fortunately "user data" single or double quotes are encoded as &apos; and &quot; (respectively). In my case I was already using tostring rather than write (for other reasons), so I have:

import xml.etree.ElementTree as ET
# [... mass snippage]
        # tostring() returns bytes on Python 3, hence the bytes literals
        text = ET.tostring(reply).replace(b'"', b"'")
        # [... snippage]
        self.wfile.write(text)

(Obviously you'll want replace(b"'", b'"') instead.)
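
If a global replace feels too risky, a narrower variant is to touch only the declaration. ElementTree does not escape quotes inside element text, so a blanket replace can change the content itself. This is just a sketch: `requote_declaration` is a hypothetical helper name, and it assumes the declaration, if present, is the very first thing in the serialized bytes.

```python
import xml.etree.ElementTree as ET

def requote_declaration(xml_bytes):
    """Swap single for double quotes inside a leading <?xml ...?> only.

    Hypothetical helper; everything after the declaration is returned as-is.
    """
    end = xml_bytes.find(b"?>")
    if not xml_bytes.startswith(b"<?xml") or end == -1:
        # No declaration at the start: nothing to change.
        return xml_bytes
    end += 2  # include the closing "?>"
    return xml_bytes[:end].replace(b"'", b'"') + xml_bytes[end:]
```

Apostrophes and double quotes in text or attribute values are left alone, because only the bytes up to the first `?>` are rewritten.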

torek
  • 448,244
  • 59
  • 642
  • 775
  • The problem is that some single quoted values may contain double quotes. Simple replacement may break the content. – pepr May 06 '12 at 19:08
  • @pepr: I'll grant the theoretical possibility, but in all my tests (which of course were specific to my data, and perhaps the installed python libraries as well), `xml.etree.ElementTree` always encoded double quotes as `&quot;` if they were permitted in that position at all. – torek May 06 '12 at 23:33
0

Another way (if you also want to pretty-print your XML) is to use minidom's toprettyxml(encoding="UTF-8"), which puts the XML version and encoding declaration in double quotes:

from xml.dom import minidom

xmlDom = minidom.parse("my_file.xml")

# toprettyxml() returns bytes when an encoding is given
prettyXML = xmlDom.toprettyxml(encoding="UTF-8")

# fileName is the output path, defined elsewhere
with open(fileName, mode="wb") as myfile:
    myfile.write(prettyXML)
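
If the document only exists in memory as an ElementTree element, as in the question, the same idea might look like the sketch below. The function name `element_to_pretty_xml` and the two-space indent are assumptions for illustration; `minidom.parseString` and `toprettyxml` are the standard library calls.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

def element_to_pretty_xml(element):
    """Return pretty-printed UTF-8 bytes for an ElementTree element.

    Hypothetical helper: serializes with ElementTree, re-parses with
    minidom, and lets minidom write the double-quoted declaration.
    """
    rough = ET.tostring(element, encoding="utf-8")  # bytes, no declaration
    dom = minidom.parseString(rough)
    return dom.toprettyxml(indent="  ", encoding="UTF-8")
```

Note that this changes whitespace between elements, so it only suits cases where pretty-printed output is acceptable.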
RedJohn
  • 314
  • 3
  • 7