3

I tried xml_marshaller as follows:

from xml_marshaller import xml_marshaller

class Person:
    firstName = "John"
    lastName = "Doe"

person1 = Person()
strXmlPerson = xml_marshaller.dumps(person1);
print(strXmlPerson)

The output from above is:

b'<marshal><object id="i2" module="__main__" class="Person"><tuple/><dictionary id="i3"><string>firstName</string><string>John</string><string>lastName</string><string>Doe</string></dictionary></object></marshal>'

which when formatted looks like this, which in my opinion is the ugliest XML possible:

b'<marshal>
    <object id="i2" module="__main__" class="Person">
        <tuple/>
        <dictionary id="i3">
            <string>firstName</string>
            <string>John</string>
            <string>lastName</string>
            <string>Doe</string>
        </dictionary>
    </object>
</marshal>'

What is the b and quotes doing there? Means "binary" maybe? Is that really part of the data, or just a side effect of printing it to the console?

Is there any Python 3 library that will create something more closer to "human" like this:

<Person> 
   <firstname>John</firstname>
   <lastname>Doe<lastname>
</Person> 

I'm looking for something close to what .NET creates (see http://mylifeismymessage.net/xml-serializerdeserializer/.

Please don't tell me try JSON or YAML, that's not the question. I might want to run the file through XSLT for example.

Update 2 days later:

I like the Peter Hoffman answer here: How can I convert XML into a Python object?

person1 = Person("John", "Doe")
#strXmlPerson = xml_marshaller.dumps(person1);
person = objectify.Element("Person")
strXmlPerson = lxml.etree.tostring(person1, pretty_print=True)
print(strXmlPerson)

gives error:

TypeError: Type 'Person' cannot be serialized.

In my scenario I might already have a class structure, and don't want to switch to they way they are doing it You can I serialize my "Person" class?

NealWalters
  • 17,197
  • 42
  • 141
  • 251
  • 1
    Python has `bytes` objects that represent 8 bit octets (the `b`) and `str` objects that are unicode code point characters. A `str` object needs to be serialized to a `bytes` object before it can be written to disk. This is done with `"mystring".encode("utf-8")` for instance. The object encoding itself is a little weird, but if it decodes properly on the other side and doesn't need to conform to an xml specification, does it really matter? – tdelaney Jun 28 '20 at 18:34
  • @tdelaney I'm XML thru and thru, and like use XSLT, BizTalk etc... I'm trying to reproduce the scenario I did in C# for Python folks in a course I'm doing on EDI/X12. Looks like maybe Gnosis would work, but it's only Python2, and hard to believe nobody has updated for Python3 if it is a good library. I just updated question with what I tried with lxml.objectify, but still not there. – NealWalters Jun 28 '20 at 18:41
  • I did a lot of XML back in the day and still prefer a bit of XSLT to a zillion lines of python for data transformation, so I hear you. Getting XML into python objects is rather code intensive - there's a reason JSON swept in and took the prize. The `b` and quotes themselves are just python's way of visualizing that the content is a bytes object - they are not in the actual octets. I peeked at lxml::objectify and it looked interesting. – tdelaney Jun 28 '20 at 18:46
  • If I serialize to JSON, is there then a library to convert from JSON to XML? I think JSON is also overrated unless you are programming in JavaScript. XSLT live and well for integration developers. I've been studying up on Saxonica, since .NET doesn't support XSLT 3.0. – NealWalters Jun 28 '20 at 18:47
  • The advantage of JSON is that it will serialize (relatively compactly) basic object types that are generally deserializable in any langauge. XML is comparatively bulky and its harder to get data back into programming data structures. The fact any given XML data serialization standard can do more than a typical JSON is kinda bad - its harder to hack it into any given language scenario. – tdelaney Jun 28 '20 at 18:55
  • It would be easy to serialize JSON to xml and I bet there is a library out there, but don't know for sure. Its an interesting take on JSON data transformation... worth a look! – tdelaney Jun 28 '20 at 18:55

1 Answers1

6

The reason the output is showing xml as a dictionary is most likely because the properties don't have a reference to the class. You should consider using self. and assigning values within a __init__ function.

class Person:
    def __init__(self):
        self.firstName = "John"
        self.lastName = "Doe"

There are many ways to convert an object to XML. However try using the package dicttoxml. As the name suggest you'll need to convert object to a dictionary, which can be done using vars().

The full solution:

from dicttoxml import dicttoxml

class Person:
    def __init__(self):
        self.firstName = "John"
        self.lastName = "Doe"

person = vars(Person()) # vars is pythonic way of converting to dictionary
xml = dicttoxml(person, attr_type=False, custom_root='Person') # set root node to Person
print(xml)

Output:

b'<?xml version="1.0" encoding="UTF-8" ?><Person><firstName>John</firstName><lastName>Doe</lastName></Person>'

If you want to format the XML nicely, then you can use the built in xml.dom.minidom.parseString library.

from dicttoxml import dicttoxml
from xml.dom.minidom import parseString

class Person:
    def __init__(self):
        self.firstName = "John"
        self.lastName = "Doe"

person = vars(Person()) # vars is pythonic way of converting to dictionary
xml = dicttoxml(person, attr_type=False, custom_root='Person') # set root node to Person
print(xml)

dom = parseString(xml)
print(dom.toprettyxml())

Output:

<?xml version="1.0" ?>
<Person>
   <firstName>John</firstName>
   <lastName>Doe</lastName>
</Person>

Do check out the documentation https://pypi.org/project/dicttoxml/ as you can pass additional arguments to alter the output.

Skully
  • 2,882
  • 3
  • 20
  • 31
Greg
  • 4,468
  • 3
  • 16
  • 26
  • Perfect! I didn't know about "vars()" command, but I did have idea of dictionary to xml. Can't believe this was so hard to find on the internet. – NealWalters Jul 07 '20 at 17:01
  • Ok, now for part 2 - what about nested classes - I posted another question. https://stackoverflow.com/questions/62780800/a-better-xml-parser-for-python-3-with-nested-classes-dicttoxml dicttoxml bombs in that case? – NealWalters Jul 07 '20 at 17:31