I am trying to use ElementTree to extract information from an XML response.

The response, xmlresponse.xml, looks like:

<result xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="https://somewhere.co.uk/">
    <count>1</count>
    <pageInformation>
        <offset>0</offset>
        <size>10</size>
    </pageInformation>
    <items>
        <person uuid="1">
            <name>
                <firstName>John</firstName>
                <lastName>Doe</lastName>
            </name>
            <ManagedByRelations>
                <managedByRelation Id="1234">
                    <manager uuid="2">
                        <name formatted="false">
                            <text>Jane Doe</text>
                        </name>
                    </manager>
                    <managementPercentage>30</managementPercentage>
                    <period>
                        <startDate>2019-09-26</startDate>
                    </period>

                </managedByRelation>
                <managedByRelation Id="1234">
                    <manager uuid="3">
                        <name formatted="false">
                            <text>Joe Bloggs</text>
                        </name>
                    </manager>
                    <managementPercentage>70</managementPercentage>
                    <period>
                        <startDate>2019-09-26</startDate>
                    </period>
                </managedByRelation>
            </ManagedByRelations>
            <fte>0.0</fte>
        </person>
    </items>
</result>

How do I get at this information using ElementTree? For example, how can I retrieve the list of managers' names, IDs and start dates?

If I do:

from xml.etree.ElementTree import Element, ParseError, fromstring, tostring, parse

tree = parse('xmlresponse.xml')
root = tree.getroot()

for manager in root.findall('managedByRelation'):
    print(manager)

findall() doesn't return anything. I know I could use list(root.iter()) to iterate through everything in the tree, but why isn't root.findall() working as I expect?

abinitio
  • Since `<managedByRelation>` is not an immediate child of the document root, you need to provide an XPath expression that searches for it _anywhere_ in the tree, e.g. `root.findall('.//managedByRelation')` – Mathias R. Jessen May 26 '23 at 12:03
  • Ah ok great - thanks. And what is the best way to pull out the uuid, and start date etc? something like `for manager_rel in root.findall('.//managedByRelation'): for manager in manager_rel.findall('.//manager'): print("Manager",manager.attrib)` ? Or is there a better way? – abinitio May 26 '23 at 12:19

1 Answer

You can call iter() on each element returned by findall() to reach the nested values:

import xml.etree.ElementTree as ET

tree = ET.parse('xmlresponse.xml')
root = tree.getroot()

# './/' searches all descendants, not just the direct children of root
for manRel in root.findall('.//managedByRelation'):
    for manager in manRel.iter('manager'):
        uuid = manager.get('uuid')
    for name in manRel.iter('text'):
        full_name = name.text
    # separate loop variable, so the element is not overwritten by its text
    for percentage in manRel.iter('managementPercentage'):
        managementPercentage = percentage.text
    for startdate in manRel.iter('startDate'):
        date = startdate.text
    print(f"{uuid} {full_name:^20} {managementPercentage:^10} {date}")

Output:

2       Jane Doe           30     2019-09-26
3      Joe Bloggs          70     2019-09-26
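
As for why the original call returned nothing: a bare tag name passed to findall() matches only direct children of the element it is called on, and here the root's direct children are count, pageInformation and items. The sketch below shows that difference. It also collects the same fields with find() and findtext() rather than nested iter() loops, as one possible alternative. It is not part of the answer above. SAMPLE is a trimmed copy of the response from the question, and the managers() helper and its dictionary keys are names made up for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the response from the question (assumed to keep its structure).
SAMPLE = """<result>
    <items>
        <person uuid="1">
            <ManagedByRelations>
                <managedByRelation Id="1234">
                    <manager uuid="2">
                        <name formatted="false"><text>Jane Doe</text></name>
                    </manager>
                    <managementPercentage>30</managementPercentage>
                    <period><startDate>2019-09-26</startDate></period>
                </managedByRelation>
                <managedByRelation Id="1234">
                    <manager uuid="3">
                        <name formatted="false"><text>Joe Bloggs</text></name>
                    </manager>
                    <managementPercentage>70</managementPercentage>
                    <period><startDate>2019-09-26</startDate></period>
                </managedByRelation>
            </ManagedByRelations>
        </person>
    </items>
</result>"""


def managers(root):
    """Hypothetical helper: return one dict per managedByRelation element."""
    records = []
    for rel in root.findall(".//managedByRelation"):
        manager = rel.find("manager")  # direct child of the relation
        records.append({
            "uuid": manager.get("uuid") if manager is not None else None,
            # findtext() returns None when the path does not match
            "name": rel.findtext("manager/name/text"),
            "percentage": rel.findtext("managementPercentage"),
            "start_date": rel.findtext("period/startDate"),
        })
    return records


root = ET.fromstring(SAMPLE)

# A bare tag name only matches direct children of root, so this list is empty.
direct = root.findall("managedByRelation")

# The ".//" prefix inside managers() searches all descendants instead.
records = managers(root)
for record in records:
    print(record)
```

Using relative paths such as period/startDate ties each value to its position inside the relation, whereas iter('text') would pick up any descendant named text.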
Hermann12