<?xml version="1.0" encoding="UTF-8"?>
<studentData xmlns="http://www.myschool.com/schmea/studentData" xmlns:xsi="http://www.w3.org/2000/10/XMLSchema-instance" xsi:schemaLocation="http://www.myschool.com/schmea/studentData Studentdata.xsd">
   <stuRec>
      <as>
         <sourceSys>BBC</sourceSys>
         <acctDt>2023-04-04</acctDt>
      </as>
      <stats>
         <ss>
            <prov>AB</prov>
            <cono>1</cono>
         </ss>
      </stats>
   </stuRec>
   <stuRec>
      <as>
         <sourceSys>RCD</sourceSys>
         <acctDt>2023-05-14</acctDt>
      </as>
      <stats>
         <ss>
            <prov>ON</prov>
            <cono>2</cono>
         </ss>
      </stats>
   </stuRec>
</studentData>

import xml.etree.ElementTree as ET

mytree = ET.parse("/Users/user/student.xml")
myroot = mytree.getroot()
tag = myroot.tag
print(tag)
#attr=myroot.attrib
#print(attr)

for p in myroot.findall('.//studentData'):
    acctDt = p.find('acctDt').text

My XML file (student.xml) looks like the XML above. When I run the Python code, I can print the root tag and its attributes, but I get nothing from the loop. I want to get acctDt and prov:

user@star ~ % python -u "/Users/user/student.py"
{http://www.myschool.com/schmea/studentData}studentData
{'{http://www.w3.org/2000/10/XMLSchema-instance}schemaLocation': 'http://www.myschool.com/schmea/studentData Studentdata.xsd'}
user@star ~ % 
Hai Vu
Trango
  • This will help you https://www.youtube.com/watch?v=qAaVKCi3wu0&list=PUsFz0IGS9qFcwrh7a91juPg&index=12 – Dejene T. Jun 26 '23 at 05:07
  • You are not taking the `http://www.myschool.com/schmea/studentData` namespace into account. See https://stackoverflow.com/a/20447459/407651 – mzjn Jun 26 '23 at 06:59

3 Answers


You should adjust your loop, because your XML declares a default namespace and every element belongs to it. Do something like:

ns = {'': 'http://www.myschool.com/schmea/studentData'}
for node in myroot.findall('.//acctDt', ns):
    print(node.text)

Compare the "Parsing XML with Namespaces" section of the xml.etree.ElementTree documentation.
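
The original loop prints nothing because ElementTree stores every tag as `{namespace-uri}localname`, so an unqualified search for `acctDt` never matches. As a minimal sketch (the trimmed inline XML and the prefix `s` are my own choices for illustration, not taken from the question), here are three ways to spell the same search. To my knowledge the `{*}` wildcard needs Python 3.8 or later; the explicit prefix and the brace form also work on older versions.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the question's XML, kept inline so the sketch is self-contained.
XML = """<studentData xmlns="http://www.myschool.com/schmea/studentData">
   <stuRec><as><sourceSys>BBC</sourceSys><acctDt>2023-04-04</acctDt></as></stuRec>
   <stuRec><as><sourceSys>RCD</sourceSys><acctDt>2023-05-14</acctDt></as></stuRec>
</studentData>"""

NS_URI = "http://www.myschool.com/schmea/studentData"
root = ET.fromstring(XML)

# Unqualified name: matches nothing, because the real tag is "{NS_URI}acctDt".
unqualified = [e.text for e in root.findall(".//acctDt")]

# 1. Explicit prefix mapping ("s" is an arbitrary prefix chosen here).
with_prefix = [e.text for e in root.findall(".//s:acctDt", {"s": NS_URI})]

# 2. Namespace URI written inline in braces (the form ElementTree uses internally).
with_braces = [e.text for e in root.findall(f".//{{{NS_URI}}}acctDt")]

# 3. Namespace wildcard: any namespace, local name acctDt (Python 3.8+).
with_wildcard = [e.text for e in root.findall(".//{*}acctDt")]

print(unqualified)
print(with_prefix, with_braces, with_wildcard)
```

The explicit prefix is the most portable choice; the wildcard is the shortest when the document uses only one namespace.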

Hermann12
  • This worked. Thanks. With the above solution I would have to create a loop for each column. How can I print all the columns (sourceSys, acctDt, prov, cono) in one row, the way acctDt is printed? – Trango Jun 28 '23 at 03:51
  • @TahirBelutsch, see below. – Hermann12 Jun 28 '23 at 18:46

I hope this works for you. It uses lxml and fully qualified tag names:

from lxml import etree
tree = etree.parse('./xml_schema_info.xml')
root = tree.getroot()
ele_sets = set()
for ele in root.xpath('.//*'):
    ele_sets.add(ele.tag)
print(f'elements: \n{ele_sets}\nTotal: {len(ele_sets)}')
acctDt = '{http://www.myschool.com/schmea/studentData}acctDt'
for ele in root.iter(acctDt):
    print(f'acctDt: {ele.text}')
prov = '{http://www.myschool.com/schmea/studentData}prov'
for ele in root.iter(prov):
    print(f'prov: {ele.text}')
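
If lxml is not installed, the same idea can be sketched with the standard library, since `Element.iter()` in `xml.etree.ElementTree` also accepts a fully qualified tag. The helper `qname` and the trimmed inline XML are hypothetical, added here only for illustration.

```python
import xml.etree.ElementTree as ET

NS_URI = "http://www.myschool.com/schmea/studentData"

# Trimmed copy of the question's XML so the sketch runs on its own.
XML = f"""<studentData xmlns="{NS_URI}">
   <stuRec>
      <as><acctDt>2023-04-04</acctDt></as>
      <stats><ss><prov>AB</prov></ss></stats>
   </stuRec>
   <stuRec>
      <as><acctDt>2023-05-14</acctDt></as>
      <stats><ss><prov>ON</prov></ss></stats>
   </stuRec>
</studentData>"""


def qname(local_name: str) -> str:
    """Hypothetical helper: build the '{uri}name' form ElementTree uses for tags."""
    return f"{{{NS_URI}}}{local_name}"


root = ET.fromstring(XML)

# iter() walks the whole tree and yields only elements with the given qualified tag.
acct_dates = [el.text for el in root.iter(qname("acctDt"))]
provinces = [el.text for el in root.iter(qname("prov"))]

# Distinct local names, with the namespace part stripped for readability.
local_names = sorted({el.tag.split("}", 1)[-1] for el in root.iter()})

print(acct_dates)
print(provinces)
print(local_names)
```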
Muhammad Ali

For your extended question, loop over each stuRec and look up the four fields inside it:

import xml.etree.ElementTree as ET
from io import StringIO

xml_str="""<?xml version="1.0" encoding="UTF-8"?>
<studentData xmlns="http://www.myschool.com/schmea/studentData" xmlns:xsi="http://www.w3.org/2000/10/XMLSchema-instance" xsi:schemaLocation="http://www.myschool.com/schmea/studentData Studentdata.xsd">
   <stuRec>
      <as>
         <sourceSys>BBC</sourceSys>
         <acctDt>2023-04-04</acctDt>
      </as>
      <stats>
         <ss>
            <prov>AB</prov>
            <cono>1</cono>
         </ss>
      </stats>
   </stuRec>
   <stuRec>
      <as>
         <sourceSys>RCD</sourceSys>
         <acctDt>2023-05-14</acctDt>
      </as>
      <stats>
         <ss>
            <prov>ON</prov>
            <cono>2</cono>
         </ss>
      </stats>
   </stuRec>
</studentData>"""

f = StringIO(xml_str)

tree = ET.parse(f)
root = tree.getroot()

ns = {'': 'http://www.myschool.com/schmea/studentData'}

# Each stuRec is one output row; search within it for the four fields.
for stuRec in root.findall('.//stuRec', ns):
    sourceSys = stuRec.find('.//sourceSys', ns).text
    acctDt = stuRec.find('.//acctDt', ns).text
    prov = stuRec.find('.//prov', ns).text
    cono = stuRec.find('.//cono', ns).text

    # Fixed-width columns, comma separated.
    print(f"{sourceSys:<3},{acctDt:>15},{prov:>6},{cono:>5}")

Output:

BBC,     2023-04-04,    AB,    1
RCD,     2023-05-14,    ON,    2
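
One caveat with `find(...).text`: if a record lacks one of the elements, `find` returns `None` and the `.text` access raises `AttributeError`. A more tolerant variant, sketched here under my own assumptions, uses `findtext` with a default value and collects each record as a dict before writing CSV. The `FIELDS` tuple, the `records` name, and the second record's missing `prov` are made up for illustration and are not part of the question's data.

```python
import csv
import io
import xml.etree.ElementTree as ET

NS = {"": "http://www.myschool.com/schmea/studentData"}  # '' prefix needs Python 3.8+
FIELDS = ("sourceSys", "acctDt", "prov", "cono")  # column order, chosen for this sketch

# The second record deliberately has no <prov>, to show the default in action.
XML = """<studentData xmlns="http://www.myschool.com/schmea/studentData">
   <stuRec>
      <as><sourceSys>BBC</sourceSys><acctDt>2023-04-04</acctDt></as>
      <stats><ss><prov>AB</prov><cono>1</cono></ss></stats>
   </stuRec>
   <stuRec>
      <as><sourceSys>RCD</sourceSys><acctDt>2023-05-14</acctDt></as>
      <stats><ss><cono>2</cono></ss></stats>
   </stuRec>
</studentData>"""

root = ET.fromstring(XML)

# One dict per stuRec; findtext returns the default when the element is absent.
records = [
    {name: rec.findtext(f".//{name}", default="", namespaces=NS) for name in FIELDS}
    for rec in root.findall(".//stuRec", NS)
]

# Write the rows as CSV to an in-memory buffer (swap in a real file as needed).
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
writer.writeheader()
writer.writerows(records)
print(buffer.getvalue())
```

Keeping the rows as dicts makes it easy to hand them to `csv.DictWriter` or another consumer without repeating the lookup code for each column.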
Hermann12
  • Excellent! This is what I needed. Thanks very much. Any advice on how to become an expert in Python and XML? Is there a preferred tutorial or source you would recommend? – Trango Jun 29 '23 at 02:00