17

I'm parsing XML in Python with ElementTree:

import xml.etree.ElementTree as ET 
tree = ET.parse('try.xml')
root = tree.getroot()

I want to parse all the XML files in a given directory. The user should enter only the directory name, and I should be able to loop through all the files in that directory and parse them one by one. Can someone suggest an approach? I'm using Linux.

Martijn Pieters
Abhishek
  • Are all your files in the same folder, or are they in subfolders? – namit Mar 28 '13 at 10:31
  • 2
    You can use `glob`, e.g. `glob.glob('*.xml')`; this returns the list of XML files, which you can then parse one by one. – avasal Mar 28 '13 at 10:34
  • They are in the same folder – Abhishek Mar 28 '13 at 10:37
  • @Abhishek, if they are in the same folder as your code: `for filename in os.listdir(path):` `if not filename.endswith('.xml'): continue` `print(filename)`. The `fullname = os.path.join(path, filename)` line can be omitted in that case. – tursunWali Feb 18 '21 at 04:07
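
The `glob` suggestion in the comments above can be sketched roughly as follows. This is only an illustration: the helper names `find_xml_files` and `parse_directory` are hypothetical and not from the original posts, and it assumes all XML files sit directly in the given directory (no subfolders), as the asker confirmed.

```python
import glob
import os
import xml.etree.ElementTree as ET


def find_xml_files(directory):
    """Return the sorted paths of all .xml files directly inside directory."""
    # glob.escape protects directory names that contain glob characters
    # such as '[' or '*'; only the '*.xml' part acts as a wildcard.
    pattern = os.path.join(glob.escape(directory), '*.xml')
    return sorted(glob.glob(pattern))


def parse_directory(directory):
    """Parse every .xml file in directory and return {path: root element}."""
    roots = {}
    for path in find_xml_files(directory):
        # glob already returns paths that include the directory prefix
        roots[path] = ET.parse(path).getroot()
    return roots
```

Unlike `os.listdir()`, `glob.glob()` returns paths that already include the directory part, so no separate `os.path.join()` call is needed before parsing.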

2 Answers

22

Just create a loop over os.listdir():

import os
import xml.etree.ElementTree as ET

path = '/path/to/directory'
for filename in os.listdir(path):
    # Skip anything that is not an XML file
    if not filename.endswith('.xml'):
        continue
    # os.listdir() returns bare names, so rebuild the full path
    fullname = os.path.join(path, filename)
    tree = ET.parse(fullname)
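
If some files in the directory might be malformed, the same loop can be wrapped in a generator that reports and skips them. This is a minimal sketch, not part of the original answer: the function name `parse_all` is hypothetical, and skipping bad files (rather than stopping) is an assumption about the desired behaviour.

```python
import os
import xml.etree.ElementTree as ET


def parse_all(directory):
    """Yield (filename, root) for each well-formed .xml file in directory."""
    for filename in sorted(os.listdir(directory)):
        if not filename.endswith('.xml'):
            continue
        fullname = os.path.join(directory, filename)
        try:
            root = ET.parse(fullname).getroot()
        except ET.ParseError as exc:
            # Assumption: a malformed file is reported and skipped
            print(f'Skipping {filename}: {exc}')
            continue
        yield filename, root
```

To let the user supply the directory, the argument could come from `input()` or `sys.argv[1]`, for example `for name, root in parse_all(input('Directory: ')): ...`.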
freedev
Martijn Pieters
0
import os
import xml.etree.ElementTree as ET

def parse_xml(xml_path):
    tree = ET.parse(xml_path)
    root = tree.getroot()
    classname = root.find('.//testcase').get('classname')
    time = float(root.get('time'))
    return classname, time

def main():
    data_folder = 'programming/assignment-1/data/'
    xml_files = [f for f in os.listdir(data_folder) if f.endswith('.xml')]