I'm parsing XML files and have a follow-on question from here. From the XML element below:
<enrollment type="Anticipated">30</enrollment>
I would like to pull out the value of the type attribute (here, 'Anticipated') and the number. In the files that I have, the element name 'enrollment' and the attribute name 'type' stay the same between files, but the attribute value does not (e.g. sometimes it is 'Actual' or something else), and neither does the number.
The code that I tried:
from lxml import etree
import sys
import glob
list_to_get = ['enrollment']
list_of_files = glob.glob('*xml')
for each_file in list_of_files:
    # try:
    tree = etree.parse(each_file)
    root = tree.getroot()
    for node in root.xpath("//" + 'enrollment'):
        for e in node.xpath('descendant-or-self::*[not(*)]'):
            if e.attrib:
                print e.attrib
                print e.find('type')
                print e.find('.//type')
                print e.attrib['type']
                print e.find(e.attrib['type']).text
Using this method, I can pull out the type (e.g. Anticipated/Actual), but I can't find any way to pull out the number. If someone knows which print line I should use, I would appreciate it.
I did look at some similar questions (e.g. here) but their suggestions don't seem to work for me.
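To make the goal concrete, here is a minimal sketch of the output I am after, assuming the number is simply the text content of the enrollment element (so it would be read via .text on the element itself, not via find()). It uses the standard library's xml.etree.ElementTree so it runs without extra dependencies; lxml's etree elements expose the same .text and .get(). The `<study>` wrapper and the `enrollment_info` helper are hypothetical names made up for this illustration, and the syntax is Python 3.

```python
import xml.etree.ElementTree as ET

# Sample element from the question, wrapped in a hypothetical <study> root.
SAMPLE = '<study><enrollment type="Anticipated">30</enrollment></study>'


def enrollment_info(root):
    """Return (type, number) pairs for every <enrollment> element under root."""
    results = []
    for node in root.iter('enrollment'):
        kind = node.get('type')             # attribute value, None if absent
        number = (node.text or '').strip()  # text between the opening and closing tags
        results.append((kind, number))
    return results


root = ET.fromstring(SAMPLE)
for kind, number in enrollment_info(root):
    print(kind, number)
```

In terms of the loop above, this would presumably correspond to printing e.text next to e.attrib['type'], since 'type' is an attribute of the element and not a child element that find() could locate.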