EDIT: I found a way to make it work. It turns out I had an elem.clear() call that I didn't show in the code below; I apologize for that. I have updated the code so you can see how it was. Once I moved that call inside the if statement, the problem went away. But I still don't understand how clear() was called before the if statement had finished.
I have an XML file that sort of looks like this:
<alarm> <alarm_id> 127688705 </alarm_id> <site> 1 </site> <event_time> 14/08/31 00:01:00 </event_time> <cease_time> 14/08/31 00:07:00 </cease_time> <problem_text>
Something went wrong </problem_text> </alarm>
I know it isn't nicely formatted, but that's how my script receives it, so I wanted to give you guys the whole picture. The file basically has hundreds of <alarm> elements under a <root> element.
What I want to do is parse the file with iterparse and get all the text from the child elements of <alarm>. My script so far looks like this:
import sys
import xml.etree.cElementTree as etree

try:
    sourcefile = open('file.xml')
except IOError:
    print('Cannot open file.xml')
    sys.exit(-1)

for event, elem in etree.iterparse(sourcefile):
    if elem.tag == 'alarm':
        print("event:", event)
        for child in elem:
            print(child.tag, child.text)
    # This is the call I originally left out; note it sits outside the if.
    elem.clear()
But I get None as the result of child.text. Here's the output I get when I run the script:
[big@bang src]$ ./parse_xml.py
event: end
alarm_id None
site None
event_time None
cease_time None
problem_text None
Can you guys give me a hand with this?
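For reference, here is a minimal sketch of the working version described in the edit at the top, with elem.clear() moved inside the if. It is only a sketch under a few assumptions: SAMPLE is a hypothetical inline stand-in for file.xml (one alarm wrapped in <root>), read_alarms is a helper name I made up for illustration, and it imports xml.etree.ElementTree because cElementTree is no longer available on Python 3.9 and later. The first part also lists the order in which iterparse hands out elements, which seems to be the explanation for the None values: with the default settings each child's "end" event arrives before its parent's, so a clear() at loop level has already emptied every child by the time the <alarm> element shows up.

```python
import io
import xml.etree.ElementTree as etree

# Hypothetical inline sample standing in for file.xml (one alarm under <root>).
SAMPLE = b"""<root>
<alarm> <alarm_id> 127688705 </alarm_id> <site> 1 </site> <event_time> 14/08/31 00:01:00 </event_time> <cease_time> 14/08/31 00:07:00 </cease_time> <problem_text>
Something went wrong </problem_text> </alarm>
</root>"""

# Order in which iterparse yields elements with its default ("end") events:
# every child is reported before the <alarm> element that contains it.
end_order = [elem.tag for _, elem in etree.iterparse(io.BytesIO(SAMPLE))]
print(end_order)


def read_alarms(source):
    """Yield one dict of {child tag: stripped text} per <alarm> element."""
    for event, elem in etree.iterparse(source):
        if elem.tag == 'alarm':
            # The children still hold their text here, because nothing
            # cleared them when their own "end" events were processed.
            yield {child.tag: (child.text or '').strip() for child in elem}
            # Free the finished <alarm> subtree only after it has been read.
            elem.clear()


alarms = list(read_alarms(io.BytesIO(SAMPLE)))
for alarm in alarms:
    for tag, text in alarm.items():
        print(tag, text)
```

Clearing only the finished <alarm> element still keeps memory use down for a file with hundreds of alarms, since each subtree is emptied right after it has been read.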