EDIT: I found a way to make it work. I had an elem.clear() call that I left out of the code below; I apologize for that. I have updated the code to show it as it really was. Moving that call inside the if statement made the problem go away, but I still don't understand how clear() could be called before the if block had finished.

I have an XML file that sort of looks like this:

<alarm> <alarm_id>   127688705 </alarm_id> <site> 1     </site> <event_time> 14/08/31 00:01:00    </event_time> <cease_time> 14/08/31 00:07:00    </cease_time> <problem_text>
    Something went wrong                                     </problem_text> </alarm>

I know it isn't properly formatted, but that's how my script receives it, so I wanted to give you guys the whole picture. The file has hundreds of <alarm> elements under a single <root> element.

What I want to do is parse the file with iterparse and get all the text from the child elements of each <alarm>. My script so far looks like this:

import xml.etree.cElementTree as etree

try:
    sourcefile = open('file.xml')
except IOError:
    print('Cannot open ', sourcefile)
    return -1

for event, elem in etree.iterparse(sourcefile):
    if elem.tag == 'alarm':
        print("event:", event)
        for child in elem:
            print(child.tag, child.text)
    elem.clear()

But I get None as a result from child.text. Here's the output I get when I run the script:

[big@bang src]$ ./parse_xml.py
event: end
alarm_id None
site None
event_time None
cease_time None
problem_text None

Can you guys give me a hand with this?

eliasvc

2 Answers

0

Remove the return statement (return is not valid outside a function) and the code as originally posted, without the elem.clear() call, works fine.
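If you would rather keep a return, one option is to wrap the loop in a function. The following is only a sketch under a few assumptions: the function name parse_alarms and the in-memory sample are made up for illustration, xml.etree.ElementTree is used because cElementTree was removed in Python 3.9, and elem.clear() sits inside the if, as described in the question's EDIT.

```python
import io
import xml.etree.ElementTree as etree  # cElementTree was removed in Python 3.9


def parse_alarms(source):
    """Return one {tag: text} dict per <alarm> element found in source.

    parse_alarms is a hypothetical helper name, not part of the original script.
    """
    alarms = []
    # By default iterparse only reports 'end' events.
    for event, elem in etree.iterparse(source):
        if elem.tag == 'alarm':
            alarms.append({child.tag: (child.text or '').strip() for child in elem})
            elem.clear()  # free the alarm only after its children were read
    return alarms


# Made-up sample standing in for file.xml.
sample = io.BytesIO(
    b"<root><alarm><alarm_id> 127688705 </alarm_id><site> 1 </site></alarm></root>"
)
alarms = parse_alarms(sample)
print(alarms)
```

Passing an open file or a filename instead of the BytesIO object should work the same way, since iterparse accepts either.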

0

I had the same problem: my root element had its text and attributes, but the child elements had no text and no children of their own. My original code was:

    for _, element in ET.iterparse(file_in):
        el = shape_element(element)
        if el:
            data.append(el)
        element.clear()

The code that works, and does not clear the child elements' text, is:

    for _, element in ET.iterparse(file_in):
        el = shape_element(element)
        if el:
            data.append(el)
            element.clear()
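As for why this happens: as far as I can tell, iterparse fires the 'end' event for each child before the 'end' event of its parent, so an unconditional clear() wipes every child's text in an earlier loop iteration, before the parent is ever handled. Here is a minimal sketch that shows the order; the XML string, the child_texts helper and its clear_every_element flag are invented for this demonstration.

```python
import io
import xml.etree.ElementTree as ET

# Made-up document with the same shape as the one in the question.
XML = b"<root><alarm><site>1</site><problem_text>oops</problem_text></alarm></root>"


def child_texts(clear_every_element):
    """Return (order of 'end' events, text of <alarm>'s children)."""
    seen = []   # tags in the order their 'end' events fire
    texts = []
    for _, elem in ET.iterparse(io.BytesIO(XML)):
        seen.append(elem.tag)
        if elem.tag == 'alarm':
            texts = [child.text for child in elem]
            elem.clear()
        elif clear_every_element:
            # Runs for <site> and <problem_text> before <alarm> ends,
            # which is what an unconditional clear() does.
            elem.clear()
    return seen, texts


order, buggy = child_texts(clear_every_element=True)
_, fixed = child_texts(clear_every_element=False)
print(order)  # ['site', 'problem_text', 'alarm', 'root']
print(buggy)  # [None, None]
print(fixed)  # ['1', 'oops']
```

Note that clear() resets an element's text, attributes and children but does not detach it from its parent, which is why the child tags still print while their text is None.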