I have an XML file with the following structure:

<article>
<body>
text1
<collectionlink>
text2
</collectionlink>
text3
</body>
</article>

I used iterparse for parsing, but it's not printing the data correctly. Here is my code:

from xml.etree.ElementTree import iterparse

def main():
    with open("sam.xml", 'r') as fp:
        tree_dict = create_dict_tree_elements(fp)

def create_dict_tree_elements(fp):
    depth = 0
    for event, node in iterparse(fp, ['start', 'end', 'start-ns', 'end-ns']):
        # Namespace events carry no element, so skip them.
        if event == 'start-ns' or event == 'end-ns':
            continue
        # Root element: print its text and go one level deeper.
        if event == 'start' and depth == 0:
            print(node.text)
            depth += 1
            continue

        # Nested element: print its text and go one level deeper.
        if event == 'start' and depth > 0:
            print(node.text)
            depth += 1

        if event == 'end':
            depth -= 1


if __name__ == '__main__':
    main()

My expected output:

text1
text2
text3

The output I am getting:

text1
text2

1 Answer

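In ElementTree, node.text holds only the text between an element's opening tag and the first tag that follows it (its first child, or its own closing tag). Text that follows an element's closing tag, up to the next tag, is stored in that element's node.tail. In your document, text3 is therefore the tail of <collectionlink>, not part of the text of <body>, so printing node.text alone never shows it.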
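As a rough illustration of the text/tail split, here is a minimal sketch in Python 3 that walks the sample document and collects each element's text followed by the tail of each of its children, which gives the chunks in document order. The helper names `iter_text_chunks` and `collect_text` and the `SAMPLE` string are made up for this example, and it parses the whole document with `fromstring` rather than `iterparse`, because as far as I know `iterparse` does not guarantee that `text` and `tail` are populated yet when a 'start' event fires.

```python
from xml.etree.ElementTree import fromstring

# Hypothetical stand-in for the contents of sam.xml from the question.
SAMPLE = """<article>
<body>
text1
<collectionlink>
text2
</collectionlink>
text3
</body>
</article>"""


def iter_text_chunks(node):
    """Yield the non-blank text chunks under node, in document order."""
    # Text between this element's opening tag and the next tag.
    if node.text and node.text.strip():
        yield node.text.strip()
    for child in node:
        # Everything inside the child comes first...
        yield from iter_text_chunks(child)
        # ...then the text between the child's closing tag and the next tag.
        if child.tail and child.tail.strip():
            yield child.tail.strip()


def collect_text(xml_string):
    """Parse xml_string and return its text chunks as a list."""
    return list(iter_text_chunks(fromstring(xml_string)))


print("\n".join(collect_text(SAMPLE)))
```

If only the concatenated text is needed, the built-in `Element.itertext()` walks text and tail in the same order, though it also yields the whitespace-only chunks that this sketch filters out.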

newtover