I have a very large XML file produced by an application. Part of its tree is shown below:
There are several children under 'item', here numbered 0 to 7. They are always named with numbers, starting at 0 and going up to any number. Each of them contains multiple elements, all with the same structure as in the tree above. Only the numbered part (0 to 7 here) varies; the rest of the structure stays the same. Under each numbered item there is a <bbmds_questiontype> value, which can be Multiple Choice, Matching, or Essay.
What I need is a list of the <mat_formattedtext> values together with the question type, i.e. the output is supposed to be:
<0>
<bbmds_questiontype>Multiple Choice</bbmds_questiontype>
<mat_formattedtext>This is first question </mat_formattedtext></0>
<1>
<bbmds_questiontype>Multiple Choice</bbmds_questiontype>
<mat_formattedtext>This is second question </mat_formattedtext> </1>
<2>
<bbmds_questiontype>Essay</bbmds_questiontype>
<mat_formattedtext>This is first question </mat_formattedtext> </2>
....
I have tried several solutions, including an XML tree parser and xmltodict, but they all get complicated because the filters have to be applied across different branches of children.
import xmltodict

# Parse the whole file into nested dicts/lists
with open("C:/Users/SS/Desktop/moodlexml/00001_questions.dat") as fd:
    doc = xmltodict.parse(fd.read())

# Narrow down to the 'item' branch that holds the numbered children
shortened = doc['questestinterop']['assessment']['section']['item']
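One direction that might avoid hand-written filters for every branch is a recursive search by key name. Below is a minimal sketch, not a tested solution for the real file. It assumes that `shortened` is a list of nested dicts (which is what xmltodict produces for repeated elements) and that both tags occur somewhere under each item. The helper names `find_values` and `summarize`, and the keys `itemmetadata`, `presentation` and `flow` in the sample data, are made up for illustration; the real nesting will differ.

```python
def find_values(node, key):
    """Yield every value stored under `key` anywhere in nested dicts/lists."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            else:
                yield from find_values(v, key)
    elif isinstance(node, list):
        for child in node:
            yield from find_values(child, key)


def summarize(items):
    """Return (question type, first formatted text) for each item."""
    rows = []
    for item in items:
        qtype = next(find_values(item, "bbmds_questiontype"), None)
        text = next(find_values(item, "mat_formattedtext"), None)
        rows.append((qtype, text))
    return rows


# Hypothetical stand-in for `shortened`; the real structure is deeper.
sample_items = [
    {"itemmetadata": {"bbmds_questiontype": "Multiple Choice"},
     "presentation": {"flow": {"mat_formattedtext": "This is first question"}}},
    {"itemmetadata": {"bbmds_questiontype": "Essay"},
     "presentation": {"flow": [{"mat_formattedtext": "This is second question"}]}},
]

for index, (qtype, text) in enumerate(summarize(sample_items)):
    print(index, qtype, text)
```

With the real data this would be called as `summarize(shortened)`. Two caveats: if the file contains only one item, xmltodict returns a dict instead of a list, and if a tag carries attributes its value comes back as a dict (with the text under `#text`) rather than a plain string.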
Any advice on how to proceed would be appreciated.