I have an XML file like this:
<table chgFlag="i" id="II325">
<row chgFlag="i" idVal="1">
<fld id="II325A">
<datVl chgFlag="i">1</datVl>
</fld>
<fld id="II325B">
<datVl chgFlag="i">2001-12-01</datVl>
</fld>
<fld id="II325C">
<datVl chgFlag="i">2006-04-30</datVl>
</fld>
<fld id="II325D">
<datVl chgFlag="i">01</datVl>
</fld>
</row>
<row chgFlag="i" idVal="2">
<fld id="II325A">
<datVl chgFlag="i">2</datVl>
</fld>
<fld id="II325B">
<datVl chgFlag="i">2006-05-01</datVl>
</fld>
<fld id="II325C">
<datVl chgFlag="i">2031-11-30</datVl>
</fld>
<fld id="II325D">
<datVl chgFlag="i">01</datVl>
</fld>
</row>
</table>
If I just put it into read_xml I get something like this:
chgFlag idVal fld
0 i 1 NaN
1 i 2 NaN
It take sthe attributes in each row as columns. I don't want that, I want value of id in fld as the column and the text inside datVl as the value.
Something like this.
I manage to get the result I wanted using this code:
data_dict = xmltodict.parse(ET.tostring(table))
table_list = []
if type(data_dict["table"]["row"]) == list:
for row in data_dict["table"]["row"]:
row_dict = {}
for field in row["fld"]:
row_dict[field["@id"]] = field["datVl"]["#text"]
table_list.append(row_dict)
df = pd.DataFrame(table_list)
I was wondering if there is more a general solution, perhaps setting some parameter inside read_xml?
I might need to scale my current solution, that is why I am asking.