I want to parse an XML file and save part of it to a text file.

My XML file looks as follows:

I am only interested in the class attribute of each INSTANCE element.

<ADOXML adoversion="Version 5.1" username="Admin" database="adoxxdb" time="09:49" date="18.09.2019" version="3.1">
   <MODELS>
      <MODEL version="" applib="ADOxx 1.5 Dynamic Experimentation Library" libtype="bp" modeltype="DSML4VPL" name="DSML4VPL - new (2)" id="mod.29201">
         <INSTANCE name="Online entry point-42200" id="obj.42200" class="Online entry point">
            <ATTRIBUTE name="Position" type="STRING">NODE x:2cm y:4cm index:1</ATTRIBUTE>
            <ATTRIBUTE name="External tool coupling" type="STRING"/>
         </INSTANCE>
         <INSTANCE name="Interact-42206" id="obj.42206" class="Interact">
            <ATTRIBUTE name="Position" type="STRING">NODE x:7.5cm y:4cm index:2</ATTRIBUTE>
            <ATTRIBUTE name="External tool coupling" type="STRING"/>
            <ATTRIBUTE name="Comment" type="STRING"/>
            <ATTRIBUTE name="Description" type="STRING"/>
            <ATTRIBUTE name="Open Questions" type="STRING"/>
         </INSTANCE>
         <INSTANCE name="Select-42210" id="obj.42210" class="Select">
            <ATTRIBUTE name="Position" type="STRING">NODE x:12.5cm y:4cm index:4</ATTRIBUTE>
            <ATTRIBUTE name="External tool coupling" type="STRING"/>
            <ATTRIBUTE name="Comment" type="STRING"/>
            <ATTRIBUTE name="Description" type="STRING"/>
            <ATTRIBUTE name="Open questions" type="STRING"/>
         </INSTANCE>
      </MODEL>
   </MODELS>
</ADOXML>

I just want to write every class value, such as "Online entry point" or "Interact", to a .txt file.

The output should look like this:

Class
Online entry point
Interact
Select

My code looks as follows:

import xml.etree.ElementTree as ET
tree=ET.parse("test1.xml")
root=tree.getroot()    
with open("file3.txt","w")as f:
    f.write("Class\n")
    for xclass in root.findall("MODEL"):
        Klasse=xclass.find("INSTANCE").get("class")
        line_to_write=Klasse
        with open("file3.txt","a") as f:
            f.write(line_to_write)

However, I do not know what I am doing wrong. There is no error message, just a .txt file with only "Class" in it.

mzjn
3razOr1993

3 Answers

1

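The problem is that findall only searches the immediate children of an element when it is given a bare tag name (see: ElementTree findall() returning empty list). Here the root element is ADOXML and MODEL is nested one level deeper, under MODELS, so root.findall("MODEL") returns an empty list and the loop body never runs.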
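
To see the difference, here is a minimal sketch that compares a bare tag name with a ".//" path. The SAMPLE string is a trimmed-down stand-in for the question's file (same nesting, fewer attributes), not the original data.

```python
import xml.etree.ElementTree as ET

# Trimmed-down stand-in for the question's file (same nesting, fewer attributes).
SAMPLE = """<ADOXML>
  <MODELS>
    <MODEL id="mod.29201">
      <INSTANCE id="obj.42200" class="Online entry point"/>
      <INSTANCE id="obj.42206" class="Interact"/>
    </MODEL>
  </MODELS>
</ADOXML>"""

root = ET.fromstring(SAMPLE)

# A bare tag name only matches direct children of the element it is called on.
direct_model = root.findall("MODEL")      # MODEL is a grandchild of ADOXML
direct_models = root.findall("MODELS")    # MODELS is a direct child of ADOXML

# A ".//" prefix searches the whole subtree below the element.
nested_instances = root.findall(".//INSTANCE")
classes = [inst.get("class") for inst in nested_instances]

print(len(direct_model), len(direct_models), classes)
```

The first lookup comes back empty, which is why the original loop wrote nothing after the header.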

You can simply iterate over all INSTANCE elements with iter() and pick out the attributes you're looking for.

import xml.etree.ElementTree as ET
tree=ET.parse("test1.xml")
root=tree.getroot() 

# Get "class" attribute of "INSTANCE" tags.
line_to_write = []
for xclass in root.iter("INSTANCE"):
    line_to_write.append(xclass.get("class"))

# Writing to a file with space as delimiter
with open("file3.txt", "w") as f:
    f.write("Class\n")
    # str() guards against None when an INSTANCE has no "class" attribute
    f.write(" ".join(str(word) for word in line_to_write))
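
If you would rather have one class per line, as in the output shown in the question, a newline can be used as the delimiter instead of a space. The following is a sketch under that assumption; class_lines is a hypothetical helper name, and the inline XML is a stand-in for test1.xml so the snippet runs on its own.

```python
import xml.etree.ElementTree as ET

def class_lines(root, header="Class"):
    """Return the header plus one INSTANCE class value per line."""
    # get() returns None for INSTANCE elements without a "class" attribute; skip those.
    values = [inst.get("class") for inst in root.iter("INSTANCE")]
    return "\n".join([header] + [v for v in values if v is not None]) + "\n"

# Inline stand-in for test1.xml (hypothetical, reduced to the relevant attributes).
root = ET.fromstring(
    '<ADOXML><MODELS><MODEL>'
    '<INSTANCE class="Online entry point"/>'
    '<INSTANCE class="Interact"/>'
    '<INSTANCE class="Select"/>'
    '</MODEL></MODELS></ADOXML>'
)
text = class_lines(root)
print(text, end="")
```

To write the real file, pass ET.parse("test1.xml").getroot() to class_lines and write the returned string to file3.txt opened in "w" mode.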
1

I think you're close.

A couple of things I'm not sure of...

  • Why iterate over MODEL? Could there be more than one? Should each MODEL be a separate text file?
  • Why are you trying to open the text file a second time?

Based on your current examples, you should be able to use findall(".//INSTANCE") to iterate over each INSTANCE element.

Here's an example that produces your requested output with your supplied example...

import xml.etree.ElementTree as ET

tree = ET.parse("test1.xml")

with open("file3.txt", "w") as f:
    f.write("Class\n")
    for instance in tree.findall(".//INSTANCE"):
        f.write(f"{instance.get('class')}\n")
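
If it turns out that a file can contain more than one MODEL, one possible approach is to group the class values per MODEL first and decide afterwards whether each group goes into its own text file. This is only a sketch: classes_per_model is a hypothetical helper, the model names "A" and "B" are made up, and it assumes MODEL names are unique within one file.

```python
import xml.etree.ElementTree as ET

def classes_per_model(root):
    """Map each MODEL's name attribute to the class values of its INSTANCE elements."""
    result = {}
    for model in root.iter("MODEL"):
        # Assumption: MODEL names are unique within one file.
        result[model.get("name")] = [
            inst.get("class") for inst in model.iter("INSTANCE")
        ]
    return result

# Made-up sample with two models; the names "A" and "B" are placeholders.
SAMPLE = """<ADOXML>
  <MODELS>
    <MODEL name="A">
      <INSTANCE class="Online entry point"/>
      <INSTANCE class="Interact"/>
    </MODEL>
    <MODEL name="B">
      <INSTANCE class="Select"/>
    </MODEL>
  </MODELS>
</ADOXML>"""

grouped = classes_per_model(ET.fromstring(SAMPLE))
for name, classes in grouped.items():
    print(name, classes)
```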
Daniel Haley
0

I am still working on the same problem, but now I have a different goal. It is nearly the same problem, yet I cannot get it right.

My XML file looks like this:

<ADOXML adoversion="Version 5.1" username="Admin" database="adoxxdb" time="10:39" date="21.10.2019" version="3.1">
<MODELS>
<MODEL version="" applib="ADOxx 1.5 Dynamic Experimentation Library" libtype="bp" modeltype="Ressource Model" name="Ressource Model - new" id="mod.47204">
<MODELATTRIBUTES>
<INSTANCE name="Collection of written documents-49041" id="obj.49041" class="Collection of written documents">
<ATTRIBUTE name="Position" type="STRING">NODE x:7cm y:1.5cm index:1</ATTRIBUTE>
<ATTRIBUTE name="External tool coupling" type="STRING"/>
<ATTRIBUTE name="Comment" type="STRING"/>
<ATTRIBUTE name="Description" type="STRING"/>
<ATTRIBUTE name="Referenced Document" type="PROGRAMCALL">ITEM "" param:""</ATTRIBUTE>
<ATTRIBUTE name="Display file name" type="INTEGER">0</ATTRIBUTE>
<ATTRIBUTE name="Type of written documents" type="ENUMERATIONLIST">Demonstration;Portfolio</ATTRIBUTE>
</INSTANCE>
</MODEL>
</MODELS>
</ADOXML>

I am only interested in the ATTRIBUTE element whose content is Demonstration;Portfolio (name "Type of written documents", type ENUMERATIONLIST). I just want to write the values Demonstration and Portfolio to a text file.

My code:

import xml.etree.ElementTree as ET
tree=ET.parse(r"C:\Users\benni\Google Drive\MASTER\Masterarbeit\Coden\Resource.xml")
with open(r"C:\Users\benni\Google Drive\MASTER\Masterarbeit\Coden\Output88.txt", "w")as f:
    f.write("Class\n")
    for instance in tree.findall(".//ATTRIBUTE"):
        f.write(f"{instance.get('type')}\n")

I know it is the same problem. However, I cannot find the right path to get a text file that contains only the words Demonstration and Portfolio.
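
One likely cause: get('type') returns the value of the type attribute (STRING, INTEGER, ENUMERATIONLIST and so on), while Demonstration;Portfolio is the element's content, which ElementTree exposes as .text. Below is a minimal sketch that selects the ATTRIBUTE by its name and splits the content on ";". It assumes the element of interest is the one named "Type of written documents", and SAMPLE is a shortened, well-formed stand-in for the file above rather than the original.

```python
import xml.etree.ElementTree as ET

# Shortened, well-formed stand-in for the file above (assumption, not the original).
SAMPLE = """<ADOXML>
  <MODELS>
    <MODEL name="Ressource Model - new" id="mod.47204">
      <INSTANCE name="Collection of written documents-49041" id="obj.49041" class="Collection of written documents">
        <ATTRIBUTE name="Position" type="STRING">NODE x:7cm y:1.5cm index:1</ATTRIBUTE>
        <ATTRIBUTE name="Comment" type="STRING"/>
        <ATTRIBUTE name="Display file name" type="INTEGER">0</ATTRIBUTE>
        <ATTRIBUTE name="Type of written documents" type="ENUMERATIONLIST">Demonstration;Portfolio</ATTRIBUTE>
      </INSTANCE>
    </MODEL>
  </MODELS>
</ADOXML>"""

root = ET.fromstring(SAMPLE)

values = []
# The [@name='...'] predicate keeps only ATTRIBUTE elements with that name.
for attr in root.findall(".//ATTRIBUTE[@name='Type of written documents']"):
    # .text holds the element content; it is None for empty elements.
    if attr.text:
        values.extend(part.strip() for part in attr.text.split(";") if part.strip())

print("\n".join(values))
```

With the real file, replace ET.fromstring(SAMPLE) with ET.parse(path).getroot() and write "\n".join(values) to the output file instead of printing it.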

3razOr1993