I want to extract all the text from an XML file with Python.

Is there a way to do this?

I heard that it is possible using xml.etree.ElementTree.

For example:

<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank>4</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
    <country name="Panama">
        <rank>68</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>

I just want to extract: 1 2008 141100 4 2011 and so on...

1 Answer

Hope this helps

from xml.etree.ElementTree import parse

tree = parse('data.xml')  # parse the file (change the name to your file name)

# iter() walks the whole tree; findall('year') would only match direct children of <data>
for e in tree.iter('year'):  # visits every 'year' tag
    print(e.text)
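If you want all the text rather than a single tag, as in the question, one option is Element.itertext(). The following is only a minimal sketch: it inlines a shortened copy of the XML from the question so it runs on its own, and the names XML and texts are just placeholders. With a real file you would use parse('data.xml').getroot() instead of fromstring().

```python
import xml.etree.ElementTree as ET

# Shortened copy of the XML from the question, inlined so the sketch
# runs without a file on disk (an assumption made for illustration).
XML = """<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
    </country>
    <country name="Singapore">
        <rank>4</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
</data>"""

root = ET.fromstring(XML)

# itertext() yields every text node in document order, including the
# whitespace between tags, so strip each piece and drop the empty ones.
texts = [t.strip() for t in root.itertext() if t.strip()]

print(" ".join(texts))  # prints: 1 2008 141100 4 2011 59900
```

Note that the neighbor elements contribute nothing here because they carry their data in attributes, not in text.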

You can also store the values in a dict, with the name of the country as the key and a list of that country's values as the value.
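A rough sketch of the dict idea, assuming "values" means the text of each child tag (rank, year, gdppc) and not the XML attributes of neighbor. The XML is a shortened inline copy of the question's data, and values_by_country is a made-up name.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the XML from the question, inlined so the sketch
# runs without a file on disk (an assumption made for illustration).
XML = """<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
    </country>
    <country name="Singapore">
        <rank>4</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
</data>"""

root = ET.fromstring(XML)

values_by_country = {}
for country in root.iter("country"):
    # Key: the country's "name" attribute.
    # Value: the text of each child tag that has text; the empty
    # <neighbor/> tags have text None and are skipped.
    values_by_country[country.get("name")] = [
        child.text.strip()
        for child in country
        if child.text and child.text.strip()
    ]

print(values_by_country)
```

If you also need the neighbors, child.attrib gives you the name and direction of each neighbor tag as a dict.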

PMM
  • Does not work. `open` should be `parse`. And there are no `title` elements in the XML document in the question. – mzjn Aug 07 '19 at 11:44
  • 'title' was just a random name I put in. In your case you can use rank, year, neighbor, etc. – PMM Aug 07 '19 at 11:46
  • Sorry, I took this from an old script I made a while ago. – PMM Aug 07 '19 at 12:17
  • The code still has at least two syntax errors. Why don't you test it before posting? – mzjn Aug 07 '19 at 12:26