I have an XML file structured as follows:

<?xml version="1.0" encoding="UTF-8"?>
<TEI>
   A
   <placeName xml:id="ene.0" n="0" key="geonames 644285" ref="http://www.geonames.org/644285">Pralognan</placeName>
   suivre
   <placeName xml:id="ene.3" n="2" subtype="compound" key="osm 2272301" ref="http://www.openstreetmap.org/way/2272301">
      la route entre
      <placeName xml:id="ene.1" n="1" key="osm 178528565" ref="http://www.openstreetmap.org/node/178528565">
         l'hôtel  de la
         <placeName n="0">Vanoise</placeName>
      </placeName>
      et celui du
      <placeName xml:id="ene.2" n="0" key="osm 3379120" ref="http://www.openstreetmap.org/way/3379120">Petit Mont Blanc</placeName>
   </placeName>
</TEI>

And this Python code to parse it:

import xml.etree.ElementTree as ET  # cElementTree was removed in Python 3.9

parse_file = open("file.xml", "r")
tree_parse_file = ET.parse(parse_file)
root_parse_file = tree_parse_file.getroot()

for child in root_parse_file:  # iterates over the direct children of root
    if "ref" in child.attrib.keys():
        # some code...
        for subChild in child:  # iterates over the direct children of child
            print(subChild.attrib['ref'])  # this is line 59 of my code
            # some code...

When I want to iterate over this element

<placeName xml:id="ene.3" ...>

to get all nested elements and parse their attributes, I get the following error on the line print(subChild.attrib['ref']):

Traceback (most recent call last):
  File "./generate_long_lat2.py", line 59, in <module>
    print(subChild.attrib['ref'])
KeyError: 'ref'

even though the ref attribute exists on a child of that element:

<placeName xml:id="ene.1" ...>

My question is: how can I iterate over all nested descendants of the root element?

marOne

1 Answer

To iterate over the attributes of a specific tag, you can use this code (here, every placeName tag that has an xml:id attribute):

from lxml import etree

tree = etree.parse("file.xml")

for attributes in tree.xpath("//placeName[(@xml:id)]"):
    for name, value in attributes.items():
        print(f'{name} = {value}')

Output:

{http://www.w3.org/XML/1998/namespace}id = ene.0
n = 0
key = geonames 644285
ref = http://www.geonames.org/644285
{http://www.w3.org/XML/1998/namespace}id = ene.3
n = 2
subtype = compound
key = osm 2272301
ref = http://www.openstreetmap.org/way/2272301
{http://www.w3.org/XML/1998/namespace}id = ene.1
n = 1
key = osm 178528565
ref = http://www.openstreetmap.org/node/178528565
{http://www.w3.org/XML/1998/namespace}id = ene.2
n = 0
key = osm 3379120
ref = http://www.openstreetmap.org/way/3379120

Documentation: https://lxml.de/tutorial.html#elements-carry-attributes-as-a-dict
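If you would rather stay with the standard library, here is a minimal sketch of the same idea. It assumes the XML from the question, inlined as a string so the snippet runs on its own, and the helper name collect_refs is made up for illustration. Element.iter() walks every descendant at any depth, and Element.get() returns None instead of raising KeyError when an element has no ref attribute, as is the case for <placeName n="0">Vanoise</placeName>.

```python
import xml.etree.ElementTree as ET

# Copy of the XML from the question, inlined so the example is self-contained.
XML = """<TEI>
   A
   <placeName xml:id="ene.0" n="0" key="geonames 644285" ref="http://www.geonames.org/644285">Pralognan</placeName>
   suivre
   <placeName xml:id="ene.3" n="2" subtype="compound" key="osm 2272301" ref="http://www.openstreetmap.org/way/2272301">
      la route entre
      <placeName xml:id="ene.1" n="1" key="osm 178528565" ref="http://www.openstreetmap.org/node/178528565">
         l'hôtel  de la
         <placeName n="0">Vanoise</placeName>
      </placeName>
      et celui du
      <placeName xml:id="ene.2" n="0" key="osm 3379120" ref="http://www.openstreetmap.org/way/3379120">Petit Mont Blanc</placeName>
   </placeName>
</TEI>"""


def collect_refs(root):
    """Return the ref attribute of every placeName descendant that has one.

    Hypothetical helper, not part of any library.
    """
    refs = []
    # iter() visits matching elements at every nesting level, in document order.
    for element in root.iter("placeName"):
        ref = element.get("ref")  # None when the attribute is missing
        if ref is not None:
            refs.append(ref)
    return refs


root = ET.fromstring(XML)
refs = collect_refs(root)
for ref in refs:
    print(ref)
```

With a file on disk, ET.parse("file.xml").getroot() would take the place of ET.fromstring(XML). Using get() rather than attrib['ref'] is what avoids the KeyError on elements that carry no ref.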

earw