0

How do I get the rect elements from this XML file using the lxml library in Python?

I couldn't find the proper XPath to get the tags.

alecxe
hans-t

1 Answer

2

You need to handle namespaces, including the default (unprefixed) one:

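from lxml import etree as ET

# Map of prefixes to namespace URIs, used only by the XPath query.
namespaces = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "cc": "http://creativecommons.org/ns#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "svg": "http://www.w3.org/2000/svg",
    # "myns" is an arbitrary prefix bound to the document's default namespace.
    "myns": "http://www.w3.org/2000/svg"
}

tree = ET.fromstring(data)
for rect in tree.xpath("//myns:rect", namespaces=namespaces):
    print(rect.attrib.get("id"))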

where data is an XML string you've provided.

For testing purposes, it just prints the id attribute of each rect element:

rect3347
rect3349
rect3351
rect3351-1
rect3351-17
rect3351-1-4
rect3397
rect3399
rect3401
rect3403
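To see why the prefix matters, here is a minimal self-contained sketch. The tiny SVG document and its ids (`rect1`, `rect2`) are made up for illustration and stand in for the file from the question. It compares an unprefixed query, the prefixed query used above, and a `local-name()` query that needs no prefix map:

```python
from lxml import etree

# Hypothetical miniature SVG standing in for the file from the question.
SVG = b"""<svg xmlns="http://www.w3.org/2000/svg"
     xmlns:dc="http://purl.org/dc/elements/1.1/">
  <g>
    <rect id="rect1" width="10" height="10"/>
    <rect id="rect2" width="20" height="20"/>
  </g>
</svg>"""

root = etree.fromstring(SVG)

# In XPath 1.0 an unprefixed name means "no namespace", so this matches nothing.
unqualified = root.xpath("//rect")

# Bind any prefix to the default namespace URI and use that prefix in the path.
ns = {"svg": "http://www.w3.org/2000/svg"}
qualified = [r.get("id") for r in root.xpath("//svg:rect", namespaces=ns)]

# Alternative without a prefix map: compare the local name only.
by_local_name = [r.get("id") for r in root.xpath("//*[local-name()='rect']")]

print(len(unqualified), qualified, by_local_name)
```

The `local-name()` form is convenient for quick inspection, but it also matches `rect` elements from any other namespace, so the explicit prefix map is usually the safer choice.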
alecxe