I have an XML file with many levels. Each level may have a namespace attached to it. I want to find
a specific element whose name I know, but not its namespace. For example:
my_file.xml
<?xml version="1.0" encoding="UTF-8"?>
<data xmlns="aaa:bbb:ccc:ddd:eee">
<country name="Liechtenstein" xmlns="aaa:bbb:ccc:liechtenstein:eee">
<rank updated="yes">2</rank>
<year>2008</year>
<gdppc>141100</gdppc>
<neighbor name="Austria" direction="E"/>
<neighbor name="Switzerland" direction="W"/>
</country>
<country name="Singapore" xmlns="aaa:bbb:ccc:singapore:eee">
<continent>Asia</continent>
<holidays>
<christmas>Yes</christmas>
</holidays>
<rank updated="yes">5</rank>
<year>2011</year>
<gdppc>59900</gdppc>
<neighbor name="Malaysia" direction="N"/>
</country>
<country name="Panama" xmlns="aaa:bbb:ccc:panama:eee">
<rank updated="yes">69</rank>
<year>2011</year>
<gdppc>13600</gdppc>
<neighbor name="Costa Rica" direction="W"/>
<neighbor name="Colombia" direction="E"/>
</country>
</data>
import lxml.etree as etree
tree = etree.parse('my_file.xml')
root = tree.getroot()
cntry_node = root.find('.//country')
The find above does not return anything to cntry_node. In my real data, the levels are deeper than this example. The lxml documentation talks about namespaces. When I do this:
root.nsmap
I see this:
{None: 'aaa:bbb:ccc:ddd:eee'}
Could someone explain how to access the full nsmap and/or how to use it to find a specific element? Thanks very much.
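For reference, here is a minimal sketch of the kind of namespace-agnostic lookup being asked about. It assumes a reasonably recent lxml, where find() and iter() accept the {*} namespace wildcard, and it uses a trimmed inline copy of the sample XML rather than my_file.xml. The variable names (first, names, via_xpath, all_ns) are illustrative only. Since root.nsmap only reports the namespaces in scope at the root, the sketch also walks the tree to collect the rest.

```python
import lxml.etree as etree

# Trimmed inline copy of the sample document (assumption: same shape as my_file.xml).
XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<data xmlns="aaa:bbb:ccc:ddd:eee">
  <country name="Liechtenstein" xmlns="aaa:bbb:ccc:liechtenstein:eee">
    <rank updated="yes">2</rank>
  </country>
  <country name="Singapore" xmlns="aaa:bbb:ccc:singapore:eee">
    <rank updated="yes">5</rank>
  </country>
</data>"""

root = etree.fromstring(XML)

# Option 1: the {*} wildcard matches the local name in any namespace.
first = root.find('.//{*}country')
names = [c.get('name') for c in root.iter('{*}country')]

# Option 2: XPath with local-name(), which ignores the namespace entirely.
via_xpath = root.xpath('//*[local-name() = "country"]')

# The namespace of a matched element can be read back with QName.
first_ns = etree.QName(first).namespace

# Collect every namespace URI in scope anywhere in the tree;
# each element's nsmap includes what it inherits plus what it declares.
all_ns = set()
for el in root.iter('*'):
    all_ns.update(el.nsmap.values())

print(names)
print(first_ns)
print(sorted(all_ns))
```

Once the namespace of an element is known, it can also be passed explicitly, for example root.find('.//{aaa:bbb:ccc:panama:eee}country') against the full file.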