Given the following XML:
<?xml version="1.0" encoding="UTF-8"?>
<entry xmlns="http://www.w3.org/2005/Atom">
<id>1</id>
<title>Example XML</title>
<published>2021-12-15T00:00:00Z</published>
<updated>2022-01-06T12:44:47Z</updated>
<content type="application/xml">
<articleDoc xmlns="" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" schemaVersion="1.8" xml:lang="en">
<articleDocHead>
<itemInfo/>
</articleDocHead>
</articleDoc>
</content>
</entry>
How can I get the value of the xml:lang attribute of the entry/content/articleDoc element? I've checked the Python docs, but unfortunately they don't cover attributes with namespaces. The solution I found, manually writing the namespace URI in braces in front of the attribute name as the dictionary key, seems wrong. I'm working with Python 3.9.9.
Here's my code so far:
import xml.etree.ElementTree as tree
xml = """<?xml version="1.0" encoding="UTF-8"?>
<entry xmlns="http://www.w3.org/2005/Atom">
<id>1</id>
<title>Example XML</title>
<published>2021-12-15T00:00:00Z</published>
<updated>2022-01-06T12:44:47Z</updated>
<content type="application/xml">
<articleDoc xmlns="" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" schemaVersion="1.8" xml:lang="en">
<articleDocHead>
<itemInfo/>
</articleDocHead>
</articleDoc>
</content>
</entry>"""
ns = {'nitf': 'http://iptc.org/std/NITF/2006-10-18/',
'w3': 'http://www.w3.org/2005/Atom',
'xml': 'http://www.w3.org/XML/1998/namespace'}
root = tree.fromstring(xml)
entry_id = root.find("w3:id", ns).text # works
print(entry_id)
type_attribute = root.find("w3:content", ns).attrib['type'] # works
print(type_attribute)
#language = root.find("w3:content/articleDoc/articleDocHeader[xml:lang']", ns) # doesn't work
language = root.find("w3:content/articleDoc", ns).attrib['{http://www.w3.org/XML/1998/namespace}lang'] # works, but seems wrong
print(language)
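For reference, here is the same lookup written without hard-coding the braces in the key. This is only a minimal sketch: it assumes the standard-library `ElementTree.QName` helper is an acceptable way to build the `{uri}lang` key, and `XML_NS`, `doc`, `article`, and `lang_key` are names made up for this example. I don't know whether this is the intended approach.

```python
import xml.etree.ElementTree as ET

# The 'xml' prefix is always bound to this URI by the XML namespaces spec.
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Trimmed-down version of the document from the question (illustrative only).
doc = ET.fromstring(
    '<content xmlns="http://www.w3.org/2005/Atom">'
    '<articleDoc xmlns="" schemaVersion="1.8" xml:lang="en"/>'
    '</content>'
)

# articleDoc resets the default namespace (xmlns=""), so it needs no prefix.
article = doc.find("articleDoc")

# QName builds the "{uri}local" form that ElementTree uses for attribute keys.
lang_key = ET.QName(XML_NS, "lang").text

# .get() returns None instead of raising KeyError if the attribute is missing.
lang = article.get(lang_key)
print(lang)
```

Using `.get()` instead of `.attrib[...]` is a choice made for the sketch, so that a missing attribute yields `None` rather than an exception.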
Any help is appreciated. Thanks a lot!