(Please bear with me for a moment while I tell the following little story, before I ask my question!)
I have some SVG element generated by MathJax that, after generation, looks like so (as found in the element inspector):
<svg xmlns:xlink="http://www.w3.org/1999/xlink" width="6.768ex" height="2.468ex" viewBox="0 -825.2 2914.1 1062.4">
<defs>...</defs>
<g>...</g>
</svg>
When I try to display this SVG on its own in Chrome or Safari, the browser does not render it and instead displays the following message:
This XML file does not appear to have any style information associated with it. The document tree is shown below. [...]
After some experimentation, I found that the culprit is a missing 'xmlns' attribute, i.e. the default namespace declaration. (I guess MathJax puts another SVG higher up in the page that has this attribute, so inside the web page, it doesn't need to be repeated a second time. Or something.) Namely, changing the opening <svg> tag to this allows the SVG to be displayed on its own by the browser:
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="6.768ex" height="2.468ex" viewBox="0 -825.2 2914.1 1062.4">
<defs>...</defs>
<g>...</g>
</svg>
(Note the new xmlns attribute.)
OK. Good.
Now I want to automate this task of adding the missing xmlns attribute. I want to use the Python lxml library for this.
Unfortunately (finally coming to my question!), lxml seems to hide all attributes that start with 'xmlns' and I don't know why. While it allows me to add the 'xmlns' attribute (e.g., by doing
root.attrib['xmlns'] = "http://www.w3.org/2000/svg"
where root is the root <svg> element of the document), I cannot test whether the 'xmlns' attribute is already there. In fact, if I run the script twice on the same file, two separate xmlns attributes end up being added, which in turn causes lxml to complain and crash.
So: (i) why is lxml hiding certain attributes from me, and (ii) regardless of that, how can I add the xmlns attribute only if it isn't there already? (Of course I could manually parse the file, but I'd like a self-contained solution using lxml.)