I'm writing code that reorganizes the namespaces in an arbitrary XML document, potentially changing their prefixes. That was pretty straightforward until I ran into the xsi:type attribute:
<foo xsi:type="xs:string">...</foo>
If I change the xs prefix of the XSD namespace, I have to make the same change in this xsi:type value, e.g. turning it into
<foo i:type="x:string">...</foo>
This attribute is well known. However, in general, if I find markup like this:
<foo xmlns:aaa="http://bbb">
<bar name="aaa:123">...</bar>
</foo>
Is there a way to tell that in the "aaa:123" value the "aaa" part refers to the "http://bbb" namespace?
I.e. it could be that the name is simply "aaa:123", without any intended reference to the namespace with "aaa" prefix, and the match is accidental.
If it helps, the implementation language is Java.
Update/Solution:
Thanks to the helpful explanations and pointers provided in the answers below, I have modified my code to work by the following rules when it encounters an attribute that has a prefixed value:
- For the xsi:type attribute, update the attribute value’s prefix to match the new prefix for http://www.w3.org/2001/XMLSchema.
- If in the current context there IS NO namespace with a matching prefix, the value is considered literal (not QName) and left as is.
- If in the current context there IS a namespace with a matching prefix, we cannot tell if the attribute value is literal or QName, and so the code cancels the processing and leaves the document as is. The document is not modified at all.
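The three rules above can be sketched roughly as follows. This is only an illustration, not the linked code: it assumes a namespace-aware DOM, and the class name `PrefixedValueRules`, the `Decision` enum, and both `classify` methods are hypothetical names made up for this example. Only `lookupNamespaceURI` and the other `org.w3c.dom` / JAXP calls are standard API.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper illustrating the three rules; not the actual implementation.
class PrefixedValueRules {
    static final String XSI_NS = "http://www.w3.org/2001/XMLSchema-instance";

    // QNAME: safe to rewrite the prefix; LITERAL: leave as is; AMBIGUOUS: cancel processing.
    enum Decision { QNAME, LITERAL, AMBIGUOUS }

    static Decision classify(Attr attr) {
        // Rule 1: xsi:type is defined to hold a QName, so its prefix is a real namespace reference.
        if (XSI_NS.equals(attr.getNamespaceURI()) && "type".equals(attr.getLocalName())) {
            return Decision.QNAME;
        }
        String value = attr.getValue();
        int colon = value.indexOf(':');
        if (colon <= 0) {
            return Decision.LITERAL; // no "prefix:" part at all
        }
        String prefix = value.substring(0, colon);
        // Resolve the prefix against the namespaces in scope on the owning element.
        String uri = attr.getOwnerElement().lookupNamespaceURI(prefix);
        // Rule 2: no matching namespace in scope, so the value is literal.
        // Rule 3: a matching namespace exists, so we cannot tell literal from QName.
        return uri == null ? Decision.LITERAL : Decision.AMBIGUOUS;
    }

    // Convenience overload for trying the rules on a string; attrName is the qualified name.
    static Decision classify(String xml, String elementName, String attrName) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for namespace lookups to work
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Element element = (Element) doc.getElementsByTagName(elementName).item(0);
            return classify(element.getAttributeNode(attrName));
        } catch (Exception ex) {
            throw new IllegalStateException("Could not parse or inspect the XML", ex);
        }
    }
}
```

With the `<foo xmlns:aaa="http://bbb">` example from the question, `classify(xml, "bar", "name")` falls under the third rule, because "aaa" is bound in scope and nothing in the document says whether the value is meant as a QName.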
For anyone interested, the code is here.
I know the logic could be improved by leaving untouched only the namespaces affected by the ambiguous attributes, rather than the whole document, but it is Good Enough(tm) for me.