I've been given conflicting information on how an XML Schema 1.1 based XML validator validates element text values that are supposed to conform to the xs:QName primitive datatype when they are not prefixed.

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns="foo:uri"
           targetNamespace="foo:uri"
           elementFormDefault="qualified"
           attributeFormDefault="unqualified"
           xml:lang="en"
           version="1.1">
  <xs:element name="bad-attribute" type="xs:QName"/>
</xs:schema>

Let's say the above schema definition matches the following element in an instance document:

<?xml version="1.0" encoding="UTF-8"?>
<bad-attribute xmlns="foo:uri">foo</bad-attribute>

Does the lexical value foo map to {foo:uri, foo} or to {null, foo} (that is, an absent namespace name) in this primitive datatype's value space? Note: my notation denotes a {namespace-name, local-part} tuple.

It seems like the first option could be what the specification prescribes, but I'm not sure.

When QNames appear in an XML context, the bindings to be used in the ·lexical mapping· are those in the [in-scope namespaces] property of the relevant element.

[in-scope namespaces] An unordered set of namespace information items, one for each of the namespaces in effect for this element. This set always contains an item with the prefix xml which is implicitly bound to the namespace name http://www.w3.org/XML/1998/namespace. It does not contain an item with the prefix xmlns (used for declaring namespaces), since an application can never encounter an element or attribute with that prefix. The set will include namespace items corresponding to all of the members of [namespace attributes], except for any representing declarations of the form xmlns="" or xmlns:name="", which do not declare a namespace but rather undeclare the default namespace and prefixes. When resolving the prefixes of qualified names this property should be used in preference to the [namespace attributes] property; they may be inconsistent in the case of Synthetic Infosets.

[namespace attributes] An unordered set of attribute information items, one for each of the namespace declarations (specified or defaulted from the DTD) of this element. Declarations of the form xmlns="" and xmlns:name="", which undeclare the default namespace and prefixes respectively, count as namespace declarations. Prefix undeclaration was added in Namespaces in XML 1.1. By definition, all namespace attributes (including those named xmlns, whose [prefix] property has no value) have a namespace URI of http://www.w3.org/2000/xmlns/. If the element has no namespace declarations, this set has no members.

Would:

<?xml version="1.0" encoding="UTF-8"?>
<foo:bad-attribute xmlns:foo="foo:uri">foo</foo:bad-attribute>

result in this lexical value being mapped to {null, foo} in the value space of this datatype?

I would appreciate it if you could back your answer with relevant specification citations.

predi

1 Answer

I think that the intent is that it should map to {null, foo}, but I can't find anything in the spec that says that unambiguously.

UPDATE

I believe that answer was incorrect and that it should map to {foo:uri, foo} (assuming the typo in the question is corrected). But I agree that it's not 100% clear from the spec.
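
To make the rule concrete, here is a minimal Python sketch of the lexical mapping as I read it, assuming the XML host-language default applies (unqualified names are bound to the default namespace, if there is one). The function name `resolve_qname` and the dictionary representation of [in-scope namespaces] (prefix mapped to namespace name, with `""` standing for the default namespace) are my own illustrative choices, not anything defined by the spec or by a real validator API.

```python
from typing import Mapping, Optional, Tuple


def resolve_qname(
    lexical: str, in_scope: Mapping[str, str]
) -> Tuple[Optional[str], str]:
    """Map a QName literal to a (namespace-name, local-part) tuple.

    `in_scope` is a hypothetical stand-in for the element's
    [in-scope namespaces]: prefix -> namespace name, with the empty
    string as the key for the default namespace.
    """
    prefix, colon, local = lexical.partition(":")
    if not colon:
        # Unprefixed literal: assumed to bind to the default namespace
        # when one is in scope, otherwise to no namespace (None).
        return (in_scope.get("") or None, lexical)
    if prefix not in in_scope:
        raise ValueError(f"prefix {prefix!r} is not bound in scope")
    return (in_scope[prefix], local)


# First instance document: xmlns="foo:uri" is in scope.
first = resolve_qname("foo", {"": "foo:uri"})

# Second instance document: only xmlns:foo="foo:uri" is in scope,
# so there is no default namespace for the unprefixed literal.
second = resolve_qname("foo", {"foo": "foo:uri"})
```

Under these assumptions, the first document yields the tuple with namespace name `foo:uri`, while the second yields an absent namespace, because a prefix binding does not supply a default namespace.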

Michael Kay
  • Do you perhaps recall a discussion somewhere in which this intention would be evident? This corner case is probably not relevant to XSD 1.1 validator implementations, hence the vague specification text, but I've come across a case where the semantics of an instance document may change depending on this intent. – predi Jun 01 '23 at 04:51
  • Having implemented an XSD processor myself, I seem to recall asking the experts this question and getting this answer. It is highly relevant to validators, e.g. when enforcing uniqueness constraints, or when interpreting the `xsi:type` attribute. – Michael Kay Jun 01 '23 at 15:35
  • No, sorry, I've just checked, and Saxon maps it to `(uri:foo, foo)`. (There's a typo in the question, by the way, `uri:foo` vs `foo:uri`). I guess the explanation is that [in-scope namespaces] does include a binding of "" to "uri:foo", and this binding is used in the lexical mapping; there is also the statement that "If the host language does not specify otherwise, unqualified names are bound to the default namespace." – Michael Kay Jun 01 '23 at 16:03
  • Yeah, that was a typo (now corrected). You are right. It is right there in the spec. The part I conveniently did not quote... Plus using Saxon as a reference does carry some weight, I'd say. Thank you for taking the time to answer this. – predi Jun 01 '23 at 18:53