
I'm trying to process an XML document and determine which namespaces are defined in it, but I'm getting inconsistent results from XmlNamespaceManager.HasNamespace. As the reader moves through the document, HasNamespace returns false for a prefix even though that prefix is still declared and in scope.

Sample code:

    var ctx = new XmlParserContext(null, new XmlNamespaceManager(new NameTable()), null, XmlSpace.None);
    var set = new XmlReaderSettings() { IgnoreComments = true, IgnoreProcessingInstructions = true, IgnoreWhitespace = true };

    using (var xml = new StringReader(
        "<?xml version=\"1.0\" encoding=\"utf-8\"?>" +
        "<rdf:RDF " +
        "  xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\"> " +
        "  <rdf:Description rdf:about=\"x\" /> " +
        "</rdf:RDF>"))
    using (var rdr = XmlReader.Create(xml, set, ctx))
    {
        rdr.MoveToContent();

        Console.WriteLine(rdr.Name);
        Console.WriteLine(rdr.LookupNamespace("rdf"));
        Console.WriteLine(ctx.NamespaceManager.HasNamespace("rdf"));    // True
        
        rdr.Read();
        
        Console.WriteLine(rdr.Name);
        Console.WriteLine(rdr.LookupNamespace("rdf"));
        Console.WriteLine(ctx.NamespaceManager.HasNamespace("rdf"));    // False
        
        rdr.Read();
        
        Console.WriteLine(rdr.Name);
        Console.WriteLine(rdr.LookupNamespace("rdf"));
        Console.WriteLine(ctx.NamespaceManager.HasNamespace("rdf"));    // True
    }


mford

1 Answer


As the reader enters each new element, it will call PushScope on the namespace manager. Once it leaves the element (via the end of a self-closing tag or the corresponding end tag), it calls PopScope.

HasNamespace, unlike some other members of the namespace manager, only answers the question for the current scope.

Gets a value indicating whether the supplied prefix has a namespace defined for the current pushed scope.

(My emphasis)
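To see the difference without a reader involved, here is a minimal sketch that drives an XmlNamespaceManager by hand. The class name ScopeDemo and the helper IsPrefixInScope are made up for illustration; the sketch assumes that LookupNamespace, which searches the enclosing scopes as well, is an acceptable substitute when you need "is this prefix bound anywhere in scope".

```csharp
using System;
using System.Xml;

static class ScopeDemo
{
    // Hypothetical helper: true if the prefix is bound in the current scope
    // or any enclosing one. LookupNamespace returns null for unknown prefixes.
    public static bool IsPrefixInScope(XmlNamespaceManager mgr, string prefix)
    {
        return mgr.LookupNamespace(prefix) != null;
    }

    static void Main()
    {
        var mgr = new XmlNamespaceManager(new NameTable());

        // Simulate entering <rdf:RDF xmlns:rdf="...">
        mgr.PushScope();
        mgr.AddNamespace("rdf", "http://www.w3.org/1999/02/22-rdf-syntax-ns#");

        // Simulate entering a child element that declares nothing itself
        mgr.PushScope();
        Console.WriteLine(mgr.HasNamespace("rdf"));      // False: not declared in this scope
        Console.WriteLine(IsPrefixInScope(mgr, "rdf"));  // True: inherited from the parent scope

        // Simulate leaving the child element
        mgr.PopScope();
        Console.WriteLine(mgr.HasNamespace("rdf"));      // True: back in the declaring scope
    }
}
```

If you need every binding rather than a single prefix, GetNamespacesInScope(XmlNamespaceScope.All) on the manager returns the prefix-to-URI pairs that are currently in scope.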

In general, you shouldn't be working with namespace prefixes all that much unless you're actually performing the parsing yourself[1] rather than leveraging the existing tools. It's the combination of the element's local name (RDF) and its namespace (http://www.w3.org/1999/02/22-rdf-syntax-ns#) that uniquely identifies the type of the element. The prefix can be changed (provided it's done consistently throughout the scope in which it is valid) without changing the information content of the XML.
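As a rough illustration of that point, the sketch below reads the root element of two documents that differ only in the prefix they use and compares the local name plus namespace URI. The class name PrefixDemo and the helper RootIdentity are invented for this example, and the "{uri}name" string is just a convenient way to print the pair.

```csharp
using System;
using System.IO;
using System.Xml;

static class PrefixDemo
{
    // Hypothetical helper: identifies the root element by namespace URI and
    // local name, ignoring whichever prefix the document happens to use.
    public static string RootIdentity(string xml)
    {
        using (var rdr = XmlReader.Create(new StringReader(xml)))
        {
            rdr.MoveToContent();
            return "{" + rdr.NamespaceURI + "}" + rdr.LocalName;
        }
    }

    static void Main()
    {
        const string ns = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
        string a = "<rdf:RDF xmlns:rdf=\"" + ns + "\" />";
        string b = "<r:RDF xmlns:r=\"" + ns + "\" />";

        // Different prefixes, same element identity
        Console.WriteLine(RootIdentity(a) == RootIdentity(b));  // True
    }
}
```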


You can see this for yourself if you create this class:

    class LoggingNamespaceManager : XmlNamespaceManager
    {
        public LoggingNamespaceManager(XmlNameTable table) : base(table)
        {
        }

        // Called by the reader when it enters an element
        public override void PushScope()
        {
            Console.WriteLine("Push");
            base.PushScope();
        }

        // Called by the reader when it leaves an element
        public override bool PopScope()
        {
            Console.WriteLine("Pop");
            return base.PopScope();
        }
    }

And instantiate it rather than XmlNamespaceManager in the first line of your sample.


[1] Please don't, though. There are enough brittle "XML" parsers out there already which are built on invalid assumptions about XML. Use the tools provided in the Framework, as you're currently doing.

Damien_The_Unbeliever
  • I had thought that any namespaces defined in the parent would automatically flow to the children but I guess that isn't the case. As for the prefixes, fair enough, but I'm working with a standard that defines the prefixes that must be available. – mford Jan 15 '21 at 15:55