8

Let's say I have a schema that an input document must comply with. I load the file and validate it against the schema like this:

// Load the ABC XSD
var schemata = new XmlSchemaSet();
string abcSchema = FooResources.AbcTemplate;
using (var reader = new StringReader(abcSchema))
using (var schemaReader = XmlReader.Create(reader))
{
    schemata.Add(string.Empty, schemaReader);
}

// Load the ABC file itself
var settings = new XmlReaderSettings
{
    CheckCharacters = true,
    CloseInput = false,
    ConformanceLevel = ConformanceLevel.Document,
    IgnoreComments = true,
    Schemas = schemata,
    ValidationType = ValidationType.Schema,
    ValidationFlags = XmlSchemaValidationFlags.ReportValidationWarnings
};

XDocument inputDoc;
try
{
    using (var docReader = XmlReader.Create(configurationFile, settings))
    {
        inputDoc = XDocument.Load(docReader);
    }
}
catch (XmlSchemaException xsdViolation)
{
    throw new InvalidDataException(".abc file format constraint violated.", xsdViolation);
}

This works fine for detecting trivial errors in the file. However, because the schema only covers a single namespace, a document like the following is invalid but sneaks through:

<badDoc xmlns="http://Foo/Bar/Bax">
  This is not a valid document; but Schema doesn't catch it
  because of that xmlns in the badDoc element.
</badDoc>

I would like to say that only documents in namespaces for which I have schemata should pass schema validation; anything in an unknown namespace should fail.

Billy ONeal

3 Answers

2

As stupid as it seems, the thing you want to look at is actually on the XmlReaderSettings object:

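// The first argument is the sender (the reader), not the offending node;
// the details are on the ValidationEventArgs.
settings.ValidationEventHandler +=
    (sender, e) => Console.WriteLine("{0}: {1}", e.Severity, e.Message);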
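
If a notification is not enough and the load should actually fail, one option is to throw from the handler. The following is only a sketch under assumptions: `StrictXmlLoader` and `Load` are hypothetical names, and it relies on the validator raising a warning for elements it has no schema for (which is what the comments under this answer report). It also combines `ReportValidationWarnings` with `AllowXmlAttributes`, as suggested in those comments, so that `xml:space` and friends are not flagged.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Linq;
using System.Xml.Schema;

// Hypothetical helper, not part of the original question's code.
public static class StrictXmlLoader
{
    // Loads a document and treats EVERY validation event (error or warning)
    // as a failure, so elements without a matching schema are rejected too.
    public static XDocument Load(TextReader input, XmlSchemaSet schemata)
    {
        var settings = new XmlReaderSettings
        {
            Schemas = schemata,
            ValidationType = ValidationType.Schema,
            // AllowXmlAttributes keeps xml:space, xml:lang, etc. from being reported.
            ValidationFlags = XmlSchemaValidationFlags.ReportValidationWarnings
                            | XmlSchemaValidationFlags.AllowXmlAttributes
        };

        // Once a handler is attached, the reader no longer throws on errors
        // by itself, so the handler has to throw for both severities.
        settings.ValidationEventHandler += (sender, e) =>
        {
            throw new InvalidDataException(
                string.Format("{0}: {1}", e.Severity, e.Message), e.Exception);
        };

        using (var reader = XmlReader.Create(input, settings))
        {
            return XDocument.Load(reader);
        }
    }
}
```

With the question's setup this would be called as `StrictXmlLoader.Load(new StringReader(text), schemata)`. Throwing from the handler aborts the parse at the first event, which may or may not be what you want if you would rather collect every problem first.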
codekaizen
JerKimball
  • @codekaizen - hah, fair enough, that's a "better" example, although I did like the implied astonishment of my original :) – JerKimball Feb 26 '13 at 21:54
  • Agreed, but there may be some reason (e.g. not trashing the whole stack and the parse state) for it, though since it is hardly non-astonishing I hope the epithet "stupid" will take all the responsibility for highlighting this twist. – codekaizen Feb 26 '13 at 21:59
  • @codekaizen That is exceedingly diplomatic of you; Respect achieved. :) – JerKimball Feb 26 '13 at 22:18
  • This still seems to let the invalid document pass. But at least there's a notification. Unfortunately this causes valid cases to fail; for instance, the namespace `xml` is defined implicitly in XML, so any valid uses of `xml:space` in the document fail validation this way. – Billy ONeal Mar 01 '13 at 20:23
  • @billy-oneal Oh, there's another flag on the settings where you can tell it to ignore certain namespaces - when I get back to my desk I'll look it up. – JerKimball Mar 01 '13 at 20:26
  • @BillyONeal Hey, try adding this? `ValidationFlags = XmlSchemaValidationFlags.ReportValidationWarnings | XmlSchemaValidationFlags.AllowXmlAttributes` – JerKimball Mar 01 '13 at 20:59
1

The solution I ended up settling on is to basically check that the root node is in the namespace I expect. If it isn't, then I treat that the same way I treat a true schema validation failure:

// Parse the bits we need out of that file
var rootNode = inputDoc.Root;
if (!rootNode.Name.NamespaceName.Equals(string.Empty, StringComparison.Ordinal))
{
    throw new InvalidDataException(".abc file format namespace did not match.");
}
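
The same idea can be extended from the root node to every element, using the namespaces actually present in the schema set instead of a hard-coded one. This is a rough sketch, not what I ended up shipping: `NamespaceGuard` and `AllElementsCovered` are made-up names, and it assumes the schemas do not deliberately admit foreign namespaces through `xs:any` wildcards.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;
using System.Xml.Schema;

// Hypothetical helper for illustration only.
public static class NamespaceGuard
{
    // Returns true when every element in the document lives in a namespace
    // that is the target namespace of some schema in the set.
    public static bool AllElementsCovered(XDocument doc, XmlSchemaSet schemata)
    {
        // A schema without a targetNamespace reports null; treat it as "".
        var known = new HashSet<string>(
            schemata.Schemas()
                    .Cast<XmlSchema>()
                    .Select(s => s.TargetNamespace ?? string.Empty));

        // Descendants() on the document includes the root element itself.
        return doc.Descendants()
                  .All(el => known.Contains(el.Name.NamespaceName));
    }
}
```

A caller would then throw the same `InvalidDataException` when `NamespaceGuard.AllElementsCovered(inputDoc, schemata)` returns false.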
Billy ONeal
-1

Set the ReportValidationWarnings flag. See http://msdn.microsoft.com/en-us/library/system.xml.schema.xmlschemavalidationflags.aspx and http://msdn.microsoft.com/en-us/library/system.xml.xmlreadersettings.validationflags.aspx.

John Saunders