
I am loading MusicXML files into my program. The problem: there are two “dialects”, timewise and partwise, which have different root nodes (and a different structure):

<?xml version="1.0" encoding='UTF-8' standalone='no' ?>
<!DOCTYPE score-partwise PUBLIC "-//Recordare//DTD MusicXML 2.0 Partwise//EN" "http://www.musicxml.org/dtds/partwise.dtd">
<score-partwise version="2.0">
    <work>...</work>
    ...
</score-partwise>

and

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE score-timewise PUBLIC "-//Recordare//DTD MusicXML 2.0 Timewise//EN" "http://www.musicxml.org/dtds/timewise.dtd">
<score-timewise version="2.0">
   <work>...</work>
   ...
</score-timewise>

My code for deserializing the partwise score so far is:

using (var fileStream = new FileStream(openFileDialog.FileName, FileMode.Open))
{
    var xmlSerializer = new XmlSerializer(typeof(ScorePartwise));
    var result = (ScorePartwise)xmlSerializer.Deserialize(fileStream);
}

What would be the best way to differentiate between the two dialects?

Jannik Arndt
  • how big are the xml files? – EkoostikMartin May 14 '14 at 19:21
  • That really depends on the piece, an average motet by Palestrina with four voices has about 12000 lines / 300 KB. A whole symphony will definitely have more than that. – Jannik Arndt May 14 '14 at 19:25
  • Okay, I would load the 3rd line of the file into a string, and then do a `String.IndexOf()` to search for either partwise or timewise, then you know which type of file you are dealing with and can choose the correct serializer. – EkoostikMartin May 14 '14 at 19:27
  • I don't know of many (any) score-timewise files out in nature, so one way most systems do it is just to assume score-partwise. It may seem like a copout answer, but I think it's almost always what people do. – Michael Scott Asato Cuthbert Aug 03 '15 at 21:25

3 Answers


Here's a way to do it by using an XDocument to parse the file, read the root element to determine the type, and read it into your serializer.

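// Parse the file once; the root element name identifies the dialect.
var xdoc = XDocument.Load(filePath);
Type type;
if (xdoc.Root.Name.LocalName == "score-partwise")
    type = typeof(ScorePartwise);
else if (xdoc.Root.Name.LocalName == "score-timewise")
    type = typeof(ScoreTimewise);
else
    throw new NotSupportedException("Unexpected root element: " + xdoc.Root.Name.LocalName);
// Deserialize from the already-parsed document instead of re-reading the file.
var xmlSerializer = new XmlSerializer(type);
var result = xmlSerializer.Deserialize(xdoc.CreateReader());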
Tim S.
  • Loading the whole xml document just to check the first line will be somewhat slow considering the file is at minimum 12000 lines. – EkoostikMartin May 14 '14 at 19:48
  • You're about to read the whole file by deserializing it anyway. Reading it -> check first line -> send in-memory file to deserializer can't be too bad (assuming memory usage isn't too bad; I'm assuming the file is in the tens of MB or less, which should be fine). – Tim S. May 14 '14 at 19:49
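
If loading the whole document only to inspect the root is a concern, one alternative is to let an XmlReader advance just as far as the first element and read its name. This is a minimal sketch, assuming a seekable stream; the `MusicXmlSniffer` class and `ReadRootName` method are hypothetical names, not part of the answer above:

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical helper (not from the answer above).
public static class MusicXmlSniffer
{
    // Returns the local name of the root element, reading only as far
    // as the first element node.
    public static string ReadRootName(Stream stream)
    {
        var settings = new XmlReaderSettings
        {
            // Skip the DOCTYPE rather than parsing or fetching the DTD.
            DtdProcessing = DtdProcessing.Ignore,
            XmlResolver = null
        };
        using (var reader = XmlReader.Create(stream, settings))
        {
            // MoveToContent skips the XML declaration, comments and whitespace.
            reader.MoveToContent();
            return reader.LocalName;
        }
    }
}
```

Because `XmlReaderSettings.CloseInput` defaults to false, the stream stays open afterwards; the caller could rewind it with `stream.Position = 0` and pass it to the serializer that matches the returned name.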

I would create both serializers

var partwiseSerializer = new XmlSerializer(typeof(ScorePartwise));
var timewiseSerializer = new XmlSerializer(typeof(ScoreTimewise));

Assuming there are only these two, I would call the CanDeserialize method on one of them:

using (var fileStream = new FileStream(openFileDialog.FileName, FileMode.Open))
{
  using (var xmlReader = XmlReader.Create(fileStream))
  {
    if (partwiseSerializer.CanDeserialize(xmlReader))
    {
       var result = partwiseSerializer.Deserialize(xmlReader);
    }
    else
    {
       var result = timewiseSerializer.Deserialize(xmlReader);
    }
  }
}

Obviously, this is just a sketch of the idea. With more options, or depending on your application design, I would call CanDeserialize in a more sophisticated way, but that method is the key in my opinion:

http://msdn.microsoft.com/en-us/library/system.xml.serialization.xmlserializer.candeserialize.aspx

The XmlReader class can be found here:

http://msdn.microsoft.com/en-us/library/System.Xml.XmlReader(v=vs.110).aspx
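
One way the "more sophisticated" variant might look is a loop over a list of candidate serializers that uses the first one whose CanDeserialize accepts the reader. This is only a sketch: the two score classes here are hypothetical stand-ins for the real MusicXML types, `ScoreLoader` is an invented name, and it assumes the standard code-generating serializer, where CanDeserialize checks the root element name without consuming it.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical stand-ins for the real MusicXML classes.
[XmlRoot("score-partwise")]
public class ScorePartwise
{
    [XmlAttribute("version")] public string Version { get; set; }
}

[XmlRoot("score-timewise")]
public class ScoreTimewise
{
    [XmlAttribute("version")] public string Version { get; set; }
}

public static class ScoreLoader
{
    // Add further serializers here if more dialects need to be supported.
    private static readonly XmlSerializer[] Serializers =
    {
        new XmlSerializer(typeof(ScorePartwise)),
        new XmlSerializer(typeof(ScoreTimewise)),
    };

    public static object Load(Stream stream)
    {
        var settings = new XmlReaderSettings
        {
            DtdProcessing = DtdProcessing.Ignore, // skip the DOCTYPE
            XmlResolver = null
        };
        using (var reader = XmlReader.Create(stream, settings))
        {
            foreach (var serializer in Serializers)
            {
                // The reader stays positioned on the root element between checks.
                if (serializer.CanDeserialize(reader))
                    return serializer.Deserialize(reader);
            }
        }
        throw new InvalidDataException("Root element matches none of the known score types.");
    }
}
```

The caller can then test the result with `is ScorePartwise` or `is ScoreTimewise` to see which dialect was loaded.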

Santhos

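If you're concerned about resource usage, you can compare the start of the XML string against the expected root tag without parsing it. This snippet checks for an <Error> root element; substitute the root tag you need: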

    internal const string NodeStart = "<Error ";
    public static bool IsErrorDocument(string xml)
    {
        int headerLen = 1;
        if (xml.StartsWith(Constants.XMLHEADER_UTF8))
        {
            headerLen += Constants.XMLHEADER_UTF8.Length;
        }
        else if (xml.StartsWith(Constants.XMLHEADER_UTF16))
        {
            headerLen += Constants.XMLHEADER_UTF16.Length;
        }
        else
        {
            return false;
        }
        if (xml.Length < headerLen + NodeStart.Length)
        {
            return false;
        }
        return xml.Substring(headerLen, NodeStart.Length) == NodeStart;
    }

internal class Constants
{
    public const string XMLHEADER_UTF16 = "<?xml version=\"1.0\" encoding=\"utf-16\"?>";
    public const string XMLHEADER_UTF8 = "<?xml version=\"1.0\" encoding=\"utf-8\"?>";
}
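
Applied to the MusicXML case, the same string-comparison idea might look like the sketch below. It reads only the first characters of the file and searches for the opening root tag. The `DialectSniffer` name and the 1024-character limit are assumptions for illustration. The check would miss a root element that starts after that limit and could be fooled by a comment containing the tag text.

```csharp
using System;
using System.IO;

// Hypothetical helper (not from the answer above).
public static class DialectSniffer
{
    // Returns "score-partwise", "score-timewise", or null if neither
    // root tag appears within the first maxChars characters.
    public static string DetectDialect(TextReader reader, int maxChars = 1024)
    {
        var buffer = new char[maxChars];
        int read = reader.ReadBlock(buffer, 0, maxChars);
        var head = new string(buffer, 0, read);

        // Searching for "<score-..." (with the angle bracket) avoids matching
        // the dialect name inside the DOCTYPE declaration.
        if (head.IndexOf("<score-partwise", StringComparison.Ordinal) >= 0)
            return "score-partwise";
        if (head.IndexOf("<score-timewise", StringComparison.Ordinal) >= 0)
            return "score-timewise";
        return null;
    }
}
```

With a file, the reader could be a `StreamReader` opened on the path; the file would then be reopened (or the stream rewound) for the actual deserialization.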
fartwhif