This question is very similar to this one but with a small twist.

I am trying to split an object representing an XML document into multiple XML objects, based on the number of Tag elements allowed per object. I'm looking for the best approach to this; any help would be great. Here is a sample of what I am trying to do.

xml source representation:

<?xml version="1.0" encoding="utf-8"?>
<DocType xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:pmlcore="urn:autoid:specification:interchange:xml:schema:1">
    <id>tbd</id>
    <Observation>
        <Command>c1</Command>
        <Tag>
            <id>....</id>
            <Data>...</Data>
        </Tag>
        <Tag>
            <id>....</id>
            <Data>...</Data>
        </Tag>
        <Tag>
            <id>....</id>
            <Data>...</Data>
        </Tag>
        <Tag>
            <id>....</id>
            <Data>...</Data>
        </Tag>
        <Data>...</Data>
    </Observation>
    <Observation>
        <Command>c2</Command>
        <Tag>
            <id>....</id>
            <Data>...</Data>
        </Tag>
        <Tag>
            <id>....</id>
            <Data>...</Data>
        </Tag>
        <Tag>
            <id>....</id>
            <Data>...</Data>
        </Tag>
        <Tag>
            <id>....</id>
            <Data>...</Data>
        </Tag>
        <Tag>
            <id>....</id>
            <Data>...</Data>
        </Tag>
        <Tag>
            <id>....</id>
            <Data>...</Data>
        </Tag>
        <Tag>
            <id>....</id>
            <Data>...</Data>
        </Tag>
        <Tag>
            <id>....</id>
            <Data>...</Data>
        </Tag>
        <Data>...</Data>
    </Observation>
</DocType>

Desired output, given that the number of allowed 'Tag' elements per document is 3:

xml 1:

<?xml version="1.0" encoding="utf-8"?>
<DocType xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:pmlcore="urn:autoid:specification:interchange:xml:schema:1">
    <id>tbd</id>
    <Observation>
        <Command>c1</Command>
        <Tag>
            <id>....</id>
            <Data>...</Data>
        </Tag>
        <Tag>
            <id>....</id>
            <Data>...</Data>
        </Tag>
        <Tag>
            <id>....</id>
            <Data>...</Data>
        </Tag>
        <Data>...</Data>
    </Observation>
</DocType>

xml 2:

<?xml version="1.0" encoding="utf-8"?>
<DocType xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:pmlcore="urn:autoid:specification:interchange:xml:schema:1">
    <id>tbd</id>
    <Observation>
        <Command>c1</Command>
        <Tag>
            <id>....</id>
            <Data>...</Data>
        </Tag>
        <Data>...</Data>
    </Observation>
    <Observation>
        <Command>c2</Command>
        <Tag>
            <id>....</id>
            <Data>...</Data>
        </Tag>
        <Tag>
            <id>....</id>
            <Data>...</Data>
        </Tag>
        <Data>...</Data>
    </Observation>
</DocType>

I believe by now you get the idea of the requirement, but I'll continue:

xml 3:

<?xml version="1.0" encoding="utf-8"?>
<DocType xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:pmlcore="urn:autoid:specification:interchange:xml:schema:1">
    <id>tbd</id>
    <Observation>
        <Command>c2</Command>
        <Tag>
            <id>....</id>
            <Data>...</Data>
        </Tag>
        <Tag>
            <id>....</id>
            <Data>...</Data>
        </Tag>
        <Tag>
            <id>....</id>
            <Data>...</Data>
        </Tag>
        <Data>...</Data>
    </Observation>
</DocType>

xml 4:

<?xml version="1.0" encoding="utf-8"?>
<DocType xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:pmlcore="urn:autoid:specification:interchange:xml:schema:1">
    <id>tbd</id>
    <Observation>
        <Command>c2</Command>
        <Tag>
            <id>....</id>
            <Data>...</Data>
        </Tag>
        <Tag>
            <id>....</id>
            <Data>...</Data>
        </Tag>
        <Tag>
            <id>....</id>
            <Data>...</Data>
        </Tag>
        <Data>...</Data>
    </Observation>
</DocType>
krul
  • I really don't understand what you are asking. You mention a desired output of at most three elements, yet the source has only two and you are outputting one. – Yeray Cabello Nov 01 '16 at 15:00
  • @JCabello I've edited the question to clear up the misunderstanding; I hope it's clear now. – krul Nov 01 '16 at 15:03

3 Answers

Load the initial document, then remove the Observation elements from it. Loop over the removed Observation elements and, for each one, create a copy of the stripped document and add that Observation to it. When the loop finishes, docList holds all the new documents, one per Observation.

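// Assumes the source is already loaded into doc, e.g. via XDocument.Load
// (requires System.Linq, System.Collections.Generic and System.Xml.Linq).

// Snapshot the Observation elements before detaching them from the document.
var result = doc.Root.Elements("Observation").ToList();

// Remove them so that doc keeps only the shared skeleton (root attributes, <id>).
doc.Root.Elements("Observation").Remove();

List<XDocument> docList = new List<XDocument>();
foreach (var el in result)
{
    // Copy the skeleton and attach a single Observation to the copy.
    XDocument d = new XDocument(doc);
    d.Root.Add(el);
    docList.Add(d);
}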
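
To enforce the per-document Tag limit from the question, the same LINQ to XML building blocks can be combined differently. What follows is only a minimal sketch, not part of the original answer: the names TagSplitter, Split and tagsPerDoc are hypothetical, and it assumes that the elements are in no namespace and that Observation contains only element children, as in the sample.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class TagSplitter
{
    // Hypothetical helper: returns documents holding at most tagsPerDoc <Tag> elements each.
    // The source document is left unmodified.
    public static List<XDocument> Split(XDocument source, int tagsPerDoc)
    {
        if (tagsPerDoc < 1) throw new ArgumentOutOfRangeException(nameof(tagsPerDoc));

        // Number every Tag in document order and bucket them: 0,0,0,1,1,1,...
        var chunks = source.Root
            .Elements("Observation")
            .Elements("Tag")
            .Select((tag, index) => new { tag, index })
            .GroupBy(x => x.index / tagsPerDoc, x => x.tag);

        var result = new List<XDocument>();
        foreach (var chunk in chunks)
        {
            // Reference-based lookup of the Tag elements that belong to this chunk.
            var keep = new HashSet<XElement>(chunk);

            // New root with the same name and attributes (including xmlns declarations).
            var root = new XElement(source.Root.Name, source.Root.Attributes());

            foreach (var child in source.Root.Elements())
            {
                if (child.Name != "Observation")
                {
                    // Shared content such as <id>; Add clones nodes that already have a parent.
                    root.Add(child);
                }
                else if (child.Elements("Tag").Any(keep.Contains))
                {
                    // Keep the non-Tag children plus only the Tags of this chunk.
                    root.Add(new XElement(child.Name, child.Attributes(),
                        child.Elements().Where(e => e.Name != "Tag" || keep.Contains(e))));
                }
            }

            var declaration = source.Declaration == null
                ? null
                : new XDeclaration(source.Declaration);
            result.Add(new XDocument(declaration, root));
        }
        return result;
    }
}
```

Calling TagSplitter.Split(doc, 3) on the sample input from the question should give four documents, the second of which holds one Tag from the c1 Observation and two from c2. Observations without any Tag in the current chunk are skipped, which matches the desired output shown in the question.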
mybirthname

XSLT 2.0 (as supported by Saxon, https://www.nuget.org/packages/Saxon-HE/) allows you to transform one XML document into multiple result documents. Here is one approach that splits your input into several files:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs"
    version="2.0">

    <xsl:param name="tags-per-doc" as="xs:integer" select="3"/>

    <xsl:strip-space elements="*"/>
    <xsl:output indent="yes"/>

    <xsl:template match="/">
        <xsl:for-each-group select="//Tag" group-adjacent="(position() - 1) idiv $tags-per-doc">
            <xsl:result-document href="result{position()}.xml">
                <xsl:apply-templates select="/*"/>
            </xsl:result-document>
        </xsl:for-each-group>
    </xsl:template>

    <xsl:template match="@* | node()">
        <xsl:copy>
            <xsl:apply-templates select="@* | node()"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="Observation">
        <xsl:if test="current-group() intersect *">
            <xsl:copy>
                <xsl:apply-templates select="@*, node()[. intersect current-group() or not(self::Tag)]"/>
            </xsl:copy>
        </xsl:if>
    </xsl:template>

</xsl:stylesheet>
Martin Honnen

I think that your best option is to set up a model for the data you have.

public class Observation
{
    public string Command { get; set; }

    public List<Tag> Tags { get; set; }
}

[...] // Also define the Tag class

Then you can easily read the XML with LINQ to XML, process the models according to whatever criteria you want, and save the result back using LINQ to XML.

I really feel that learning how to use LINQ to XML is outside the scope of this question, so I'm referring you to another question that deals with it: Parse xml using LINQ to XML to class objects

And please try not to manipulate the data directly as raw rows and then save it again; any change you want to make afterwards will be a nightmare.
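
As a rough illustration of the read step described above, here is a minimal sketch. The Tag class, its Id and Data properties, and the ObservationReader helper are assumptions made for this example (the answer leaves the Tag class undefined), and the element names are taken from the sample XML in the question.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical shape for Tag, inferred from the <id> and <Data> children in the sample.
public class Tag
{
    public string Id { get; set; }
    public string Data { get; set; }
}

public class Observation
{
    public string Command { get; set; }

    public List<Tag> Tags { get; set; }
}

public static class ObservationReader
{
    // Projects each <Observation> element onto the model classes above.
    public static List<Observation> Read(XDocument doc)
    {
        return doc.Root
            .Elements("Observation")
            .Select(o => new Observation
            {
                // The explicit string conversion yields null when the element is missing.
                Command = (string)o.Element("Command"),
                Tags = o.Elements("Tag")
                    .Select(t => new Tag
                    {
                        Id = (string)t.Element("id"),
                        Data = (string)t.Element("Data")
                    })
                    .ToList()
            })
            .ToList();
    }
}
```

Once the data is in model objects, regrouping the tags into chunks of a given size becomes ordinary list processing, and the result can be written back out with XElement constructors.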

Yeray Cabello