Assume you have the XML below. The goal is to group the Person elements by FirstName and export them into separate XML files. Each output file should contain at most X distinct FirstName values.

Below is an example of the desired transformation with X = 3.

XML input:

<People>
    <Person>             
        <FirstName>John</FirstName>             
        <LastName>Doe</LastName> 
    </Person> 
    <Person>             
        <FirstName>Jack</FirstName>             
        <LastName>White</LastName> 
    </Person>
    <Person>             
        <FirstName>Mark</FirstName>             
        <LastName>Wall</LastName> 
    </Person>
    <Person>             
        <FirstName>John</FirstName>             
        <LastName>Ding</LastName> 
    </Person> 
    <Person>             
        <FirstName>Cyrus</FirstName>             
        <LastName>Ding</LastName> 
    </Person>  
    <Person>             
        <FirstName>Megan</FirstName>             
        <LastName>Boing</LastName> 
    </Person>
</People>          

XML output 1, with 3 distinct FirstName values:

<People>
    <Person>             
        <FirstName>John</FirstName>             
        <LastName>Doe</LastName> 
    </Person> 
    <Person>             
        <FirstName>John</FirstName>             
        <LastName>Ding</LastName> 
    </Person>
    <Person>             
        <FirstName>Jack</FirstName>             
        <LastName>White</LastName> 
    </Person>
    <Person>             
        <FirstName>Mark</FirstName>             
        <LastName>Wall</LastName> 
    </Person>  
</People> 

XML output 2, with the 2 remaining FirstName values:

<People>
    <Person>             
        <FirstName>Cyrus</FirstName>             
        <LastName>Ding</LastName> 
    </Person>  
    <Person>             
        <FirstName>Megan</FirstName>             
        <LastName>Boing</LastName> 
    </Person>
</People> 

It seems to me that Muenchian grouping can be combined with an instruction that writes multiple output files. However, the core question is: where can we set a threshold (here, the number of distinct FirstName values) that triggers a switch to a new output file?

Daniel

1 Answer

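Here is an example of doing it in two steps with XSLT 2.0: first group the Person elements by FirstName, then batch those groups n at a time and write each batch to its own result document: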

<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  exclude-result-prefixes="xs"
  version="2.0">

  <xsl:param name="n" as="xs:integer" select="3"/>

  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="People">
    <xsl:variable name="groups" as="element(group)*">
      <xsl:for-each-group select="Person" group-by="FirstName">
        <group>
          <xsl:copy-of select="current-group()"/>
        </group>
      </xsl:for-each-group>
    </xsl:variable>
    <xsl:for-each-group select="$groups" group-by="(position() - 1) idiv $n">
      <xsl:result-document href="group{position()}.xml">
        <People>
          <!-- copy only the Person children so the temporary group wrappers are not written out -->
          <xsl:copy-of select="current-group()/Person"/>
        </People>
      </xsl:result-document>
    </xsl:for-each-group>
  </xsl:template>

</xsl:stylesheet>
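
To sanity-check the expected file contents without an XSLT processor, the same two-step logic can be sketched in Python. This is only an illustration of the idea, not part of the stylesheet: the function name `split_by_first_name` and the `SAMPLE` string are made up for this example, and the sketch returns the batches in memory instead of writing files.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data mirroring the question's input.
SAMPLE = """<People>
  <Person><FirstName>John</FirstName><LastName>Doe</LastName></Person>
  <Person><FirstName>Jack</FirstName><LastName>White</LastName></Person>
  <Person><FirstName>Mark</FirstName><LastName>Wall</LastName></Person>
  <Person><FirstName>John</FirstName><LastName>Ding</LastName></Person>
  <Person><FirstName>Cyrus</FirstName><LastName>Ding</LastName></Person>
  <Person><FirstName>Megan</FirstName><LastName>Boing</LastName></Person>
</People>"""


def split_by_first_name(xml_text, n):
    """Return a list of <People> elements, each with at most n distinct FirstName values."""
    root = ET.fromstring(xml_text)

    # Step 1: group Person elements by FirstName, keeping first-occurrence order
    # (dicts preserve insertion order in Python 3.7+).
    groups = {}
    for person in root.findall("Person"):
        groups.setdefault(person.findtext("FirstName"), []).append(person)

    # Step 2: take the groups n at a time; this plays the role of
    # group-by="(position() - 1) idiv $n" in the stylesheet.
    group_list = list(groups.values())
    batches = []
    for start in range(0, len(group_list), n):
        people = ET.Element("People")
        for group in group_list[start:start + n]:
            people.extend(group)
        batches.append(people)
    return batches


if __name__ == "__main__":
    for index, people in enumerate(split_by_first_name(SAMPLE, 3), start=1):
        names = [p.findtext("FirstName") for p in people]
        print(f"group{index}.xml:", names)
```

With n = 3 the first batch holds both John entries plus Jack and Mark, and the second holds Cyrus and Megan, which matches the two desired outputs in the question.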

I might try to convert to XSLT 1.0 and EXSLT later.

[edit] Here is an attempt to translate into XSLT 1.0 and EXSLT:

<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:exsl="http://exslt.org/common"
  extension-element-prefixes="exsl"
  exclude-result-prefixes="exsl"
  version="1.0">

  <xsl:param name="n" select="3"/>

  <xsl:output method="xml" indent="yes"/>

  <xsl:key name="person-by-firstname" 
           match="Person"
           use="FirstName"/>

  <xsl:template match="People">
    <xsl:variable name="groups">
      <xsl:for-each select="Person[generate-id() = generate-id(key('person-by-firstname', FirstName)[1])]">
        <group>
          <xsl:copy-of select="key('person-by-firstname', FirstName)"/>
        </group>
      </xsl:for-each>
    </xsl:variable>
    <xsl:for-each select="exsl:node-set($groups)/group[(position() - 1) mod $n = 0]">
      <exsl:document href="groupTest{position()}.xml">
        <People>
          <xsl:copy-of select="Person | following-sibling::group[position() &lt; $n]/Person"/>
        </People>
      </exsl:document>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
Martin Honnen