0

In relational database parlance, how I would express what I need to do is to normalize a (XML) relation from 2nd normal form to 3rd normal form. Allow me to explain by example:

Example XML input

<?xml version="1.0" encoding="windows-1252"?>
<Root>
    <RECORD>
        <SESSIONID>303934</SESSIONID>
        <CASEID>1051-2021</CASEID>
        <CLIENTID>1051</CLIENTID>
        <SCORETYPECODE>CIRCUMSTANCES</SCORETYPECODE>
        <SCORECODE>TRAINING12</SCORECODE>
        <ASSESSMENTPHASECODE>PRE</ASSESSMENTPHASECODE>
    </RECORD>
    <RECORD>
        <SESSIONID>303934</SESSIONID>
        <CASEID>1051-2021</CASEID>
        <CLIENTID>1051</CLIENTID>
        <SCORETYPECODE>CIRCUMSTANCES</SCORETYPECODE>
        <SCORECODE>MATERIAL3</SCORECODE>
        <ASSESSMENTPHASECODE>PRE</ASSESSMENTPHASECODE>
    </RECORD>
    <RECORD>
        <SESSIONID>303805</SESSIONID>
        <CASEID>2041-2021</CASEID>
        <CLIENTID>2041</CLIENTID>
        <SCORETYPECODE>GOALS</SCORETYPECODE>
        <SCORECODE>CONFIDENCE1</SCORECODE>
        <ASSESSMENTPHASECODE>POST</ASSESSMENTPHASECODE>
    </RECORD>
    <RECORD>
        <SESSIONID>303805</SESSIONID>
        <CASEID>2041-2021</CASEID>
        <CLIENTID>2041</CLIENTID>
        <SCORETYPECODE>SATISFACTION</SCORETYPECODE>
        <SCORECODE>SITUATIONIMPROVED2</SCORECODE>
        <ASSESSMENTPHASECODE>POST</ASSESSMENTPHASECODE>
    </RECORD>
</Root>

Here are the most relevant bits of the XSD defining the transformed XML:

XSD xnippet

...
      <xs:element name="ClientAssessments" type="ClientAssessments"  maxOccurs="1">
        <xs:unique name="UniqueClientAssessment">
          <xs:selector xpath=".//ClientAssessment"/>
          <xs:field xpath=".//ClientId"/>
          <xs:field xpath=".//CaseId"/>
          <xs:field xpath=".//SessionId"/>
        </xs:unique>
      </xs:element>
...
  <xs:complexType name="ClientAssessments">
    <xs:sequence>
      <xs:element name="ClientAssessment" type="ClientAssessment" maxOccurs="unbounded" minOccurs="1"/>
    </xs:sequence>
  </xs:complexType>
...
<xs:complexType name="ClientAssessment">
    <xs:sequence>
      <xs:element name="ClientId" type="NonEmptyOrLeadingTrailingSpacesString50Char" />
      <xs:element name="CaseId" type="NonEmptyOrLeadingTrailingSpacesString50Char" />
      <xs:element name="SessionId" type="NonEmptyOrLeadingTrailingSpacesString50Char" />
      <xs:element name="Assessments">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="Assessment" type="Assessment" maxOccurs="6" minOccurs="1"/>
          </xs:sequence>
        </xs:complexType>
      </xs:element>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="Assessment">
    <xs:sequence>
      <xs:element name="ScoreTypeCode" type="NonEmptyString" />
      <xs:element name="AssessmentPhaseCode" type="NonEmptyString" />
      <xs:element name="AssessedByCode" type="NonEmptyOrLeadingTrailingSpacesString" minOccurs="0" />
      <xs:element name="Scores">
        <xs:complexType>
          <xs:sequence>
            <xs:element  name="ScoreCode" type="NonEmptyString"  maxOccurs="unbounded" minOccurs="1"/>
          </xs:sequence>
        </xs:complexType>
      </xs:element>
    </xs:sequence>
  </xs:complexType>
...

Here is the XSLT which I've created to transform to the destination's XML so far:

XSLT so far

<?xml version="1.0" encoding="UTF-8"?>
<xsl:transform version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="xml" encoding="utf-8" indent="yes"/>

<xsl:template match="Root">
    <ClientAssessments>
        <xsl:apply-templates select="RECORD"/>
    </ClientAssessments>
</xsl:template>

<xsl:template match="RECORD">
    <ClientAssessment>
        <SessionID><xsl:value-of select="SESSIONID"/></SessionID>
        <CaseID><xsl:value-of select="CASEID"/></CaseID>
        <ClientID><xsl:value-of select="CLIENTID"/></ClientID>
        <Assessments>
            <Assessment>
                <ScoreTypeCode><xsl:value-of select="SCORETYPECODE"/></ScoreTypeCode>
                <AssessmentPhaseCode><xsl:value-of select="ASSESSMENTPHASECODE"/></AssessmentPhaseCode>
                <Scores>
                    <ScoreCode><xsl:value-of select="SCORECODE"/></ScoreCode>
                </Scores>
            </Assessment>
        </Assessments>
    </ClientAssessment>
</xsl:template>

</xsl:transform>

Which produces the following result:

Example XML ouput

<?xml version="1.0" encoding="utf-8"?>
<ClientAssessments>
  <ClientAssessment>
    <SessionID>303934</SessionID>
    <CaseID>1051-2021</CaseID>
    <ClientID>1051</ClientID>
    <Assessments>
      <Assessment>
        <ScoreTypeCode>CIRCUMSTANCES</ScoreTypeCode>
        <AssessmentPhaseCode>PRE</AssessmentPhaseCode>
        <Scores>
          <ScoreCode>TRAINING12</ScoreCode>
        </Scores>
      </Assessment>
    </Assessments>
  </ClientAssessment>
  <ClientAssessment>
    <SessionID>303934</SessionID>
    <CaseID>1051-2021</CaseID>
    <ClientID>1051</ClientID>
    <Assessments>
      <Assessment>
        <ScoreTypeCode>CIRCUMSTANCES</ScoreTypeCode>
        <AssessmentPhaseCode>PRE</AssessmentPhaseCode>
        <Scores>
          <ScoreCode>MATERIAL3</ScoreCode>
        </Scores>
      </Assessment>
    </Assessments>
  </ClientAssessment>
  <ClientAssessment>
    <SessionID>303805</SessionID>
    <CaseID>2041-2021</CaseID>
    <ClientID>2041</ClientID>
    <Assessments>
      <Assessment>
        <ScoreTypeCode>GOALS</ScoreTypeCode>
        <AssessmentPhaseCode>POST</AssessmentPhaseCode>
        <Scores>
          <ScoreCode>CONFIDENCE1</ScoreCode>
        </Scores>
      </Assessment>
    </Assessments>
  </ClientAssessment>
  <ClientAssessment>
    <SessionID>303805</SessionID>
    <CaseID>2041-2021</CaseID>
    <ClientID>2041</ClientID>
    <Assessments>
      <Assessment>
        <ScoreTypeCode>SATISFACTION</ScoreTypeCode>
        <AssessmentPhaseCode>POST</AssessmentPhaseCode>
        <Scores>
          <ScoreCode>SITUATIONIMPROVED2</ScoreCode>
        </Scores>
      </Assessment>
    </Assessments>
  </ClientAssessment>
</ClientAssessments>

So far, so good. However, as per the XSD the \<Assessments\> child node in each \<ClientAssessment\> node needs to contain all the \<Assessment\> nodes for a given \<ClientAssessment\> (for which the \<SessionID\> is a key). So this is what it needs to look like:

Desired XML ouput

<?xml version="1.0"?>
<ClientAssessments>
  <ClientAssessment>
    <SessionID>303934</SessionID>
    <CaseID>1051-2021</CaseID>
    <ClientID>1051</ClientID>
    <Assessments>
      <Assessment>
        <ScoreTypeCode>CIRCUMSTANCES</ScoreTypeCode>
        <AssessmentPhaseCode>PRE</AssessmentPhaseCode>
        <Scores>
          <ScoreCode>TRAINING12</ScoreCode>
        </Scores>
      </Assessment>
      <Assessment>
        <ScoreTypeCode>CIRCUMSTANCES</ScoreTypeCode>
        <AssessmentPhaseCode>PRE</AssessmentPhaseCode>
        <Scores>
          <ScoreCode>MATERIAL3</ScoreCode>
        </Scores>
      </Assessment>
    </Assessments>
  </ClientAssessment>
  <ClientAssessment>
    <SessionID>303805</SessionID>
    <CaseID>2041-2021</CaseID>
    <ClientID>2041</ClientID>
    <Assessments>
      <Assessment>
        <ScoreTypeCode>GOALS</ScoreTypeCode>
        <AssessmentPhaseCode>POST</AssessmentPhaseCode>
        <Scores>
          <ScoreCode>CONFIDENCE1</ScoreCode>
        </Scores>
      </Assessment>
      <Assessment>
        <ScoreTypeCode>SATISFACTION</ScoreTypeCode>
        <AssessmentPhaseCode>POST</AssessmentPhaseCode>
        <Scores>
          <ScoreCode>SITUATIONIMPROVED2</ScoreCode>
        </Scores>
      </Assessment>
    </Assessments>
  </ClientAssessment>
</ClientAssessments>

I hope I've explained that well enough. How do I accomplish this "normalization" in the XSLT 1.0?

Follow-up question: what's a good XSLT reference for such transformations that someone could suggest? I found the w3schools one didn't really cover enough for me to figure it out myself.

Captain Jack Sparrow
  • 971
  • 1
  • 11
  • 28
  • This is a [grouping](https://stackoverflow.com/tags/xslt-grouping/info) question. In XSLT 1.0 this is is best accomplished using the Muenchian method: http://www.jenitennison.com/xslt/grouping/muenchian.html. You can find many examples of Muenchian grouping on SO. – michael.hor257k Jun 11 '21 at 03:29
  • Thank you for steering me in the right direction. I found [link](https://stackoverflow.com/questions/10863491/xslt-grouping-on-multiple-keys-using-muenchian-method) assisted me the most. – OneOfTheDamons Jun 12 '21 at 12:17

1 Answers1

0

Normalization refers to the change you're making to the schema; the corresponding change you need to make to the instances is referred to in XSLT-land as "grouping". The subject is well covered in XSLT textbooks and tutorials, and in SO questions and answers. The key thing you need to know is the solution is very easy in XSLT 2.0+ (using the xsl:for-each-group instruction), but pretty tortuous in XSLT 1.0 (using Muenchian grouping). So use a processor that supports 2.0+ if you possibly can.

Michael Kay
  • 156,231
  • 11
  • 92
  • 164
  • Unfortunately the processor I am using says it only supports XSLT 1.1. I figured out how I could apply the Muenchian method with some steering in the right direction from @michael.hor257k above. – OneOfTheDamons Jun 12 '21 at 12:21