
I am trying to find an XSLT solution to the following problem.

I want to find sets of 3 consecutive rows that share the same node name and the same attribute value but have different text values. The first row of each set contains an identifier; the second and third rows contain values from a source system. I want to find the sets where the second and third rows have different values.

E.g.

<eba7:mi235 contextRef="I-2014-E-dim-x43-x9-x156-x51-x14">78923</eba7:mi235>
<eba7:mi235 contextRef="I-2014-E-dim-x43-x9-x156-x51-x14">1111</eba7:mi235>
<eba7:mi235 contextRef="I-2014-E-dim-x43-x9-x156-x51-x14">2222</eba7:mi235>

The input may also contain other kinds of sets: a row with only an identifier, an identifier row followed by a single row with a value from the source system, or a set where the second and third rows have the same value.

E.g.

<eba7:mi310 contextRef="I-2014-E-dim-x42-x9-x24-x195-x10-x4">78748</eba7:mi310>
<eba7:mi310 contextRef="I-2014-E-dim-x42-x9-x24-x195-x10-x4">0</eba7:mi310>
<eba7:mi310 contextRef="I-2014-E-dim-x42-x9-x25-x195-x10-x4">78804</eba7:mi310>
<eba7:mi310 contextRef="I-2014-E-dim-x42-x9-x25-x195-x10-x4">12345</eba7:mi310>
<eba7:mi310 contextRef="I-2014-E-dim-x42-x9-x25-x195-x10-x4">12345</eba7:mi310>

I don't want these sets to appear in the output.

The output I want to create is

<eba7:mi235 id="78923" value1="1111" value2="2222" />

The structure of the input is such that the rows are always ordered like this. So I tried to access them using position, but that didn't work.

Could anybody point me in the right direction? Is using position the right way?

I have attached a file with the input data below.

Thanks.

Paul.

<?xml version="1.0" encoding="utf-8"?>
<xbrl xml:lang="en" xmlns="http://www.xbrl.org/2003/instance" xmlns:eba7="http://www.eba.europa.eu/xbrl/crr/dict/met" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:link="http://www.xbrl.org/2003/linkbase">
<link:schemaRef xlink:type="simple" xlink:href="http://www.eba.europa.eu/eu/fr/xbrl/crr/fws/corep/its-2013-02/2014-07-31/mod/corep_con.xsd" />
<context id="I-2014-E">
<entity>
  <identifier scheme="http://www.dnb.nl/id">578</identifier>
</entity>
<period>
  <instant>2014-12-31</instant>
</period>
</context>  
<eba7:mi310 contextRef="I-2014-E-dim-x42-x9-x24-x195-x10-x4">78748</eba7:mi310>
<eba7:mi310 contextRef="I-2014-E-dim-x42-x9-x24-x195-x10-x4">0</eba7:mi310>
<eba7:mi310 contextRef="I-2014-E-dim-x42-x9-x25-x195-x10-x4">78804</eba7:mi310>
<eba7:mi310 contextRef="I-2014-E-dim-x42-x9-x25-x195-x10-x4">12345</eba7:mi310>
<eba7:mi310 contextRef="I-2014-E-dim-x42-x9-x25-x195-x10-x4">12345</eba7:mi310>
<eba7:mi235 contextRef="I-2014-E-dim-x43-x9-x156-x51-x14">78923</eba7:mi235>
<eba7:mi235 contextRef="I-2014-E-dim-x43-x9-x156-x51-x14">1111</eba7:mi235>
<eba7:mi235 contextRef="I-2014-E-dim-x43-x9-x156-x51-x14">2222</eba7:mi235>
</xbrl>
Paul
  • This is a grouping question. Please indicate which version of XSLT are you using - answers are dramatically different for each. -- What if there is a sequence of 4? – michael.hor257k Feb 15 '15 at 12:26
  • XSLT 1.0 is being used. A sequence of 4 is possible: 3 values for a specific identifier. I had left that out to keep the example as simple as possible. I expect I could expand the solution from Lingamurthy CS for that case. – Paul Feb 15 '15 at 16:05
  • As I said in my answer, the question is not too well defined. What's even worse is that your input example is not well-formed, and cannot be used for testing. – michael.hor257k Feb 15 '15 at 17:59
  • I have updated the example file. It wasn't well-formed: the second wasn't copied correctly. – Paul Feb 15 '15 at 19:45
  • What's the reason for this, out of interest? The EBA filing rules prohibit duplicate facts, so an instance such as this would not be accepted. – Charles Mager Feb 16 '15 at 07:52
  • You're right, the EBA doesn't allow duplicate facts. Some systems are based on filling in the individual templates, resulting in the same data point being reported multiple times. I want to find those facts, especially if they contain different values. – Paul Feb 16 '15 at 22:05

2 Answers


I don't think the question is defined well enough; it can be interpreted in several ways.

If we assume that you want to:

  1. Group all the given elements based on both the tag name and the @contextRef value being the same; with the mutual position of the elements being irrelevant for this purpose;

  2. Count the distinct values in each group; if there are three or more, write an element with the common tag name to the output, and add a numbered attribute for each distinct value in this group;

then it would probably be best to do something like:

XSLT 1.0

<xsl:stylesheet version="1.0"
xmlns:xbrli="http://www.xbrl.org/2003/instance" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="utf-8" indent="yes"/>

<xsl:key name="k1" match="*" use="concat(name(), '|', @contextRef)"/>
<xsl:key name="k2" match="*" use="concat(name(), '|', @contextRef, '|', .)"/>

<xsl:template match="/xbrli:xbrl">
    <xsl:copy>
        <xsl:for-each select="*[count(.|key('k1', concat(name(), '|', @contextRef))[1])=1]">
            <xsl:variable name="distinct-values" select="key('k1', concat(name(), '|', @contextRef)) [count(.|key('k2', concat(name(), '|', @contextRef, '|', .))[1])=1]"/>
            <xsl:if test="count($distinct-values) &gt;= 3">
                <xsl:copy>
                    <xsl:for-each select="$distinct-values">
                        <xsl:attribute name="value{position()}">
                            <xsl:value-of select="."/>
                        </xsl:attribute>
                    </xsl:for-each>
                </xsl:copy>
            </xsl:if>
        </xsl:for-each>
    </xsl:copy>
</xsl:template>

</xsl:stylesheet>

Applied to the following well-formed test input:

<xbrl xmlns="http://www.xbrl.org/2003/instance" xmlns:eba7="http://www.eba.europa.eu/xbrl/crr/dict/met">
    <eba7:a contextRef="x">11</eba7:a>
    <eba7:a contextRef="x">12</eba7:a>

    <eba7:a contextRef="y">21</eba7:a>
    <eba7:a contextRef="y">22</eba7:a>
    <eba7:a contextRef="y">23</eba7:a>

    <eba7:b contextRef="x">31</eba7:b>
    <eba7:b contextRef="x">32</eba7:b>
    <eba7:b contextRef="x">33</eba7:b>
    <eba7:b contextRef="x">33</eba7:b>

    <eba7:c contextRef="x">41</eba7:c>
    <eba7:c contextRef="x">41</eba7:c>
    <eba7:c contextRef="x">42</eba7:c>
    <eba7:c contextRef="x">42</eba7:c>
</xbrl>

the result will be:

<?xml version="1.0" encoding="utf-8"?>
<xbrl xmlns="http://www.xbrl.org/2003/instance" xmlns:eba7="http://www.eba.europa.eu/xbrl/crr/dict/met">
   <eba7:a value1="21" value2="22" value3="23"/>
   <eba7:b value1="31" value2="32" value3="33"/>
</xbrl>

Note:

  1. You must be familiar with the Muenchian grouping method in order to understand this;

  2. Numbered attributes are not good XML practice. I would suggest you (or the powers that be) reconsider this requirement.
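On the second note: one way to avoid numbered attributes is to emit one child element per distinct value. The following is only a minimal sketch, using the same two keys and the same grouping assumptions as the stylesheet above. The element name `value` is hypothetical (it belongs to neither the XBRL nor the EBA vocabulary), and `@contextRef` is copied to the output merely to make each group identifiable.

```xml
<xsl:stylesheet version="1.0"
xmlns:xbrli="http://www.xbrl.org/2003/instance"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="utf-8" indent="yes"/>

<!-- k1: elements sharing a tag name and @contextRef -->
<xsl:key name="k1" match="*" use="concat(name(), '|', @contextRef)"/>
<!-- k2: elements sharing a tag name, @contextRef and text value -->
<xsl:key name="k2" match="*" use="concat(name(), '|', @contextRef, '|', .)"/>

<xsl:template match="/xbrli:xbrl">
    <xsl:copy>
        <!-- Muenchian step 1: visit only the first element of each k1 group -->
        <xsl:for-each select="*[count(.|key('k1', concat(name(), '|', @contextRef))[1])=1]">
            <!-- Muenchian step 2: inside the group, keep the first element of each k2 group -->
            <xsl:variable name="distinct-values" select="key('k1', concat(name(), '|', @contextRef)) [count(.|key('k2', concat(name(), '|', @contextRef, '|', .))[1])=1]"/>
            <xsl:if test="count($distinct-values) &gt;= 3">
                <xsl:copy>
                    <xsl:copy-of select="@contextRef"/>
                    <xsl:for-each select="$distinct-values">
                        <!-- "value" is a hypothetical element name, created in no namespace -->
                        <xsl:element name="value">
                            <xsl:value-of select="."/>
                        </xsl:element>
                    </xsl:for-each>
                </xsl:copy>
            </xsl:if>
        </xsl:for-each>
    </xsl:copy>
</xsl:template>

</xsl:stylesheet>
```

Because `value` is created in no namespace while its copied parent carries the default XBRL namespace, a serializer will normally write it with an `xmlns=""` declaration. If that is unwanted, the element could be placed in a namespace of your own choosing instead.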

michael.hor257k

Would this stylesheet solve your problem:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"  xmlns:xbrli="http://www.xbrl.org/2003/instance" version="1.0">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes" />
<xsl:strip-space elements="*"/>

<xsl:key name="elements" match="*" use="@contextRef"/>

<xsl:template match="/xbrli:xbrl">
    <xsl:copy>
        <xsl:apply-templates select="*[@contextRef
                                        and count(key('elements', @contextRef)) = 3 
                                        and key('elements', @contextRef)[2] != key('elements', @contextRef)[3]
                                        and count(. | key('elements', @contextRef)[1]) = 1]"/>
    </xsl:copy>
</xsl:template>

<xsl:template match="*">
    <xsl:copy>
        <xsl:attribute name="id">
            <xsl:value-of select="."/>
        </xsl:attribute>
        <xsl:attribute name="value1">
            <xsl:value-of select="key('elements', @contextRef)[2]"/>
        </xsl:attribute>
        <xsl:attribute name="value2">
            <xsl:value-of select="key('elements', @contextRef)[3]"/>
        </xsl:attribute>
    </xsl:copy>
</xsl:template>
</xsl:stylesheet>

Here, a key is declared that indexes elements by @contextRef, which serves as the identifier. The first template applies templates only to the first element of each distinct @contextRef, and only when the other conditions hold: exactly 3 elements share that @contextRef, and the second and third of them have different values.

The second template matches the elements selected by the first template and creates the output: the element's own value becomes the id attribute, and the values of the second and third elements of the group become value1 and value2.
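Note that the key above uses only @contextRef, so two elements with different names but the same @contextRef would land in the same group. A possible variant that also takes the element name into account is sketched below. It is an untested adaptation rather than part of the original answer; the key name `facts`, the variable name `group` and the `'|'` separator are arbitrary choices.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"  xmlns:xbrli="http://www.xbrl.org/2003/instance" version="1.0">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes" />
<xsl:strip-space elements="*"/>

<!-- Composite key: element name plus @contextRef -->
<xsl:key name="facts" match="*[@contextRef]" use="concat(name(), '|', @contextRef)"/>

<xsl:template match="/xbrli:xbrl">
    <xsl:copy>
        <xsl:for-each select="*[@contextRef]">
            <!-- All elements with the same name and @contextRef, in document order -->
            <xsl:variable name="group" select="key('facts', concat(name(), '|', @contextRef))"/>
            <!-- Keep the group only when the current element is its first member,
                 it has exactly 3 members, and members 2 and 3 differ -->
            <xsl:if test="count(. | $group[1]) = 1
                          and count($group) = 3
                          and $group[2] != $group[3]">
                <xsl:copy>
                    <xsl:attribute name="id">
                        <xsl:value-of select="$group[1]"/>
                    </xsl:attribute>
                    <xsl:attribute name="value1">
                        <xsl:value-of select="$group[2]"/>
                    </xsl:attribute>
                    <xsl:attribute name="value2">
                        <xsl:value-of select="$group[3]"/>
                    </xsl:attribute>
                </xsl:copy>
            </xsl:if>
        </xsl:for-each>
    </xsl:copy>
</xsl:template>
</xsl:stylesheet>
```

For the input file in the question, this should emit a single eba7:mi235 element (id 78923, value1 1111, value2 2222) inside the copied root. Like the stylesheet above, it ignores groups of four or more rows.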

Lingamurthy CS
  • Hello Lingamurthy, Thanks. This works fine. Do I understand correctly that your approach is: 1) build a table of all values present, use the contextRef as the identifier. – Paul Feb 15 '15 at 15:36
  • @Paul, I believe michael.hor257k's answer does the right job. My answer doesn't see if the elements names are same. – Lingamurthy CS Feb 15 '15 at 23:31
  • @Lingamurthy: you're right that your answer doesn't take the different element names into account. I think it would be relatively easy to extend your approach to include the element name. I agree that Michael's solution is better; yours is easier to understand. – Paul Feb 16 '15 at 22:14