It seemed like an easy task but I am totally stuck now. I have the following XML:

<?xml version="1.0" encoding="UTF-8"?>
<Items>
<Item>
    <ITEM_CODE>ITEM_CODE</ITEM_CODE>
    <ITEM_NAME>ITEM_NAME</ITEM_NAME>
    <ITEM_ALTERNATE_NAME>ITEM_ALTERNATE_NAME</ITEM_ALTERNATE_NAME>
    <ITEM_CATEGORY_CODE>ITEM_CATEGORY_CODE</ITEM_CATEGORY_CODE>
</Item>
<Item>
    <ITEM_CODE>15031</ITEM_CODE>
    <ITEM_NAME>Outer Carton</ITEM_NAME>
    <ITEM_ALTERNATE_NAME/>
    <ITEM_CATEGORY_CODE>52401</ITEM_CATEGORY_CODE>
</Item>
<Item>
    <ITEM_CODE>150529</ITEM_CODE>
    <ITEM_NAME>Outer Carton</ITEM_NAME>
    <ITEM_ALTERNATE_NAME/>
    <ITEM_CATEGORY_CODE>52401</ITEM_CATEGORY_CODE>
</Item>
<Item>
    <ITEM_CODE>150999</ITEM_CODE>
    <ITEM_NAME>Outer Carton</ITEM_NAME>
    <ITEM_ALTERNATE_NAME/>
    <ITEM_CATEGORY_CODE>52401</ITEM_CATEGORY_CODE>
</Item>
<Item>
    <ITEM_CODE>150988</ITEM_CODE>
    <ITEM_NAME>test</ITEM_NAME>
    <ITEM_ALTERNATE_NAME/>
    <ITEM_CATEGORY_CODE>52401</ITEM_CATEGORY_CODE>
</Item>
</Items>

If several <ITEM_NAME> elements have the same content, each of them should be renamed with a suffix, e.g. a counter value. I came up with this XSLT:

<?xml version="1.0" encoding="UTF-8"?>

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

<xsl:output encoding="UTF-8" method="xml" indent="yes"/>
    
<xsl:key name="keyItemName" match="Item" use="concat(ITEM_CODE , '|', ITEM_NAME)"/>

<xsl:template match="@*|node()">
    <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
</xsl:template>

<xsl:template match="Items">
    <Items>
        <xsl:apply-templates select="@*|node()"/>
    </Items>    
</xsl:template>

<xsl:template match="ITEM_NAME">
    
    <xsl:for-each select="parent::Item[generate-id()=generate-id(key('keyItemName',concat(ITEM_CODE , '|', ITEM_NAME))[1])]">
        <xsl:variable name="number">
            <xsl:number/>
        </xsl:variable>
        <ITEM_NAME>
            <xsl:value-of select="concat(ITEM_NAME,'-',$number)"/>
        </ITEM_NAME>
    </xsl:for-each>
</xsl:template>

</xsl:stylesheet>

It gives me this output:

<?xml version="1.0" encoding="UTF-8"?>
<Items>
<Item>
    <ITEM_CODE>ITEM_CODE</ITEM_CODE>
    <ITEM_NAME>ITEM_NAME-1</ITEM_NAME>
    <ITEM_ALTERNATE_NAME>ITEM_ALTERNATE_NAME</ITEM_ALTERNATE_NAME>
    <ITEM_CATEGORY_CODE>ITEM_CATEGORY_CODE</ITEM_CATEGORY_CODE>
</Item>
<Item>
    <ITEM_CODE>15031</ITEM_CODE>
    <ITEM_NAME>Outer Carton-2</ITEM_NAME>
    <ITEM_ALTERNATE_NAME/>
    <ITEM_CATEGORY_CODE>52401</ITEM_CATEGORY_CODE>
</Item>
<Item>
    <ITEM_CODE>150529</ITEM_CODE>
    <ITEM_NAME>Outer Carton-3</ITEM_NAME>
    <ITEM_ALTERNATE_NAME/>
    <ITEM_CATEGORY_CODE>52401</ITEM_CATEGORY_CODE>
</Item>
<Item>
    <ITEM_CODE>150999</ITEM_CODE>
    <ITEM_NAME>Outer Carton-4</ITEM_NAME>
    <ITEM_ALTERNATE_NAME/>
    <ITEM_CATEGORY_CODE>52401</ITEM_CATEGORY_CODE>
</Item>
<Item>
    <ITEM_CODE>150988</ITEM_CODE>
    <ITEM_NAME>test-5</ITEM_NAME>
    <ITEM_ALTERNATE_NAME/>
    <ITEM_CATEGORY_CODE>52401</ITEM_CATEGORY_CODE>
</Item>
</Items>

But I expect this output:

<?xml version="1.0" encoding="UTF-8"?>
<Items>
<Item>
    <ITEM_CODE>ITEM_CODE</ITEM_CODE>
    <ITEM_NAME>ITEM_NAME</ITEM_NAME>
    <ITEM_ALTERNATE_NAME>ITEM_ALTERNATE_NAME</ITEM_ALTERNATE_NAME>
    <ITEM_CATEGORY_CODE>ITEM_CATEGORY_CODE</ITEM_CATEGORY_CODE>
</Item>
<Item>
    <ITEM_CODE>15031</ITEM_CODE>
    <ITEM_NAME>Outer Carton-2</ITEM_NAME>
    <ITEM_ALTERNATE_NAME/>
    <ITEM_CATEGORY_CODE>52401</ITEM_CATEGORY_CODE>
</Item>
<Item>
    <ITEM_CODE>150529</ITEM_CODE>
    <ITEM_NAME>Outer Carton-3</ITEM_NAME>
    <ITEM_ALTERNATE_NAME/>
    <ITEM_CATEGORY_CODE>52401</ITEM_CATEGORY_CODE>
</Item>
<Item>
    <ITEM_CODE>150999</ITEM_CODE>
    <ITEM_NAME>Outer Carton-4</ITEM_NAME>
    <ITEM_ALTERNATE_NAME/>
    <ITEM_CATEGORY_CODE>52401</ITEM_CATEGORY_CODE>
</Item>
<Item>
    <ITEM_CODE>150988</ITEM_CODE>
    <ITEM_NAME>test</ITEM_NAME>
    <ITEM_ALTERNATE_NAME/>
    <ITEM_CATEGORY_CODE>52401</ITEM_CATEGORY_CODE>
</Item>
</Items>

In the last <Item>, ITEM_NAME should not be renamed because its value is not duplicated (it is not "Outer Carton"). The ITEM_NAME in the first <Item> should not be renamed either.

Nimantha
Peter

2 Answers

1

Using preceding:: or preceding-sibling:: to count prior instances is not very efficient computationally, but I don't see a way around it here. The approach below at least has the benefit that it only counts preceding instances after checking (with a key, which is very quick) that there are other items with the same name:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output encoding="UTF-8" method="xml" indent="yes"/>

  <xsl:key name="keyItemName" match="ITEM_NAME" use="."/>

  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="Items">
    <Items>
      <xsl:apply-templates select="@*|node()"/>
    </Items>
  </xsl:template>

  <xsl:template match="ITEM_NAME">
    <xsl:copy>
      <xsl:value-of select="." />
      <xsl:if test="count(key('keyItemName', .)) > 1">
        <xsl:value-of select="concat('-', count(preceding::ITEM_NAME[. = current()]) + 2)"/>
      </xsl:if>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>

When run on your sample input, this produces:

<Items>
  <Item>
    <ITEM_CODE>ITEM_CODE</ITEM_CODE>
    <ITEM_NAME>ITEM_NAME</ITEM_NAME>
    <ITEM_ALTERNATE_NAME>ITEM_ALTERNATE_NAME</ITEM_ALTERNATE_NAME>
    <ITEM_CATEGORY_CODE>ITEM_CATEGORY_CODE</ITEM_CATEGORY_CODE>
  </Item>
  <Item>
    <ITEM_CODE>15031</ITEM_CODE>
    <ITEM_NAME>Outer Carton-2</ITEM_NAME>
    <ITEM_ALTERNATE_NAME />
    <ITEM_CATEGORY_CODE>52401</ITEM_CATEGORY_CODE>
  </Item>
  <Item>
    <ITEM_CODE>150529</ITEM_CODE>
    <ITEM_NAME>Outer Carton-3</ITEM_NAME>
    <ITEM_ALTERNATE_NAME />
    <ITEM_CATEGORY_CODE>52401</ITEM_CATEGORY_CODE>
  </Item>
  <Item>
    <ITEM_CODE>150999</ITEM_CODE>
    <ITEM_NAME>Outer Carton-4</ITEM_NAME>
    <ITEM_ALTERNATE_NAME />
    <ITEM_CATEGORY_CODE>52401</ITEM_CATEGORY_CODE>
  </Item>
  <Item>
    <ITEM_CODE>150988</ITEM_CODE>
    <ITEM_NAME>test</ITEM_NAME>
    <ITEM_ALTERNATE_NAME />
    <ITEM_CATEGORY_CODE>52401</ITEM_CATEGORY_CODE>
  </Item>
</Items>
JLRishe
  • Hello JLRishe, Thank you for your answer, it works perfectly. I already thought something with my key was not right since I only want to check ITEM_NAME but not anything else. Best regards, Peter – Peter Feb 14 '13 at 13:45
  • You're welcome, but I think I may have misunderstood your requirements. My answer has a counter (starting at 2) to number identical names. So if you had some other bunch of items with the same names, those would _also_ be numbered 2, 3, 4. But after looking at Tim C's answer, I'm thinking you wanted to number the items from the first item to the last, but only show the number if the name is a duplicate. Is that correct? – JLRishe Feb 14 '13 at 14:00
  • Hello, your solution works fine but since Tim's version only uses keys and no preceding-construct I will go with his solution because the actual XML I have is huge. But I did not get your point: if we only match ITEM_NAME no other items can be changed. Like I said your XSLT works fine, performance wise Tim's is better. Best regards, Peter (+1) – Peter Feb 15 '13 at 08:11
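
To make the difference raised in the comments above concrete, here is a small hypothetical input (not taken from the question) with two groups of duplicate names. The XML comments list the suffixes each approach would be expected to assign: a per-name counter starting at 2, as in the stylesheet above, versus the position of the enclosing Item, as produced by xsl:number on Item. On the question's sample the two happen to coincide, because the only duplicated name starts at the second Item.

```xml
<!-- Hypothetical input: "Outer Carton" and "Inner Box" each occur twice -->
<Items>
  <Item><ITEM_NAME>Outer Carton</ITEM_NAME></Item>
  <Item><ITEM_NAME>Inner Box</ITEM_NAME></Item>
  <Item><ITEM_NAME>Outer Carton</ITEM_NAME></Item>
  <Item><ITEM_NAME>Inner Box</ITEM_NAME></Item>
</Items>

<!-- Per-name counter, count(preceding::ITEM_NAME[. = current()]) + 2:
     Outer Carton-2, Inner Box-2, Outer Carton-3, Inner Box-3 -->

<!-- Position of the parent Item, via xsl:number:
     Outer Carton-1, Inner Box-2, Outer Carton-3, Inner Box-4 -->
```

Which numbering is wanted depends on whether the suffix should identify the item within its group of duplicates or within the whole list.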
0

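Your current key joins ITEM_CODE and ITEM_NAME. Since every ITEM_CODE in your sample is different, each Item ends up alone in its own group, so every Item is treated as a "first" occurrence and gets a suffix. It looks like you only want ITEM_NAME here: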

<xsl:key name="keyItemName" match="ITEM_NAME" use="."/>

It also looks like you want the number in the suffix to be based on the position of the parent Item element. One way to achieve this is to have a template that matches the Item element and passes the number as a parameter to the subsequent matching templates:

<xsl:template match="Item">
   <Item>
      <xsl:apply-templates select="@*|node()">
          <xsl:with-param name="number">
            <xsl:number/>
         </xsl:with-param>
      </xsl:apply-templates>
   </Item>
</xsl:template>

Then you need a template to match ITEM_NAME elements for which a duplicate occurs. This can be done simply by checking that there is at least a second element in the group for the key:

<xsl:template match="ITEM_NAME[key('keyItemName', .)[2]]">
   <xsl:param name="number"/>
   <!-- output ITEM_NAME with the suffix here -->
</xsl:template>

Then, you can just output the element with the suffix.

Here is the full XSLT:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
   <xsl:output encoding="UTF-8" method="xml" indent="yes"/>

   <xsl:key name="keyItemName" match="ITEM_NAME" use="."/>

   <xsl:template match="@*|node()">
      <xsl:copy>
         <xsl:apply-templates select="@*|node()"/>
      </xsl:copy>
   </xsl:template>

   <xsl:template match="Item">
      <Item>
         <xsl:apply-templates select="@*|node()">
            <xsl:with-param name="number">
               <xsl:number/>
            </xsl:with-param>
         </xsl:apply-templates>
      </Item>
   </xsl:template>

   <xsl:template match="ITEM_NAME[key('keyItemName', .)[2]]">
      <xsl:param name="number"/>
      <ITEM_NAME>
         <xsl:value-of select="concat(.,'-',$number)"/>
      </ITEM_NAME>
   </xsl:template>
</xsl:stylesheet>

When applied to your XML, the following is output

<Items>
   <Item>
      <ITEM_CODE>ITEM_CODE</ITEM_CODE>
      <ITEM_NAME>ITEM_NAME</ITEM_NAME>
      <ITEM_ALTERNATE_NAME>ITEM_ALTERNATE_NAME</ITEM_ALTERNATE_NAME>
      <ITEM_CATEGORY_CODE>ITEM_CATEGORY_CODE</ITEM_CATEGORY_CODE>
   </Item>
   <Item>
      <ITEM_CODE>15031</ITEM_CODE>
      <ITEM_NAME>Outer Carton-2</ITEM_NAME>
      <ITEM_ALTERNATE_NAME/>
      <ITEM_CATEGORY_CODE>52401</ITEM_CATEGORY_CODE>
   </Item>
   <Item>
      <ITEM_CODE>150529</ITEM_CODE>
      <ITEM_NAME>Outer Carton-3</ITEM_NAME>
      <ITEM_ALTERNATE_NAME/>
      <ITEM_CATEGORY_CODE>52401</ITEM_CATEGORY_CODE>
   </Item>
   <Item>
      <ITEM_CODE>150999</ITEM_CODE>
      <ITEM_NAME>Outer Carton-4</ITEM_NAME>
      <ITEM_ALTERNATE_NAME/>
      <ITEM_CATEGORY_CODE>52401</ITEM_CATEGORY_CODE>
   </Item>
   <Item>
      <ITEM_CODE>150988</ITEM_CODE>
      <ITEM_NAME>test</ITEM_NAME>
      <ITEM_ALTERNATE_NAME/>
      <ITEM_CATEGORY_CODE>52401</ITEM_CATEGORY_CODE>
   </Item>
</Items>
Tim C
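
A possible variation on the parameter-passing approach, offered as a sketch rather than as part of either answer: xsl:number accepts a count pattern, so the ITEM_NAME template can ask for the number of its enclosing Item directly, and the separate Item template and the $number parameter are no longer needed. This assumes the same structure as the question (ITEM_NAME inside Item, Item elements as siblings under Items). It makes no claim about speed, since how efficiently xsl:number is evaluated depends on the processor.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
   <xsl:output encoding="UTF-8" method="xml" indent="yes"/>

   <!-- Group ITEM_NAME elements by their text content -->
   <xsl:key name="keyItemName" match="ITEM_NAME" use="."/>

   <!-- Identity template: copy everything unchanged by default -->
   <xsl:template match="@*|node()">
      <xsl:copy>
         <xsl:apply-templates select="@*|node()"/>
      </xsl:copy>
   </xsl:template>

   <!-- Only names that occur at least twice get a suffix -->
   <xsl:template match="ITEM_NAME[key('keyItemName', .)[2]]">
      <xsl:copy>
         <xsl:value-of select="."/>
         <xsl:text>-</xsl:text>
         <!-- count="Item" numbers the enclosing Item among its Item siblings -->
         <xsl:number count="Item"/>
      </xsl:copy>
   </xsl:template>
</xsl:stylesheet>
```

On the sample input from the question this should yield the same suffixes as the stylesheet above (Outer Carton-2, Outer Carton-3, Outer Carton-4), with the first and last ITEM_NAME left unchanged.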