My Input

<FinInstrmRptgTxRpt> 
<Tx><New><TxId>61810</TxId><ExctgPty>ABC</ExctgPty></New></Tx> 
<Tx><New><TxId>618101</TxId><ExctgPty>ABC</ExctgPty></New></Tx> 
<Tx><New><TxId>61810</TxId><ExctgPty>ABX</ExctgPty></New></Tx> 
<Tx><New><TxId>618102</TxId><ExctgPty>XYZ</ExctgPty></New></Tx> 
<Tx><New><TxId>618102</TxId><ExctgPty>XYZ</ExctgPty></New></Tx> 
<Tx><New><TxId>61810</TxId><ExctgPty>XYZ</ExctgPty></New></Tx>
</FinInstrmRptgTxRpt>

Output should look like

<FinInstrmRptgTxRpt> 
<Tx><New><TxId>618101</TxId><ExctgPty>ABC</ExctgPty></New></Tx> 
<Tx><New><TxId>618102</TxId><ExctgPty>XYZ</ExctgPty></New></Tx> 
<Tx><New><TxId>61810</TxId><ExctgPty>XYZ</ExctgPty></New></Tx>
</FinInstrmRptgTxRpt>

In short, I would like to remove duplicate Tx elements from the XML based on TxId, keeping only the last occurrence of each duplicated TxId.

I tried the code below, but for some reason the duplicates are not removed from the output. The behaviour I want is the same as "keep last" de-duplication on a Python dataframe.

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

 <xsl:key name="TxIdKeyList" match="Tx" use="TxId"/>
 
 <xsl:template match="node()|@*">
     <xsl:copy>
       <xsl:apply-templates select="node()|@*"/>
     </xsl:copy>
 </xsl:template>

 <xsl:template match=
 "Tx[ not( generate-id(current()) 
 = 
            generate-id(
                key('TxIdKeyList', 'TxId')[last()])
            )
      ]"/>
</xsl:stylesheet>
Kal
  • "Keep the last line"? Your samples as shown have one or two lines only so don't tell us about lines you want to keep or remove, at least name the XML elements. And it would help us understand the problem if you showed the input data in a well formatted way to easily allow us to spot duplicates. And if you use Python and dataframes, why the XSLT 3 tag? – Martin Honnen Mar 26 '22 at 17:43
  • See if this helps: https://stackoverflow.com/a/68212390/3016153 – michael.hor257k Mar 26 '22 at 18:52
  • I have now improved the readability of the post. Apologies for the earlier errors in the data. We are not parsing this data in Python, so I need to do this in XSLT. I feel I am almost there; there is just a small bug somewhere in the generate-id line. – Kal Mar 26 '22 at 22:06
  • Why are you using Muenchian grouping for this, if it's XSLT 3.0? – Michael Kay Mar 26 '22 at 23:20
  • I am not very familiar with xslt, so reading and trying to solve this problem at hand. – Kal Mar 26 '22 at 23:28
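
On the XSLT 3.0 remark in the comments above: with an XSLT 2.0 or 3.0 processor, the grouping can be written with xsl:for-each-group instead of a key. The following is only a sketch under that assumption, not code posted in the thread; it relies on the element names from the sample input (FinInstrmRptgTxRpt, Tx, New, TxId).

```xml
<xsl:stylesheet version="3.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Rebuild the report element, emitting one Tx per distinct TxId -->
  <xsl:template match="FinInstrmRptgTxRpt">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <!-- Group the Tx children by the TxId nested inside New -->
      <xsl:for-each-group select="Tx" group-by="New/TxId">
        <!-- Keep only the last Tx of each group -->
        <xsl:copy-of select="current-group()[last()]"/>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

One caveat: xsl:for-each-group processes groups in order of first appearance, so for the sample input the surviving 61810 entry would be written first rather than last. If the position of the last occurrence matters, a key-based template that suppresses the earlier duplicates keeps the original document order.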

2 Answers

TxId is not a child of Tx; it is nested inside New. So try use="New/TxId" in the xsl:key definition.

Also, I think key('TxIdKeyList', 'TxId') should be key('TxIdKeyList', New/TxId).

Michael Kay

The key should be <xsl:key name="TxIdKeyList" match="Tx" use="New/TxId"/>, and the template should be

<xsl:template match="Tx[not(generate-id() = generate-id(key('TxIdKeyList', New/TxId)[last()]))]"/>

current() is not allowed in an XSLT 1.0 match pattern, and the second argument of the key function is an XPath expression, usually not a string literal.
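
Putting the corrected key and template together with the identity template from the question gives a complete stylesheet. This is a sketch assembled from the pieces above, assuming the input has no namespace declaration, as in the sample shown in the question.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Index every Tx by the TxId nested inside its New child -->
  <xsl:key name="TxIdKeyList" match="Tx" use="New/TxId"/>

  <!-- Identity template: copy everything that is not handled elsewhere -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- Suppress any Tx that is not the last one sharing its TxId.
       The second argument of key() is the path New/TxId, evaluated
       relative to the Tx being matched, not a string literal. -->
  <xsl:template match="Tx[not(generate-id() =
                              generate-id(key('TxIdKeyList', New/TxId)[last()]))]"/>

</xsl:stylesheet>
```

Applied to the sample input, the empty template removes the first and third Tx (TxId 61810) and the fourth Tx (the first 618102), leaving the 618101, 618102 and 61810 entries in their original order.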

Martin Honnen
  • No change in the output, please see below – Kal Mar 27 '22 at 08:57
  • @Kal, get some coffee, I posted the corrected template and explained "the second argument of the key function is an XPath expression, usually not a string literal", yet you make the same error again in your attempt where you put a meaningless string literal as the second argument of your key call `key('TxIdKeyList', 'New/TxId')`. My suggestion does work: https://xsltfiddle.liberty-development.net/eiorv2o, your errors are your own. – Martin Honnen Mar 27 '22 at 09:37
  • Sorry, I tried without quotes before but that did not work, so I added single quotes thinking the function needed them. Please see https://xsltfiddle.liberty-development.net/bF2MmYm/1 – Kal Mar 27 '22 at 09:42
  • @Kal, that is quite a different input with a default namespace declaration changing the scenario. Read up on any answer here how to deal with a default namespace in XSLT 1, then you should know how to adapt the XSLT to the changed input. – Martin Honnen Mar 27 '22 at 09:45
  • This has helped enormously. Thank you all for your help. – Kal Mar 27 '22 at 10:22
  • https://xsltfiddle.liberty-development.net/bF2MmYm/3 is final version – Kal Mar 27 '22 at 10:22
  • @Kal, if you are using XSLT 3 anyway I would suggest rewriting the `generate-id` based comparison into one using the `is` operator. – Martin Honnen Mar 27 '22 at 15:50
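
As an illustration of the last suggestion, here is a sketch of what the `is`-based comparison could look like in XSLT 3.0. It is not necessarily the exact code meant in the comment; it assumes the same no-namespace sample input and reuses the key name from the question.

```xml
<xsl:stylesheet version="3.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- XSLT 3.0 shorthand for the identity transformation -->
  <xsl:mode on-no-match="shallow-copy"/>

  <!-- Index every Tx by the TxId nested inside its New child -->
  <xsl:key name="TxIdKeyList" match="Tx" use="New/TxId"/>

  <!-- "is" compares node identity directly, so generate-id() is not needed:
       drop each Tx that is not the last one returned for its TxId -->
  <xsl:template match="Tx[not(. is key('TxIdKeyList', New/TxId)[last()])]"/>

</xsl:stylesheet>
```

For input that declares a default namespace, as discussed in the comments above, XSLT 2.0 and later allow an `xpath-default-namespace` attribute on `xsl:stylesheet` so that unprefixed names in patterns and expressions refer to that namespace. In XSLT 1.0 the namespace has to be bound to a prefix and that prefix used in every path and pattern.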