First off, Happy Holidays everyone! I recently started a new project that requires me to work with a large XML file, approximately 20k lines. It is hard to work with because it's not sorted, and that has caused a lot of duplicate properties to appear in the nodes. I want to fix this. Using XSLT + Notepad++ should make this easy, except I am having a difficult time finding information about XSLT sorting when you do not know every single property and it's formatted like <property name="" />.

What the file looks like:

<Items>
     <item id="1">
          <prop name="c">
          <prop name="a">
          <event name="c">
          <event name="a">
          <prop class="b">
               <prop name="a">
               <prop name="c">
          </prop>
          <prop class="a">
               <prop name="b">
               <prop name="a">
          </prop>
     </item>
</items>

What I would like the file to look like:

<Items>
     <item id="1">
          <prop name="a">
          <prop name="c">
          <event name="a">
          <event name="c">
          <prop class="a">
               <prop name="a">
               <prop name="b">
          </prop>
          <prop class="b">
               <prop name="a">
               <prop name="c">
          </prop>
     </item>
</items>

I just want to sort the <prop name="..."> elements by the value of their name attribute, and then do the same thing inside each <prop class="...">.

-- Update -- Ok, so I didn't want to post parts of the actual XML because it's usually easy to swap the code out; however, I have been playing with this for the last few hours and can't seem to get it working.

Here are three different types of items from the XML.

<block id="1" name="stone">
    <property name="Material" value="stone"/>
    <property name="Shape" value="Terrain"/>
    <property name="Mesh" value="terrain"/>
    <property name="Texture" value="1"/>
    <property name="Weight" value="100"/>
    <property name="DropScale" value="2"/>
    <property name="LPHardnessScale" value="2"/>
    <drop event="Harvest" name="rockSmall" count="125"/>
    <drop event="Harvest" name="ironFragment" count="5"/>
    <drop event="Destroy" name="rockSmall" count="50"/>
    <drop event="Fall" name="destroyedStone" count="1" prob="1.0" stick_chance=".75"/>
</block>

<block id="154" name="metalReinforcedWoodWedge60">
    <property name="Material" value="metal"/>
    <property name="Shape" value="Wedged60Full"/>
    <property name="Texture" value="380"/>
    <property name="Collide" value="movement,rocket,melee"/>
    <property name="FuelValue" value="100"/>
    <drop event="Destroy" name="woodDebris" count="1"/>
    <property name="CanMobsSpawnOn" value="false"/>
    <drop event="Fall" name="woodDebris" count="1" prob="1.0" stick_chance=".75"/>
    <property class="UpgradeBlock">
        <property name="ToBlock" value="scrapIronWedge60"/>
        <property name="Item" value="scrapIron"/>
        <property name="ItemCount" value="10"/>
        <property name="UpgradeHitCount" value="4"/>
    </property>
    <property name="DowngradeBlock" value="reinforcedWoodWedge60"/>
    <property class="RepairItems">
        <property name="scrapIron" value="10"/>
    </property>
    <property name="Group" value="Building,Basics"/>
</block>

<block id="1146" name="cottonYoung">
    <property name="Class" value="PlantGrowing"/>
    <property name="Material" value="plants"/>
    <property name="Shape" value="BillboardPlant"/>
    <property name="Mesh" value="grass"/>
    <property name="Texture" value="20"/>
    <property name="Collide" value="melee"/>
    <property name="CanDecorateOnSlopes" value="true"/>
    <property name="IsTerrainDecoration" value="true"/>
    <property class="PlantGrowing">
        <property name="Next" value="cotton"/>
        <property name="GrowthRate" value="60"/>
        <property name="IsRandom" value="false"/>
        <property name="FertileLevel" value="1"/>
    </property>
</block>
Nick W.
  • Can you explain which value in the data you want to sort on? How would the result look for that sample you have posted (well, if you first fix it to be well-formed XML)? – Martin Honnen Dec 22 '15 at 17:57
  • You have not closed a single `prop` element. Can there be ``, do you then want to group the `prop` elements with a `name` attribute together, e.g. get ``? Or do you only want to sort adjacent `prop` and `event` elements? – Martin Honnen Dec 22 '15 at 18:19
  • 20K lines is pretty small these days as XML goes. – Michael Kay Dec 22 '15 at 18:20

3 Answers

Maybe something along the lines of:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <xsl:for-each select="//item">
      <xsl:for-each select="prop">
        <xsl:sort select="@name | @class"/>
        <xsl:copy-of select="."/>
      </xsl:for-each>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>

This is probably not exactly what you're looking for, but your question is also a bit fuzzy. It should give you a good enough starting point to build your own solution, though.

dret

Assuming XSLT 2.0 (needs Saxon 9, XmlPrime or another XSLT 2.0 processor) and an input like

<Items>
     <item id="1">
          <prop name="c"/>
          <prop name="a"/>
          <event name="c"/>
          <event name="a"/>
          <prop class="b">
               <prop name="a"/>
               <prop name="c"/>
          </prop>
          <prop class="a">
               <prop name="b"/>
               <prop name="a"/>
          </prop>
     </item>
</Items>

the code

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

<xsl:strip-space elements="*"/>
<xsl:output indent="yes"/>

<xsl:template match="@* | node()">
  <xsl:copy>
    <xsl:apply-templates select="@* , node()"/>
  </xsl:copy>
</xsl:template>

<xsl:template match="item[prop] | prop[prop]">
  <xsl:copy>
    <xsl:copy-of select="@*"/>
    <xsl:for-each-group select="*" group-adjacent="node-name(.)">
      <xsl:apply-templates select="current-group()">
        <xsl:sort select="@*"/> <!-- if there can be more than one attribute on a single child make that select="@name | @class" -->
      </xsl:apply-templates>
    </xsl:for-each-group>
  </xsl:copy>
</xsl:template>

</xsl:stylesheet>

creates the result

<Items>
   <item id="1">
      <prop name="a"/>
      <prop name="c"/>
      <event name="a"/>
      <event name="c"/>
      <prop class="a">
         <prop name="a"/>
         <prop name="b"/>
      </prop>
      <prop class="b">
         <prop name="a"/>
         <prop name="c"/>
      </prop>
   </item>
</Items>

Adapted to your new sample the template doing the work would change to

<xsl:template match="block[property] | property[property]">
  <xsl:copy>
    <xsl:copy-of select="@*"/>
    <xsl:for-each-group select="*" group-adjacent="node-name(.)">
      <xsl:apply-templates select="current-group()">
        <xsl:sort select="@name | @class"/>
      </xsl:apply-templates>
    </xsl:for-each-group>
  </xsl:copy>
</xsl:template>
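
If same-named elements are not adjacent (as in the second `block` sample, where `drop` elements sit between `property` elements), `group-adjacent` sorts each run separately. A possible variant, shown here as a complete stylesheet, uses `group-by` so that all children with the same element name are collected before sorting. This is a sketch that has not been run against the full file, and it assumes no child carries both a `name` and a `class` attribute.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

<xsl:strip-space elements="*"/>
<xsl:output indent="yes"/>

<!-- identity template: copy everything that is not handled below -->
<xsl:template match="@* | node()">
  <xsl:copy>
    <xsl:apply-templates select="@* , node()"/>
  </xsl:copy>
</xsl:template>

<xsl:template match="block[property] | property[property]">
  <xsl:copy>
    <xsl:copy-of select="@*"/>
    <!-- group-by collects all children with the same element name,
         adjacent or not; groups come out in order of first occurrence -->
    <xsl:for-each-group select="*" group-by="node-name(.)">
      <xsl:apply-templates select="current-group()">
        <!-- assumes each child has either @name or @class, not both -->
        <xsl:sort select="@name | @class"/>
      </xsl:apply-templates>
    </xsl:for-each-group>
  </xsl:copy>
</xsl:template>

</xsl:stylesheet>
```

With this variant, all `property` children of a block come out as one sorted run, followed by all `drop` children sorted by `name`. Because the sort key is `@name | @class`, nested `property class="..."` elements are interleaved alphabetically with the named ones rather than placed after them.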
Martin Honnen
  • Your code works perfectly for the example I put! You're amazing; however, I couldn't tweak it to make it work with my actual XML. I thought the example code represented the actual XML quite well, but it doesn't seem to work right. – Nick W. Dec 23 '15 at 13:39
  • Well, what should happen with a sequence like ` `, do you want to move `` up before the ``? – Martin Honnen Dec 23 '15 at 13:44
  • @user1451070, I have edited the answer to show how to adapt it to the code you have added to your question. The problem outlined in the previous comment remains, I am not sure how you want to sort a mixed sequence of elements. – Martin Honnen Dec 23 '15 at 14:03
  • *hangs head in shame* Yeah, I think I tried every combination I could think of and instead forgot that I changed "block" to "item"... – Nick W. Dec 23 '15 at 14:53
  • Still having an issue, when the property block includes "" it looks like it doesn't want to sort it. Is there a way to sort it by group i suppose? Such as group/Sort all the "" together, Group/sort the "" together, then put the drops together. – Nick W. Dec 23 '15 at 15:57
  • Try whether changing `group-adjacent="node-name(.)"` to `group-by="node-name(.)"` gives you the result you want. It will group all elements of the same name together, not only the adjacent ones. – Martin Honnen Dec 23 '15 at 16:02
  • Haha, works like a champ. I actually just figured this out right before you posted this, thanks to one of your examples in a different post: "http://stackoverflow.com/questions/19115109/how-to-use-for-each-group-in-xsl" Thank you! You're awesome. Now I just need to do some more research to figure out how this code works lol :) – Nick W. Dec 23 '15 at 16:16

If you want to sort the <prop> nodes within an <item> node and also sort the <prop> nodes within a <prop class="..."> node, here is an approach:

<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="xml" version="1.0" encoding="ISO-8859-1"/>

<xsl:template match="Items">
<root>
    <xsl:apply-templates select="item">
    </xsl:apply-templates>
</root>
</xsl:template>

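<xsl:template match="item">
    <item id="{@id}">
    <!-- plain props first, sorted by name -->
    <xsl:apply-templates select="prop[not(@class)]">
        <xsl:sort select="@name" data-type="text" order="ascending"/>
    </xsl:apply-templates>
    <!-- then events, sorted by name -->
    <xsl:apply-templates select="event">
        <xsl:sort select="@name" data-type="text" order="ascending"/>
    </xsl:apply-templates>
    <!-- then props that carry a class, sorted by class -->
    <xsl:apply-templates select="prop[@class]">
        <xsl:sort select="@class" data-type="text" order="ascending"/>
    </xsl:apply-templates>
    </item>
</xsl:template>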

<xsl:template match="prop|event">
    <xsl:if test="@class">
        <prop class="{@class}">
            <xsl:apply-templates select="prop">
                <xsl:sort select="@name" data-type="text" order="ascending"/>
            </xsl:apply-templates>
        </prop>
    </xsl:if>
    <xsl:if test="not(@class)">
        <xsl:copy-of select="."/>
    </xsl:if>
</xsl:template>

</xsl:stylesheet>
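
The same XSLT 1.0 idea could be adapted to the `block`/`property` sample from the question's update. The following is only a sketch under a few assumptions: each `property` has either a `name` or a `class` attribute, the blocks sit under some root element, and it is acceptable for everything that is not a `property` (such as `drop` elements or comments) to be moved after the properties in its original order.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="xml" indent="yes"/>
<xsl:strip-space elements="*"/>

<!-- identity template: copies anything not matched more specifically -->
<xsl:template match="@* | node()">
  <xsl:copy>
    <xsl:apply-templates select="@* | node()"/>
  </xsl:copy>
</xsl:template>

<!-- a block, or a property that contains nested properties -->
<xsl:template match="block | property[property]">
  <xsl:copy>
    <xsl:copy-of select="@*"/>
    <!-- simple properties, sorted by name -->
    <xsl:apply-templates select="property[not(@class)]">
      <xsl:sort select="@name"/>
    </xsl:apply-templates>
    <!-- nested property groups, sorted by class -->
    <xsl:apply-templates select="property[@class]">
      <xsl:sort select="@class"/>
    </xsl:apply-templates>
    <!-- every other child (e.g. drop), kept in document order -->
    <xsl:apply-templates select="node()[not(self::property)]"/>
  </xsl:copy>
</xsl:template>

</xsl:stylesheet>
```

Because it relies on the identity template instead of rebuilding each element by hand, attributes such as `id`, `name` and `value` are carried over without listing them, and the nested `property class="..."` groups are sorted by the same template through recursion.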
Little Santi