
I would like to use XSLT to transform some XML into JSON.
The XML looks like the following:

<DATA_DS>
    <G_1>
        <ORGANIZATION_NAME>My Company 1</ORGANIZATION_NAME>
        <ORGANIZATIONID>901</ORGANIZATIONID>
        <ITEMNUMBER>20001</ITEMNUMBER>
        <ITEMDESCRIPTION>Item Description 1</ITEMDESCRIPTION>
    </G_1>
    <G_1>
        <ORGANIZATION_NAME>My Company 1</ORGANIZATION_NAME>
        <ORGANIZATIONID>901</ORGANIZATIONID>
        <ITEMNUMBER>20002</ITEMNUMBER>
        <ITEMDESCRIPTION>Item Description 2</ITEMDESCRIPTION>
    </G_1>
    <G_1>
        <ORGANIZATION_NAME>My Company 1</ORGANIZATION_NAME>
        <ORGANIZATIONID>901</ORGANIZATIONID>
        <ITEMNUMBER>20003</ITEMNUMBER>
        <ITEMDESCRIPTION>Item Description 3</ITEMDESCRIPTION>
    </G_1>
</DATA_DS>

I expect the JSON to look like the following:

    [
        {
            "Item_Number":"20001",
            "Item_Description":"Item Description 1"
        },
        {
            "Item_Number":"20002",
            "Item_Description":"Item Description 2"
        },
        {
            "Item_Number":"20003",
            "Item_Description":"Item Description 3"
        }
    ]

What is the recommended way to do this?

I am considering two approaches:

  1. Use the fn:xml-to-json function, as defined at https://www.w3.org/TR/xpath-functions-31/#func-xml-to-json. As I understand it, though, the input XML must follow the specific format defined at https://www.w3.org/TR/xpath-functions-31/schema-for-json.xsd, and I also need the field names in the output JSON to be exactly "Item_Number" and "Item_Description".

  2. Manually code the bracket and brace characters, "[", "]", "{", and "}", along with the field names "Item_Number" and "Item_Description". Then use a standard function to list the values and ensure that any special characters are handled properly. For example, the "&" character should appear normally in the JSON output.

What is the recommended way to do this, or is there a better way that I have not considered?

Michael Kay
Henry H

3 Answers


I would take the first approach, but start with transforming the given input to the XML format expected by the xml-to-json() function. This could be something like:

XSLT 3.0

<xsl:stylesheet version="3.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns="http://www.w3.org/2005/xpath-functions">
<xsl:output method="text" encoding="UTF-8"/>

<xsl:template match="/G_1">
    <!-- CONVERT INPUT TO XML FOR JSON -->
    <xsl:variable name="xml">
        <array>
            <xsl:for-each-group select="*" group-starting-with="ORGANIZATION_NAME">
                <map>
                    <string key="Item_Number">
                        <xsl:value-of select="current-group()[self::ITEMNUMBER]"/>
                    </string>
                    <string key="Item_Description">
                        <xsl:value-of select="current-group()[self::ITEMDESCRIPTION]"/>
                    </string>
                </map>
            </xsl:for-each-group>
        </array>
    </xsl:variable>
    <!-- OUTPUT -->
    <xsl:value-of select="xml-to-json($xml)"/>
</xsl:template>

</xsl:stylesheet>

Demo: https://xsltfiddle.liberty-development.net/bFWR5DQ
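To make the intermediate step concrete, here is a sketch of the tree that the $xml variable would hold just before xml-to-json() is called. It assumes the stylesheet above has run against input in which each item starts with an ORGANIZATION_NAME element; it is indented here for readability, and only the first two items are written out.

```xml
<array xmlns="http://www.w3.org/2005/xpath-functions">
    <map>
        <!-- the key attribute becomes the JSON property name -->
        <string key="Item_Number">20001</string>
        <string key="Item_Description">Item Description 1</string>
    </map>
    <map>
        <string key="Item_Number">20002</string>
        <string key="Item_Description">Item Description 2</string>
    </map>
    <!-- one further map element per remaining item -->
</array>
```

The output field names are set freely through the key attributes, so they do not have to match the source element names.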

michael.hor257k
  • Thank you, michael.hor257k. Actually, I realized later that my original XML had only one set of G_1 tags, and it was missing the DATA_DS tags on the outside. But I was able to change your solution to match the revised XML input. – Henry H Sep 25 '19 at 19:34
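  • Here is the original XML before I revised it: <G_1><ORGANIZATION_NAME>My Company 1</ORGANIZATION_NAME><ORGANIZATIONID>901</ORGANIZATIONID><ITEMNUMBER>20001</ITEMNUMBER><ITEMDESCRIPTION>Item Description 1</ITEMDESCRIPTION><ORGANIZATION_NAME>My Company 1</ORGANIZATION_NAME><ORGANIZATIONID>901</ORGANIZATIONID><ITEMNUMBER>20002</ITEMNUMBER><ITEMDESCRIPTION>Item Description 2</ITEMDESCRIPTION><ORGANIZATION_NAME>My Company 1</ORGANIZATION_NAME><ORGANIZATIONID>901</ORGANIZATIONID><ITEMNUMBER>20003</ITEMNUMBER><ITEMDESCRIPTION>Item Description 3</ITEMDESCRIPTION></G_1> – Henry H Sep 25 '19 at 21:16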

For simple mappings like that, you can also directly construct XPath 3.1 arrays and maps, i.e. in this case an array of maps:

  <xsl:template match="DATA_DS">
      <xsl:sequence select="array { G_1 ! map { 'Item_Number' : string(ITEMNUMBER), 'Item_Description' : string(ITEMDESCRIPTION) } }"/>
  </xsl:template>

Then serialize as JSON with <xsl:output method="json" indent="yes"/>: https://xsltfiddle.liberty-development.net/ejivdGS

The main disadvantage is that maps are unordered, so you can't control the order of the entries in a map. For instance, in that example, with the Saxon version used, Item_Description is output before Item_Number.

But in general, transforming to the format for xml-to-json provides more flexibility and also lets you control the order, as the function preserves the order in the XML representation of JSON.
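Put together, the map-based approach could look like the following complete stylesheet. This is a minimal sketch assuming an XSLT 3.0 processor with XPath 3.1 support and the DATA_DS/G_1 input from the question; the order of the two properties within each object is not guaranteed.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <!-- Serialize the single array item produced below as JSON -->
    <xsl:output method="json" indent="yes"/>

    <xsl:template match="/DATA_DS">
        <!-- One map per G_1 element, collected into an array -->
        <xsl:sequence select="array {
            G_1 ! map {
                'Item_Number' : string(ITEMNUMBER),
                'Item_Description' : string(ITEMDESCRIPTION)
            }
        }"/>
    </xsl:template>

</xsl:stylesheet>
```

The string() calls atomize each element so that the map values are plain strings rather than element nodes, which the JSON output method would otherwise have to serialize as markup.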

Martin Honnen
  • Note that Saxon has an extension which can be used to control the order of properties during JSON serialization. – Michael Kay Sep 25 '19 at 20:48
  • @MichaelKay The order of the properties has no meaning to any system that would use that JSON object -- as the order of map entries also makes little sense and is not guaranteed. When we want to preserve a given ordering, the way to do this is to use arrays -- both in a JSON object or in the value of a map entry – Dimitre Novatchev Sep 27 '19 at 17:21
  • The order makes no difference to software that's reading the data, but it makes a huge difference to any human readers. I added this feature because I have a vocabulary where objects typically have 9 simple-valued properties and 1 tree-valued property (rather like attributes and children in XML), and finding your way around the data is vastly easier if the tree-valued property is output last. We recognise the need for indentation, after all, for human readability, and I found that consistent property order is just as important for readability as indentation. – Michael Kay Sep 27 '19 at 19:03

This is the result of taking the solution posted by michael.hor257k and applying it to my revised input XML:

<xsl:stylesheet version="3.0" 
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/2005/xpath-functions">
    <xsl:output method="text" encoding="UTF-8"/>

    <xsl:template match="/DATA_DS">
        <!-- CONVERT INPUT TO XML FOR JSON -->
        <xsl:variable name="xml">
            <array>
                <xsl:for-each select="G_1">
                    <map>
                        <string key="Item_Number">
                            <xsl:value-of select="ITEMNUMBER"/>
                        </string>
                        <string key="Item_Description">
                            <xsl:value-of select="ITEMDESCRIPTION"/>
                        </string>
                    </map>
                </xsl:for-each>
            </array>
        </xsl:variable>
        <!-- OUTPUT -->
        <xsl:value-of select="xml-to-json($xml)"/>
    </xsl:template>

</xsl:stylesheet>
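
If indented output like the sample in the question is wanted, xml-to-json() also accepts an options map as a second argument. A possible variation of the output line, assuming the same $xml variable as above:

```xml
<!-- 'indent' is an option defined for fn:xml-to-json in XPath 3.1 -->
<xsl:value-of select="xml-to-json($xml, map { 'indent' : true() })"/>
```

The exact whitespace produced is left to the processor, so the layout may not match the sample character for character.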

Henry H