
Our legacy project uses libxslt, which implements XSLT 1.0. We now need to generate a UUID from the XSLT file so that our output XML file contains it.

As per https://stackoverflow.com/a/8127174/3747770 I am out of luck.

Also, according to https://gist.github.com/azinneera/778f69ae6b0049b5edcd69da70072405, a UUID can be generated, but only with XSLT 2.0.

I am new to XSLT. Is there any way to convert the stylesheet at https://gist.github.com/azinneera/778f69ae6b0049b5edcd69da70072405 from version 2.0 to 1.0, or is there any other way to generate a UUID using XSLT 1.0?

NJMR
  • See if this helps; https://stackoverflow.com/a/25869149/3016153 – michael.hor257k Jan 31 '20 at 07:24
  • _"I am out of luck."_ Not quite. If by UUID you mean RFC4122 or ITU-T Rec. X.667, then there is no built-in support for that in XSLT. The well-known mechanism for extending XSLT is the use of extension functions. Almost every XSLT processor exposes an API that allows extension function registration. Use that with a well-trusted UUID library of your choice. – Alejandro Jan 31 '20 at 20:54
  • This article describes how to generate random numbers, and looks like xslt 1.0. But I think you should have already seen it, what is wrong about it? http://fxsl.sourceforge.net/articles/Random/Casting%20the%20Dice%20with%20FXSL-htm.htm – Sohail Feb 03 '20 at 05:57
  • @Sohail That generates a random number; I need to generate a UUID v4 based on https://github.com/wmo-im/iwxxm/issues/31#issuecomment-342307281 – NJMR Feb 03 '20 at 06:39
  • @NJMR You cannot generate a UUID v4 in pure XSLT 1.0, because XSLT 1.0 cannot generate a random number and the [specification](https://www.ietf.org/rfc/rfc4122.txt) states that *"The version 4 UUID is meant for generating UUIDs from truly-random or pseudo-random numbers."* If you're using libxslt, you can use the EXSLT `math:random()` extension function to generate a random number and proceed from there as stated in section 4.4 of the specification. Do note that the stylesheet you seek to convert is meant to generate a version 1 UUID, based on the current timestamp. – michael.hor257k Feb 03 '20 at 15:12
  • Setting up a C++ environment with external libraries (`boost/uuid` is OK, but `libxslt` is complex) looks like too much. But I will help by showing you how easy the extension function approach is with this [python example](https://repl.it/repls/JollyFumblingFinance). Do note that Python uses libxslt under the hood, so the API is almost the same. – Alejandro Feb 04 '20 at 16:46

2 Answers


As I have stated in the comment to your question, if you are using the libxslt processor, you can use the EXSLT math:random() extension function to generate a sequence of random numbers that will eventually form a version 4 UUID.

Here's an implementation example:

XSLT 1.0 + EXSLT

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:math="http://exslt.org/math"
xmlns:func="http://exslt.org/functions"
xmlns:my="www.example.com/my"
extension-element-prefixes="func math my">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>

<func:function name="my:UUID4">
    <!-- https://www.ietf.org/rfc/rfc4122.txt -->
    <func:result>
        <!-- 8 -->
        <xsl:value-of select="substring('0123456789abcdef', floor(16*math:random()) + 1, 1)" />
        <xsl:value-of select="substring('0123456789abcdef', floor(16*math:random()) + 1, 1)" />
        <xsl:value-of select="substring('0123456789abcdef', floor(16*math:random()) + 1, 1)" />
        <xsl:value-of select="substring('0123456789abcdef', floor(16*math:random()) + 1, 1)" />
        <xsl:value-of select="substring('0123456789abcdef', floor(16*math:random()) + 1, 1)" />
        <xsl:value-of select="substring('0123456789abcdef', floor(16*math:random()) + 1, 1)" />
        <xsl:value-of select="substring('0123456789abcdef', floor(16*math:random()) + 1, 1)" />
        <xsl:value-of select="substring('0123456789abcdef', floor(16*math:random()) + 1, 1)" />
        <xsl:text>-</xsl:text>      
        <!-- 4 -->
        <xsl:value-of select="substring('0123456789abcdef', floor(16*math:random()) + 1, 1)" />
        <xsl:value-of select="substring('0123456789abcdef', floor(16*math:random()) + 1, 1)" />
        <xsl:value-of select="substring('0123456789abcdef', floor(16*math:random()) + 1, 1)" />
        <xsl:value-of select="substring('0123456789abcdef', floor(16*math:random()) + 1, 1)" />
        <!-- version identifier -->
        <xsl:text>-4</xsl:text>     
        <!-- 3 -->
        <xsl:value-of select="substring('0123456789abcdef', floor(16*math:random()) + 1, 1)" />
        <xsl:value-of select="substring('0123456789abcdef', floor(16*math:random()) + 1, 1)" />
        <xsl:value-of select="substring('0123456789abcdef', floor(16*math:random()) + 1, 1)" />
        <xsl:text>-</xsl:text>      
        <!-- 1* -->
        <xsl:value-of select="substring('89ab', floor(4*math:random()) + 1, 1)" />
        <!-- 3 -->
        <xsl:value-of select="substring('0123456789abcdef', floor(16*math:random()) + 1, 1)" />
        <xsl:value-of select="substring('0123456789abcdef', floor(16*math:random()) + 1, 1)" />
        <xsl:value-of select="substring('0123456789abcdef', floor(16*math:random()) + 1, 1)" />
        <xsl:text>-</xsl:text>      
        <!-- 12 -->
        <xsl:value-of select="substring('0123456789abcdef', floor(16*math:random()) + 1, 1)" />
        <xsl:value-of select="substring('0123456789abcdef', floor(16*math:random()) + 1, 1)" />
        <xsl:value-of select="substring('0123456789abcdef', floor(16*math:random()) + 1, 1)" />
        <xsl:value-of select="substring('0123456789abcdef', floor(16*math:random()) + 1, 1)" />
        <xsl:value-of select="substring('0123456789abcdef', floor(16*math:random()) + 1, 1)" />
        <xsl:value-of select="substring('0123456789abcdef', floor(16*math:random()) + 1, 1)" />
        <xsl:value-of select="substring('0123456789abcdef', floor(16*math:random()) + 1, 1)" />
        <xsl:value-of select="substring('0123456789abcdef', floor(16*math:random()) + 1, 1)" />
        <xsl:value-of select="substring('0123456789abcdef', floor(16*math:random()) + 1, 1)" />
        <xsl:value-of select="substring('0123456789abcdef', floor(16*math:random()) + 1, 1)" />
        <xsl:value-of select="substring('0123456789abcdef', floor(16*math:random()) + 1, 1)" />
        <xsl:value-of select="substring('0123456789abcdef', floor(16*math:random()) + 1, 1)" /> 
    </func:result>
</func:function>

<xsl:template match="/items">
    <output>
        <xsl:for-each select="item">
            <item id="{my:UUID4()}">
                <xsl:value-of select="." />
            </item>
        </xsl:for-each>
    </output>
</xsl:template>

</xsl:stylesheet>

When applied to the following input:

XML

<items>
    <item>1</item>
    <item>2</item>
    <item>3</item>
    <item>4</item>
    <item>6</item>
    <item>7</item>
    <item>8</item>
    <item>9</item>
</items>

I got the following result:

Result 1

<?xml version="1.0" encoding="UTF-8"?>
<output>
  <item id="77587d4c-1ef6-4aaf-9f97-398dee70fa25">1</item>
  <item id="148e4218-c881-41d3-af61-cab4b5d0251f">2</item>
  <item id="3a02b568-3200-46ff-993c-3bea9724d6ce">3</item>
  <item id="28de29bd-39f4-4eed-979a-765c290652a1">4</item>
  <item id="7c767fa7-c0b7-4187-9f86-d3876ec1be8a">6</item>
  <item id="aca2261f-e837-4a2d-a555-0c81b2c7f7a2">7</item>
  <item id="b7ecb7bd-8c3e-475d-ba17-4c62c1c3d90b">8</item>
  <item id="d28f95e8-452c-474f-9c9a-11e09cd948ae">9</item>
</output>

Subsequent runs produced:

Result 2

<?xml version="1.0" encoding="UTF-8"?>
<output>
  <item id="6eb63a8e-599d-450a-8970-a758b73aa121">1</item>
  <item id="86b247bf-81c8-47ce-9375-4a35e44fcde7">2</item>
  <item id="cbc04786-9e90-4331-a9d3-47955c7d5a99">3</item>
  <item id="9f82f8d0-9934-499e-8783-61087ebce2f7">4</item>
  <item id="5b77da5b-f28f-45a7-82f4-a47b6b1aa7b2">6</item>
  <item id="7eab11bc-209f-4100-b4e6-1cc0f73beda0">7</item>
  <item id="7f4151f4-6166-4406-9ee4-e7de325537d0">8</item>
  <item id="2185c4b8-6a74-4b97-93b4-872b2c0e1f5e">9</item>
</output>

Result 3

<?xml version="1.0" encoding="UTF-8"?>
<output>
  <item id="784b9cd0-a77a-4719-ad0b-183a970b6785">1</item>
  <item id="4dbed80b-4c82-4dde-8a0a-8b29471bdbbf">2</item>
  <item id="0297ad52-3070-4b6a-a28b-a9c7c4607027">3</item>
  <item id="8e402219-3fbf-4025-827b-c95ae4e12ea0">4</item>
  <item id="140c8fad-2d93-4b77-b548-5a150f350d81">6</item>
  <item id="5ca365ac-43dd-41fa-9fa7-6237971576aa">7</item>
  <item id="6ac7bb94-88cd-442e-8c3b-933ca3d53fb5">8</item>
  <item id="3cc77134-77ee-4405-bf33-92e6dc7bfdc1">9</item>
</output>

and so on.
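For readers who want to check the layout outside XSLT, here is a minimal Python sketch of the same digit-by-digit construction (RFC 4122, section 4.4): random hex digits everywhere except the fixed version digit `4` and a variant digit drawn from `89ab`. It does not use libxslt or `math:random()`; the names `uuid4_like_stylesheet`, `is_uuid4` and `UUID4_RE` are made up for this illustration, and `secrets.randbelow` is simply a stand-in random source.

```python
import re
import secrets

HEX = "0123456789abcdef"

# Shape of a version 4 UUID: version digit "4", variant digit one of 8, 9, a, b.
UUID4_RE = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}"
)


def uuid4_like_stylesheet(rand_below=secrets.randbelow):
    """Build a version 4 UUID string one hex digit at a time.

    rand_below(n) must return an integer in the range 0..n-1; it plays the
    role that floor(n * math:random()) plays in the stylesheet.
    """
    def digits(count):
        return "".join(HEX[rand_below(16)] for _ in range(count))

    variant = "89ab"[rand_below(4)]  # top two bits of this digit are fixed to 10
    return f"{digits(8)}-{digits(4)}-4{digits(3)}-{variant}{digits(3)}-{digits(12)}"


def is_uuid4(text):
    """Return True if text has the exact shape of a lowercase version 4 UUID."""
    return UUID4_RE.fullmatch(text) is not None


if __name__ == "__main__":
    print(uuid4_like_stylesheet())
    # First id from "Result 1" above
    print(is_uuid4("77587d4c-1ef6-4aaf-9f97-398dee70fa25"))
```

A check like `is_uuid4` is also a cheap way to catch malformed output, such as an id that comes out shorter than 36 characters.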

michael.hor257k
  • Great effort. One thing to note is this advice in the [Saxon documentation](https://www.saxonica.com/html/documentation/functions/exslt-math/random.html): _"The function is implemented, but it doesn't work well: since it is not a pure function, Saxon sometimes optimizes it so that successive calls all return the same value"_ – Alejandro Feb 05 '20 at 22:17
  • I don't see how Saxon implementation notes are relevant to libxslt. I don't know if there is a similar documentation for the libxslt implementation. In my limited tests, some applications (e.g. AppleScript) may get a similar (but not identical) result for the very first call of the function within some time period (which I was unable to determine). In practical terms, it would mean that the first UUID in a series will always start with the same character. I wasn't able to reproduce this when calling the transformation from another application (as you can see from the results above). – michael.hor257k Feb 05 '20 at 22:53
  • @michael.hor257k: Sometimes I get a UUID with fewer characters... – NJMR Apr 07 '20 at 06:40
  • I am afraid I don't see how that's possible. Unless your processor "sometimes" generates a random value of 1 - which seems very unlikely. – michael.hor257k Apr 07 '20 at 06:54
  • I would be careful with the solution above! I used it to create a huge number of UUIDs for my use case. In the end, we found out that it causes a lot of duplicate UUIDs – BernhardS Feb 22 '23 at 12:36
  • @BernhardS What is "a huge number" and how many is "a lot"? – michael.hor257k Feb 22 '23 at 12:42
  • @michael.hor257k It was exactly 5974983 UUIDs that I generated. I have not counted the duplicates, but there were several, because we found 2 by manual search within a few minutes. Since the maximum number of combinations is 2¹²², I assume the problem lies in math:random in some way. Maybe I can reproduce it tomorrow to give you the exact number of duplicates; we have solved it for now with a Node script (randomUUID) – BernhardS Feb 22 '23 at 20:29
  • @BernhardS *"we found 2 by manual search within a few minutes."* That sounds like a problem with your system or with your implementation. – michael.hor257k Feb 22 '23 at 21:44
  • @BernhardS I have just now generated 1m UUIDs in 100 batches of 10k each. A full 96% of them were duplicates. But here's the thing: they were all coming from 2 adjacent batches being exact duplicates of each other. My conclusion is that the seed to the random function did not change in-between the calls - most likely it updates only every 1 second. I repeated the test with a 3 second pause between the calls, and this time there were no duplicates. – michael.hor257k Feb 23 '23 at 08:40
  • That sounds exactly like my problem. I know I've got a weird use case, but isn't it always? It could be okay for smaller use cases, but one should keep this in mind when using it. Thank you for your time @michael.hor257k – BernhardS Feb 23 '23 at 10:27
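
The duplicate batches described in the comments above are consistent with a generator that is seeded from a clock with one-second resolution, so two runs inside the same second replay the same sequence. The Python sketch below only illustrates that hypothesis; it does not call libxslt, and it is an assumption that libxslt's `math:random()` is seeded this way. The names `uuid4_from`, `batch` and `collision_probability`, and the seed value, are made up for the illustration. It also estimates, with the birthday bound, how unlikely a genuine collision would be among 5974983 properly random version 4 UUIDs.

```python
import random


def uuid4_from(rng):
    """Format 128 bits drawn from rng as a version 4 UUID (RFC 4122, section 4.4)."""
    bits = rng.getrandbits(128)
    bits = (bits & ~(0xF << 76)) | (0x4 << 76)  # version nibble = 4
    bits = (bits & ~(0x3 << 62)) | (0x2 << 62)  # variant bits = 10
    h = f"{bits:032x}"
    return f"{h[:8]}-{h[8:12]}-{h[12:16]}-{h[16:20]}-{h[20:]}"


def batch(seed, size):
    """One 'run': a PRNG seeded once (hypothetically from the clock), then size UUIDs."""
    rng = random.Random(seed)
    return [uuid4_from(rng) for _ in range(size)]


def collision_probability(n, bits=122):
    """Birthday-bound approximation for any duplicate among n random bits-bit values."""
    return n * (n - 1) / 2 / 2**bits


if __name__ == "__main__":
    same_second = 1677141600              # made-up clock reading, in whole seconds
    first = batch(same_second, 10_000)
    second = batch(same_second, 10_000)   # second run started within the same second
    later = batch(same_second + 3, 10_000)  # run started after a 3 second pause

    print(first == second)                # True: same seed, same sequence
    print(len(set(first) & set(later)))   # overlap between differently seeded runs
    print(collision_probability(5_974_983))
```

If this is what happens, the remedy is to avoid starting several transformations within one clock tick, or to obtain the UUIDs from an extension function backed by a proper UUID library.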

There are several solutions for XSLT 1.0 stylesheets, depending on the language and processor you use.

Let the sample XML be as below. (The sample XML was retrieved from https://www.cs.utexas.edu/~mitra/csFall2015/cs329/lectures/xml/xslplanes.2.xml.txt)

<?xml version = "1.0" encoding = "utf-8"?>
<planes xmlns="planes_from_cs_utexas_edu">
   <plane>
      <year> 1977 </year>
      <make> Cessna </make>
      <model> Skyhawk </model>
      <color> Light blue and white </color>
   </plane>
   <plane>
      <year> 1975 </year>
      <make> Piper </make>
      <model> Apache </model>
      <color> White </color>
   </plane>   
   <plane>
      <year> 1960 </year>
      <make> Cessna </make>
      <model> Centurian </model>
      <color> Yellow and white </color>
   </plane>
   <plane>
      <year> 1956 </year>
      <make> Piper </make>
      <model> Tripacer </model>
      <color> Blue </color>
   </plane>
</planes>

Since the question concerns C++, one solution is as follows.

1. Using Xalan C++ version (Seems suitable for the question)

There is example C++ code for this approach at https://xalan.apache.org/old/xalan-c/extensions.html. It implements a square-root function, but it can be adapted to create a GUID, for example with the CoCreateGuid() function on Windows or with libuuid on Linux, and to return it as an XObjectPtr after converting the GUID into a XalanDOMString.


If another language is used, the solutions can be as below.

Java/.NET (The examples below are in Java, but these approaches can also be applied in any .NET language)
1. Using Reflexive Extension Functions (Based on Saxon)
Note: This solution applies to Saxon-PE and Saxon-EE only

The XSLT can be as below, with a direct call to a static method of Java's UUID class.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:uuid="java:java.util.UUID"
xmlns:ns1="planes_from_cs_utexas_edu" 
exclude-result-prefixes="uuid">
<xsl:output method="xml" version="1.0" encoding="utf-8" indent="yes"/>
<xsl:template match="/ns1:planes">
     <planes>
         <xsl:for-each select = "ns1:plane">
          <plane>
            <year>
                <xsl:value-of select="ns1:year" />
            </year>
            <make>
                <xsl:value-of select="ns1:make" />
            </make>
            <model>
                <xsl:value-of select="ns1:model" />
            </model>
            <color>
                <xsl:value-of select="ns1:color" />
            </color>
            <uuid>
                <xsl:value-of select="uuid:randomUUID()"/>
            </uuid>
       </plane>
       </xsl:for-each>
     </planes>
    </xsl:template>
</xsl:stylesheet>

The output will be:

<?xml version="1.0" encoding="utf-8"?>
<planes xmlns:ns1="planes_from_cs_utexas_edu">
   <plane>
      <year> 1977 </year>
      <make> Cessna </make>
      <model> Skyhawk </model>
      <color> Light blue and white </color>
      <uuid>50ef735f-a1a1-46cb-a638-05966b2c2b78</uuid>
   </plane>
   <plane>
      <year> 1975 </year>
      <make> Piper </make>
      <model> Apache </model>
      <color> White </color>
      <uuid>8e9b5345-445c-4700-8191-08731c44e1e0</uuid>
   </plane>
   <plane>
      <year> 1960 </year>
      <make> Cessna </make>
      <model> Centurian </model>
      <color> Yellow and white </color>
      <uuid>01b01db9-982a-4811-a5b3-efa73a39dacd</uuid>
   </plane>
   <plane>
      <year> 1956 </year>
      <make> Piper </make>
      <model> Tripacer </model>
      <color> Blue </color>
      <uuid>3a2f7ee2-c53c-46b5-903f-39a21990aa75</uuid>
   </plane>
</planes>

2. Using Integrated Extension Functions (Based on Saxon)
Note: This solution applies to all Saxon editions

See http://saxonica.com/html/documentation/extensibility/integratedfunctions/
There is also a usage example in "Saxon-HE Integrated Extension Functions | how and where?"


C#

1. XSLT Stylesheet Scripting Using msxsl:script (Based on Microsoft processor)

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:ns1="planes_from_cs_utexas_edu" 
xmlns:msxsl="urn:schemas-microsoft-com:xslt"
xmlns:user="urn:my-scripts"
exclude-result-prefixes="msxsl user">
<msxsl:script language="C#" implements-prefix="user">  
     <![CDATA[  
     public string uuid()  
     {  
       return Guid.NewGuid().ToString(); 
     }  
      ]]>  
 </msxsl:script> 

<xsl:output method="xml" version="1.0" encoding="utf-8" indent="yes"/>
<xsl:template match="/ns1:planes">
     <planes>
         <xsl:for-each select = "ns1:plane">
          <plane>
            <xsl:copy-of select="node()"/>
            <uuid>
                <xsl:value-of select="user:uuid()"/>
            </uuid>
       </plane>
       </xsl:for-each>
     </planes>
    </xsl:template>
</xsl:stylesheet>

The output will be similar to the sample output above.

Reference: https://learn.microsoft.com/en-us/dotnet/standard/data/xml/xslt-stylesheet-scripting-using-msxsl-script

Erdem Savasci
  • Do note that `msxsl:script` is not the standard mechanism for registering extension functions. Also, the question is about libxslt, so the relevant documentation is at http://xmlsoft.org/libxslt/extensions.html#Registerin1 – Alejandro Feb 05 '20 at 21:37