7

I want to transform an XML document into HTML. Some XML elements have links to other documents, like:

<link href="1.html">

In the HTML output, I want to get:

<a href="1.html&no_cache={unique_id}">

How can I generate this unique, fairly large ID?

Nawa

4 Answers

6

To start with, I assume that for some reason you cannot use the link's absolute URL as the required UID, which would be the simplest and most natural solution.

In case my assumption is correct, then:

This is an easy task for XSLT.

Because the OP wants the generated ids to be the same when the transformation is performed several times, it isn't appropriate to use the generate-id() function.

Here is one simple way of producing stable ids:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:template match="node()|@*">
     <xsl:copy>
       <xsl:apply-templates select="node()|@*"/>
     </xsl:copy>
 </xsl:template>

 <xsl:template match="link[@href]">
  <xsl:variable name="vUid">
    <xsl:number level="any" count="link[@href]"/>
  </xsl:variable>
   <a href="{@href}&amp;no_cache={{{$vUid}}}"/>
 </xsl:template>
</xsl:stylesheet>

When this transformation is applied to the following XML document (regardless of how many times):

<t>
 <link href="1.html"/>
 <a>
   <link href="2.html"/>
  <b>
    <link href="3.html"/>
    <c>
     <link href="4.html"/>
    </c>
    <link href="5.html"/>
  </b>
  <link href="6.html"/>
  <d>
   <link href="7.html"/>
  </d>
 </a>
 <link href="8.html"/>
 <e>
  <link href="9.html"/>
 </e>
 <link href="10.html"/>
</t>

the same wanted, correct result is produced every time:

<t>
   <a href="1.html&amp;no_cache={1}"/>
   <a>
      <a href="2.html&amp;no_cache={2}"/>
      <b>
         <a href="3.html&amp;no_cache={3}"/>
         <c>
            <a href="4.html&amp;no_cache={4}"/>
         </c>
         <a href="5.html&amp;no_cache={5}"/>
      </b>
      <a href="6.html&amp;no_cache={6}"/>
      <d>
         <a href="7.html&amp;no_cache={7}"/>
      </d>
   </a>
   <a href="8.html&amp;no_cache={8}"/>
   <e>
      <a href="9.html&amp;no_cache={9}"/>
   </e>
   <a href="10.html&amp;no_cache={10}"/>
</t>

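Do note the use of <xsl:number> to produce the id: with level="any", each link[@href] element is numbered by its position in document order among all such elements, so the numbering depends only on the input document.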

If the same link can occur several times in the document and we need all occurrences to use the same id, here is the solution for this problem:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:key name="kHrefByVal" match="link/@href" use="."/>

 <xsl:variable name="vUniqHrefs" select=
  "//link/@href
       [generate-id()
       =
        generate-id(key('kHrefByVal',.)[1])
       ]
  "/>


 <xsl:template match="node()|@*">
     <xsl:copy>
       <xsl:apply-templates select="node()|@*"/>
     </xsl:copy>
 </xsl:template>

 <xsl:template match="link[@href]">
  <xsl:variable name="vthisHref" select="@href"/>

  <xsl:variable name="vUid">
   <xsl:for-each select="$vUniqHrefs">
    <xsl:if test=". = $vthisHref">
     <xsl:value-of select="position()"/>
    </xsl:if>
   </xsl:for-each>
  </xsl:variable>
   <a href="{@href}&amp;no_cache={{{$vUid}}}"/>
 </xsl:template>
</xsl:stylesheet>

When this transformation is applied to the following XML document:

<t>
 <link href="1.html"/>
 <a>
   <link href="2.html"/>
  <b>
    <link href="1.html"/>
    <c>
     <link href="3.html"/>
    </c>
    <link href="2.html"/>
  </b>
  <link href="1.html"/>
  <d>
   <link href="3.html"/>
  </d>
 </a>
 <link href="4.html"/>
 <e>
  <link href="2.html"/>
 </e>
 <link href="4.html"/>
</t>

the wanted, correct result is produced:

<t>
   <a href="1.html&amp;no_cache={1}"/>
   <a>
      <a href="2.html&amp;no_cache={2}"/>
      <b>
         <a href="1.html&amp;no_cache={1}"/>
         <c>
            <a href="3.html&amp;no_cache={3}"/>
         </c>
         <a href="2.html&amp;no_cache={2}"/>
      </b>
      <a href="1.html&amp;no_cache={1}"/>
      <d>
         <a href="3.html&amp;no_cache={3}"/>
      </d>
   </a>
   <a href="4.html&amp;no_cache={4}"/>
   <e>
      <a href="2.html&amp;no_cache={2}"/>
   </e>
   <a href="4.html&amp;no_cache={4}"/>
</t>
Dimitre Novatchev
  • +1 Good answer. In the second you could use `` for better performance. –  Dec 22 '10 at 23:35
3

Try generate-id().

<xsl:value-of select="generate-id(.)"/>

Here is a further explanation: http://www.w3schools.com/XSL/func_generateid.asp
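
A minimal sketch of where the call might go, assuming the `link` elements from the question. The exact id strings are chosen by the processor and are only guaranteed to be unique within a single transformation run, so they may or may not change between runs.

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <!-- Replace each link with an anchor. generate-id() with no argument
      returns an id for the context node, i.e. the current link element. -->
 <xsl:template match="link[@href]">
   <a href="{@href}&amp;no_cache={generate-id()}"/>
 </xsl:template>
</xsl:stylesheet>
```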

Stephan Schinkel
  • generate-id(.) does not generate unique IDs if I process this XML several times. I want to have a unique ID each time – Nawa Dec 22 '10 at 13:47
  • @Nawa: The `generate-id()` function guarantees a unique identifier for each node in the input source **during the same transformation**. If you want an identifier that is unique across all runs, then you need to implement some algorithm like MD5 –  Dec 22 '10 at 13:54
  • Hi Nawa. It should generate new IDs every time you call the transformation, but AFAIK there is no standard for the generation of these IDs. The W3C just says this ID is unique over all nodes in the current transformation. Your best bet for truly unique IDs would be something like calling .NET methods within your XSL transformation and returning Guid.NewGuid() from within the .NET assembly. Alternatively, just transform to #UNIQUEID# (static text) and afterwards replace every occurrence of #UNIQUEID# with a unique identifier, using a language of your choice. – Stephan Schinkel Dec 22 '10 at 13:57
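
The placeholder idea from the last comment could look roughly like this on the XSLT side. This is only a sketch: `#UNIQUEID#` is the static marker named in the comment, and the later search-and-replace step runs outside XSLT and is not shown.

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <!-- Emit a fixed marker instead of an id. A separate post-processing
      step is assumed to replace every #UNIQUEID# with a fresh value. -->
 <xsl:template match="link[@href]">
   <a href="{@href}&amp;no_cache=#UNIQUEID#"/>
 </xsl:template>
</xsl:stylesheet>
```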
2

It's not possible with pure XSLT, but some alternative options might be:

  1. Add an extension namespace so that you can call out to non-XSLT code: <a href="1.html&amp;no_cache={myns:unique_id()}">. This will give you the result you're after, but it depends on support from the framework you're using to perform the transformation.
  2. Use JavaScript to add the unique ID to the links on the client. Only works if your client has JavaScript enabled, but may be an acceptable compromise if you know this will be the case.
  3. Set the HTTP headers on your pages to prevent caching. Probably the best option from a semantic point of view, and you won't run the risk of search engines repeatedly crawling your page with each unique ID.
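
As a hedged sketch of the first option: the namespace URI `urn:example:myns` is a placeholder, and `myns:unique_id()` is a hypothetical extension function that the host framework would have to register itself. The standard `function-available()` check lets the stylesheet fall back to `generate-id()` when no such function is bound.

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:myns="urn:example:myns"
 exclude-result-prefixes="myns">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:template match="link[@href]">
  <xsl:variable name="vUid">
   <xsl:choose>
    <!-- Use the (hypothetical) extension function only if the
         processor actually has an implementation for it. -->
    <xsl:when test="function-available('myns:unique_id')">
     <xsl:value-of select="myns:unique_id()"/>
    </xsl:when>
    <!-- Fallback: an id that is unique within this run only. -->
    <xsl:otherwise>
     <xsl:value-of select="generate-id()"/>
    </xsl:otherwise>
   </xsl:choose>
  </xsl:variable>
  <a href="{@href}&amp;no_cache={$vUid}"/>
 </xsl:template>
</xsl:stylesheet>
```

How the function gets registered differs per processor, so that part is deliberately left out.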
MarkXA
0

XSLT is a functional language, which means that for a given input it will always produce the same output, so by definition a GUID method or any other random generator would not be part of the design spec. Your best bet if you're client bound is to use a time-related method as part of a pseudo-random seed for generate-id. However, as your goal appears to be strong decaching, you should abandon this and just focus on applying the correct anti-cache headers to the resources you're trying to protect.
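
One possible reading of the time-related idea, offered as an assumption rather than what this answer necessarily had in mind: keep the stylesheet deterministic and let the calling code pass in a value that changes on every run. The parameter name `pStamp` is hypothetical, and how it is supplied depends on the processor being used.

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <!-- Hypothetical parameter: the caller is assumed to pass a fresh
      value (for example a timestamp) on every run. Defaults to '0'. -->
 <xsl:param name="pStamp" select="'0'"/>

 <xsl:template match="link[@href]">
  <!-- Position of this link among all links, in document order. -->
  <xsl:variable name="vSeq">
    <xsl:number level="any" count="link[@href]"/>
  </xsl:variable>
  <a href="{@href}&amp;no_cache={$pStamp}-{$vSeq}"/>
 </xsl:template>
</xsl:stylesheet>
```

For the same input and the same parameter value the output is still identical, which keeps the transformation a pure function of its inputs.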

annakata