This is more of a request for clarification.

According to this answer to another question, XSLT variables are cheap. My question is: does that statement hold in all scenarios? Short-lived variables that are created and destroyed within four lines of code don't bother me, but loading the root node or child entities into a variable is, in my opinion, bad practice.

I have two XSLT files, designed for the same input and output requirements:

XSLT1 (without unnecessary variable):

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" exclude-result-prefixes="xs">
    <xsl:output method="xml" encoding="UTF-8" indent="yes"/>
    <xsl:template match="/">
        <Collection>
            <xsl:for-each select="CATALOG/CD">
                <DVD>
                    <Cover>
                        <xsl:value-of select="string(TITLE)"/>
                    </Cover>
                    <Author>
                        <xsl:value-of select="string(ARTIST)"/>
                    </Author>
                    <BelongsTo>
                        <xsl:value-of select="concat(concat(string(COUNTRY), ' '), string(COMPANY))"/>
                    </BelongsTo>
                    <SponsoredBy>
                        <xsl:value-of select="string(COMPANY)"/>
                    </SponsoredBy>
                    <Price>
                        <xsl:value-of select="string(number(string(PRICE)))"/>
                    </Price>
                    <Year>
                        <xsl:value-of select="string(floor(number(string(YEAR))))"/>
                    </Year>
                </DVD>
            </xsl:for-each>
        </Collection>
    </xsl:template>
</xsl:stylesheet>

XSLT2 (with unnecessary variable "root" in which whole XML is loaded):

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" exclude-result-prefixes="xs">
    <xsl:output method="xml" encoding="UTF-8" indent="yes"/>
    <xsl:template match="/">
        <xsl:variable name="root" select="."/>
        <Collection>
            <xsl:for-each select="$root/CATALOG/CD">
                <DVD>
                    <Cover>
                        <xsl:value-of select="string(TITLE)"/>
                    </Cover>
                    <Author>
                        <xsl:value-of select="string(ARTIST)"/>
                    </Author>
                    <BelongsTo>
                        <xsl:value-of select="concat(concat(string(COUNTRY), ' '), string(COMPANY))"/>
                    </BelongsTo>
                    <SponsoredBy>
                        <xsl:value-of select="string(COMPANY)"/>
                    </SponsoredBy>
                    <Price>
                        <xsl:value-of select="string(number(string(PRICE)))"/>
                    </Price>
                    <Year>
                        <xsl:value-of select="string(floor(number(string(YEAR))))"/>
                    </Year>
                </DVD>
            </xsl:for-each>
        </Collection>
    </xsl:template>
</xsl:stylesheet>

Approach 2 is what our live system actually uses, where the XML ranges from several KB to a few MB, and the XSLT files use variables for child entities as well.
Before I propose changing the approach, I need to verify the theory behind it.

As I understand it, with approach 2 the system reloads the XML data in memory over and over (and it gets worse when multiple variables are used to load child entities), which slows down the transformation.

Before posting this question I timed both XSLTs. Approach 1 takes a few milliseconds less than approach 2. (I ran the two XSL files against separate copies of the XML to keep the system cache out of the picture.) Even so, the system cache might still be muddying the results.

Despite this analysis, I still have a question: do we really need to avoid using variables? And as far as my system is concerned, is it worth modifying the live XSLT files to use approach 1?

Or are XSLT variables different from variables in other programming languages (in case I'm not aware of it)? For example, maybe an XSLT variable doesn't actually store the data when you write select="." but merely points to it, or something along those lines, in which case I can keep using XSLT variables without hesitation.

What is your suggestion on this?

Quick info on the current system:

  1. Host Programming Language or System: Siebel (C++ is the backend code)
  2. XSLT Processor: Xalan (unless Saxon is used explicitly)
Enthusiastic

1 Answer

I agree with the comments made that you need to measure performance with your particular XSLT processor.

But your expectation that with "approach-2, system is reloading the XML data over and over in memory" seems wrong to me. The XSLT processor builds an input tree of the primary input XML document anyway, and I can't imagine that any implementation, on seeing <xsl:variable name="root" select="."/>, does anything like loading the document completely again. That would even be wrong, as node identity and generate-id would then not work. The variable will simply keep a reference to the document node of the existing input tree.
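
The node-identity argument can be checked with a small stylesheet. This is only a sketch, not a statement about Xalan's internals: it binds the document node to $root exactly as XSLT2 does and then compares it with "/" in two ways. It can be run against any input document.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>
    <xsl:template match="/">
        <!-- Bind the document node to a variable, as in XSLT2 above. -->
        <xsl:variable name="root" select="."/>
        <!-- generate-id() must return the same string for the same node. -->
        <xsl:value-of select="generate-id($root) = generate-id(/)"/>
        <xsl:text>&#10;</xsl:text>
        <!-- A union drops duplicates by node identity, so a node united with itself counts as one node. -->
        <xsl:value-of select="count($root | /)"/>
        <xsl:text>&#10;</xsl:text>
    </xsl:template>
</xsl:stylesheet>
```

A conformant XSLT 1.0 processor should print true and then 1. If the variable held a second copy of the document, the ids would differ and the count would be 2.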

Of course, in your sample, where you have a single input document and a single template whose current node is the document node anyway, the variable is superfluous. But there are cases where you need to store the document node of the primary input document, in particular when you deal with multiple documents.
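
As an illustration of the multiple-document case, here is a minimal sketch. The second document artists.xml and its ARTISTS/ARTIST structure are hypothetical, made up for this example; only CATALOG/CD, TITLE and ARTIST come from your input. With a string argument, document() resolves the relative URI against the stylesheet's location.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" encoding="UTF-8" indent="yes"/>
    <!-- Global variable: at the top level, "/" is the document node of the primary input. -->
    <xsl:variable name="root" select="/"/>
    <xsl:template match="/">
        <Collection>
            <!-- Hypothetical second document listing the artists to keep. -->
            <xsl:for-each select="document('artists.xml')/ARTISTS/ARTIST">
                <xsl:variable name="name" select="string(.)"/>
                <!-- Inside this loop "/" would mean the root of artists.xml,
                     so the primary input has to be reached through $root. -->
                <xsl:for-each select="$root/CATALOG/CD[ARTIST = $name]">
                    <DVD>
                        <Cover><xsl:value-of select="TITLE"/></Cover>
                        <Author><xsl:value-of select="ARTIST"/></Author>
                    </DVD>
                </xsl:for-each>
            </xsl:for-each>
        </Collection>
    </xsl:template>
</xsl:stylesheet>
```

Here the variable is not a convenience: once the context node belongs to the second document, an absolute path such as /CATALOG/CD would no longer select anything from the primary input.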

Martin Honnen
  • okay .. so is there a reason why the timer always returned a higher value for the one with the variable than for the other? I tried everything I could think of to rule out the system cache (different XMLs of the same size but with different data and names for the two XSLTs, and different test orders), yet approach 1 was faster than approach 2 by a few milliseconds – Enthusiastic Jul 28 '13 at 14:38
  • Someone with knowledge of the Xalan code base might be able to explain the difference or a profiler for the code might tell you where time is consumed. – Martin Honnen Jul 28 '13 at 15:15