Re: [xsl] problem: recursive templates slowing xalan processor down

Subject: Re: [xsl] problem: recursive templates slowing xalan processor down
From: Jeni Tennison <jeni@xxxxxxxxxxxxxxxx>
Date: Tue, 11 May 2004 13:30:09 +0100
Hi Chris,

> What this pretty much does is select all the substream elements in
> the xml file, then read in a length ( part of a list) and take the
> sum over these values.

I know that you've solved your speed problem, but just a few quick
things about your template:

<xsl:template name="calcPart1">
  <xsl:param name="lastSUBStreams" /> 
  <xsl:choose>
    <xsl:when test="$lastSUBStreams">
      <xsl:variable name="firstBand" select="$lastSUBStreams[1]" />
      <xsl:variable name="totalOfRest">
        <xsl:call-template name="calcPart1">
          <xsl:with-param name="lastSUBStreams"
            select="$lastSUBStreams[count( . |$firstBand) !=
                                    count ($firstBand)]" />
        </xsl:call-template>
      </xsl:variable>
      <xsl:variable name="currentSize">2</xsl:variable>
      <xsl:variable name="length">
        <xsl:value-of select="number((substring-after(
                                        normalize-space(
                                          $firstBand/m:SUBSTREAM_interior),
                                        ' ')))" />
      </xsl:variable>
      <xsl:variable name="l">
        <xsl:if test="$length > $bitrate">
          <xsl:value-of select="$bitrate" />
        </xsl:if>
        <xsl:if test="$length <=$bitrate">
          <xsl:value-of select="$length" />
        </xsl:if>
      </xsl:variable>
      <xsl:value-of select="$totalOfRest + $currentSize+$l" />
    </xsl:when>
    <xsl:otherwise>0</xsl:otherwise>
  </xsl:choose>
</xsl:template>

First, you're selecting the "rest" of the $lastSUBStreams using the
XPath:

  $lastSUBStreams[count( . |$firstBand) != count ($firstBand)]

This selects all the $lastSUBStreams that aren't the $firstBand, by
cycling through each of them and testing whether a union between it
and $firstBand gives a node set containing the same number of nodes as
held by $firstBand. $firstBand is set using:

  $lastSUBStreams[1]

It always holds one node, namely the first node in the $lastSUBStreams
node-set. No other nodes aside from the first node can be that node,
since a node-set can't contain duplicates. A much more efficient way
of getting the "rest" of the $lastSUBStreams would be to just get
those whose position is greater than 1:

  $lastSUBStreams[position() > 1]

This is more efficient because it doesn't involve making lots of
node-sets and counting their length, and it means that the processor
could build the node-set lazily.

Second, you're doing a lot of setting variables using the content of
the <xsl:variable> element. When you're setting a variable to the
result of calling a template, like you are for the $totalOfRest
variable, you don't have any choice, but if you're just using an
XPath, such as in:

  <xsl:variable name="length">
    <xsl:value-of select="number((substring-after(
                                    normalize-space(
                                      $firstBand/m:SUBSTREAM_interior),
                                    ' ')))" />
  </xsl:variable>

then it's more efficient to use the select attribute. When you use the
content of an <xsl:variable> you create a small node tree (a result
tree fragment), and creating trees is a lot more work for an XSLT
processor than calculating values. So instead use:

  <xsl:variable name="length"
    select="number(substring-after(
                     normalize-space($firstBand/m:SUBSTREAM_interior),
                     ' '))" />

Third, if you have two <xsl:if>s that are exclusive (only one of them
can be true) then it's more efficient to use an <xsl:choose>, since
that way the XSLT processor only has to make as many tests as
necessary before one is found that's true, rather than testing every
<xsl:if>, even those that can't be true (since a prior <xsl:if> was
true). So instead of:

        <xsl:if test="$length > $bitrate">
          <xsl:value-of select="$bitrate" />
        </xsl:if>
        <xsl:if test="$length <=$bitrate">
          <xsl:value-of select="$length" />
        </xsl:if>

use:

  <xsl:choose>
    <xsl:when test="$length > $bitrate">
      <xsl:value-of select="$bitrate" />
    </xsl:when>
    <xsl:otherwise>
      <xsl:value-of select="$length" />
    </xsl:otherwise>
  </xsl:choose>

Finally, and this is the most important rewrite, it's much better to
write a recursive template as a tail-recursive template if you can (or
as a divide-and-conquer template if there are hundreds of recursions
involved each time the template is called). A tail-recursive template
calls itself as the very last thing that it does. You need to have an
extra parameter to keep track of the current sum, and update this
parameter on each recursion. Here's what your template looks like when
it's rewritten using the advice above and as a tail-recursive
template:

<xsl:template name="calcPart1">
  <xsl:param name="lastSUBStreams" />
  <xsl:param name="total" select="0" />
  <xsl:choose>
    <xsl:when test="$lastSUBStreams">
      <xsl:variable name="firstBand" select="$lastSUBStreams[1]" />
      <xsl:variable name="currentSize" select="2" />
      <xsl:variable name="length"
        select="number(substring-after(
                        normalize-space($firstBand/m:SUBSTREAM_interior),
                        ' '))" />
      <xsl:variable name="l">
        <xsl:choose>
          <xsl:when test="$length > $bitrate">
            <xsl:value-of select="$bitrate" />
          </xsl:when>
          <xsl:otherwise>
            <xsl:value-of select="$length" />
          </xsl:otherwise>
        </xsl:choose>
      </xsl:variable>
      <xsl:call-template name="calcPart1">
        <xsl:with-param name="lastSUBStreams"
                        select="$lastSUBStreams[position() > 1]" />
        <xsl:with-param name="total"
                        select="$total + $currentSize + $l" />
      </xsl:call-template>
    </xsl:when>
    <xsl:otherwise>
      <xsl:value-of select="$total" />
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>

Actually, I doubt that any of the above will make a significant
difference if you're dealing with only 15 recursions, but it's all
good practice :)

Cheers,

Jeni

---
Jeni Tennison
http://www.jenitennison.com/

Current Thread