RE: [xsl] Very slow Xsl (just started in xsl)

Subject: RE: [xsl] Very slow Xsl (just started in xsl)
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Thu, 5 Jul 2007 22:27:52 +0100
> 	<xsl:template name="WriteData">
> 	     <xsl:for-each select="//data/*">
>                   <xsl:variable name="rij" select="position()"/>
> 	              <Row>
> 		           <xsl:for-each select="//layout/field">

I find it a little hard to believe that for every data element in the
document you want to process every field element in the document. What
exactly is the relationship between the two? I would expect to see some kind
of join condition. Clearly the time to process these nested loops is going
to depend on the product of the number of data elements and the number of
field elements: that is, it's O(n*m). 

>                        <xsl:sort select="vldidx" order="ascending"
> data-type="number"/>
>                        <xsl:if test="vldidx != 0">

If you don't want to process these elements then it's better to eliminate
them before sorting them:

      <xsl:for-each select="//layout/field[vldidx != 0]">
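
The xsl:sort stays where it is, as the first child of the xsl:for-each;
only the select expression changes, and the xsl:if becomes redundant. A
minimal sketch, assuming a layout element whose field children each have
a vldidx child (the fields/field output elements are just placeholders
for illustration):

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <!-- "fields" and "field" are hypothetical output names -->
    <fields>
      <!-- The predicate discards unwanted fields before the sort runs -->
      <xsl:for-each select="//layout/field[vldidx != 0]">
        <xsl:sort select="vldidx" order="ascending" data-type="number"/>
        <field vldidx="{vldidx}"/>
      </xsl:for-each>
    </fields>
  </xsl:template>
</xsl:stylesheet>
```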

>                             <xsl:call-template name="GetFieldByVldidx">
> 
> 	<xsl:template name="GetFieldByVldidx">
>             <xsl:param name="rij"/>
>             <xsl:param name="col"/>
>             <xsl:param name="decimalen"/>
>             <xsl:param name="type"/>
> 
>             <xsl:variable name="poscol">
>                  <xsl:for-each select="//layout/field">
>                      <xsl:if test="vldidx = $col">
>                           <xsl:value-of select="position()"/>
>                      </xsl:if>
>                  </xsl:for-each>
>             </xsl:variable>

You're calling this template n*m times where n is the number of data
elements (or rather, the number of children of data elements) and m is the
number of fields. Now you go into another loop that selects all the fields,
so we're up to n*m*m. And what is this loop doing? It's finding the position
of the field whose vldidx is equal to $col. But in calling the template,
$col was set to the vldidx of the context node. So you are searching the
whole document to find the context node. But you know what the context node
is already - you can refer to it as ".". 

But now you are setting poscol not to the context node itself, but to a
number: its position among the //layout/field elements.
> 
>             <xsl:for-each select="//data/*[position() = $rij]">
>                 <xsl:for-each select="*[position() = $poscol]">

So now you search the whole document for the children of data elements at
position $rij, and then you search its children for something at position
$poscol. If this works it's a miracle because the meaning of position() in
these two contexts is completely different.

What you seem to have completely missed in all of this is that you can set a
variable to a node, for example

<xsl:variable name="p" select="."/>

and then you can refer to this node directly. You don't have to remember its
position as a number and then search the document to find it again.
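
As an illustration only, here is a sketch of the same nested loop with
the row held in a variable. It rests on a guess about your data that I
cannot confirm: that the k-th child of each row corresponds to the k-th
field element, and that all the field elements are siblings inside a
single layout element. The Rows and Cell element names are invented for
the example.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <!-- Select the wanted fields once, not once per row -->
    <xsl:variable name="fields" select="//layout/field[vldidx != 0]"/>
    <Rows>
      <xsl:for-each select="//data/*">
        <!-- Hold the row itself, not its position -->
        <xsl:variable name="row" select="."/>
        <Row>
          <xsl:for-each select="$fields">
            <xsl:sort select="vldidx" data-type="number"/>
            <!-- ASSUMPTION: the field's place among its sibling fields
                 is the column number within the row -->
            <xsl:variable name="poscol"
                          select="count(preceding-sibling::field) + 1"/>
            <Cell>
              <xsl:value-of select="$row/*[$poscol]"/>
            </Cell>
          </xsl:for-each>
        </Row>
      </xsl:for-each>
    </Rows>
  </xsl:template>
</xsl:stylesheet>
```

There is no named template, no search for the context node, and no
second pass over the document to find the row again. The work is still
proportional to n*m, but no longer to n*m*m.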

I'm afraid without knowing more about what you are trying to do, I can't
help you improve either the correctness or the performance of this code, but
the reason it's impossibly slow is very clear - I hope it's now clear to you
as well.

Michael Kay
http://www.saxonica.com/
