RE: [xsl] Performance Question: Expensive Functions in Predicates

Subject: RE: [xsl] Performance Question: Expensive Functions in Predicates
From: "M. David Peterson" <m.david@xxxxxxxxxx>
Date: Thu, 27 May 2004 09:56:57 -0600

Hey Eliot,

The short answer to your question is that, one way or another, a Boolean
test is eventually going to be performed on your XPath expression.  The
fewer steps it takes to get to that test, the more efficient your code.
So if you can avoid conditional logic elements, you reduce the number of
steps, which in turn makes your code more efficient.  As for
xsl:apply-templates/@select compared to xsl:template/@match, both invoke
the same Boolean test, so it really shouldn't matter which one you use.
Either way the processor is going to use your XPath to narrow your XML
data down to a subset.  Whether that happens in the select or the match
attribute, it happens the same way.

Deciding which to use is a matter of deciding whether there will be
multiple subsets of data that need to be matched.  If the expression
"foo/bar[@foobar = 'yes']" is the only test that needs to be made, then
put it in the select attribute.  But if you have multiple possibilities
that need to be matched, such as "foo/bar[@foobar = 'yes']" and
"foo/bar[@foobar = 'maybe']", then break it down to "foo/bar[@foobar]"
for your select attribute and have two templates, one with
"bar[@foobar = 'yes']" and the other with "bar[@foobar = 'maybe']" as
the value of its match attribute.
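
A rough sketch of that split, for illustration only.  It assumes foo is
a child of the document element, and the result, included and review
output elements are made-up names:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed context: the document element is the parent of foo.
       Select only the bar elements that carry a foobar attribute. -->
  <xsl:template match="/*">
    <result>
      <xsl:apply-templates select="foo/bar[@foobar]"/>
    </result>
  </xsl:template>

  <!-- One template per subset; the match predicates do the branching. -->
  <xsl:template match="bar[@foobar = 'yes']">
    <included><xsl:value-of select="."/></included>
  </xsl:template>

  <xsl:template match="bar[@foobar = 'maybe']">
    <review><xsl:value-of select="."/></review>
  </xsl:template>

  <!-- Lower-priority catch-all: any other selected bar emits nothing. -->
  <xsl:template match="bar"/>

</xsl:stylesheet>
```

Without the catch-all, a bar whose foobar is neither 'yes' nor 'maybe'
would fall through to the built-in templates and its text would be
copied to the output.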

If your conditional logic is a lot more complicated than this, then you
may have no choice but to use xsl:if or xsl:choose (with xsl:when and
xsl:otherwise).  But if you can avoid it, then simply decide which case
from above, select or match, fits your needs and go with it.
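
For comparison, here is roughly what the same 'yes'/'maybe' branching
looks like when it lives inside a single template.  This is only a
sketch, and the included and review output elements are again made-up
names:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- One template for every bar; the branching moves into the body. -->
  <xsl:template match="bar">
    <xsl:choose>
      <xsl:when test="@foobar = 'yes'">
        <included><xsl:value-of select="."/></included>
      </xsl:when>
      <xsl:when test="@foobar = 'maybe'">
        <review><xsl:value-of select="."/></review>
      </xsl:when>
      <!-- Anything else produces no output. -->
      <xsl:otherwise/>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```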

Hope this helps get you to where you need to be!

Best of luck!

<M:D/>

> -----Original Message-----
> From: Eliot Kimber [mailto:ekimber@xxxxxxxxxxxxxxxxxxx]
> Sent: Thursday, May 27, 2004 9:36 AM
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: [xsl] Performance Question: Expensive Functions in Predicates
> 
> I have a general question about predicting performance in the general
> case. I know that the best answer is "try it and see" but I'm wondering
> if there's a general principle that can guide design in this particular
> case.
> 
> In most of the work we do, which is processing technical documents to
> generate various outputs, we have to do applicability checks on pretty
> much every element to see if a particular element is applicable to the
> current processing conditions (target output, national language,
> computer platform, etc.). These applicability checks are fairly
> expensive computationally because they may need to investigate any
> number of properties of an element or its ancestry, neighbors, etc. It
> may also require the use of external extension functions and so on.
> 
> My question is where, in general, is the best place to use these
> functions:
> 
> - In apply-templates specifications?
> 
> - In match specifications?
> 
> - As IF blocks within templates?
> 
> For example, I could do this:
> 
> <xsl:apply-templates select="*[util:is_applicable()]"/>
> 
> Or
> 
> <xsl:template match="foo[util:is_applicable()]">
> 
> or
> 
> <xsl:template match="foo">
>    <xsl:if test="util:is_applicable()">
>    </xsl:if>
> </xsl:template>
> 
> I think that the IF approach ensures the fewest calls but also makes the
> code more cluttered.
> 
> So I guess my question is: if not using the IF approach, would it be
> better to put the check in the apply-templates select or the match or
> does it matter or is it entirely a function of how a given XSLT
> implementation does its optimization?
> 
> Another option of course is to do the applicability processing as a
> separate step so that the base processing templates don't have to care
> about applicability. That would ensure that each element is only
> processed once for applicability but might introduce other performance
> or scalability issues since one would have to generate either a new
> serialized instance or a new result tree reflecting the input
> document(s). It would be a cleaner engineering solution as it would mean
> base template writers wouldn't have to know about the need to do
> applicability checks.
> 
> Thanks,
> 
> Eliot
> --
> W. Eliot Kimber
> Professional Services
> Innodata Isogen
> 9030 Research Blvd, #410
> Austin, TX 78758
> (512) 372-8122
> 
> eliot@xxxxxxxxxxxxxxxxxxx
> www.innodata-isogen.com
