Re: [xsl] XPath "//", speed, and Saxon

Subject: Re: [xsl] XPath "//", speed, and Saxon
From: "Mukul Gandhi" <gandhi.mukul@xxxxxxxxx>
Date: Fri, 31 Oct 2008 22:22:57 +0530
Here are some general performance tips for stylesheet writing:

http://xml.apache.org/xalan-j/faq.html#faq-N10175

My personal opinion is that using // near the root of a large tree can
be quite expensive, since a processor that does not optimize it has to
visit every descendant node. You should try to replace //x with a more
specific XPath path, like /a/b/c/x.
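
To make the difference concrete, here is a minimal sketch in Python
(standard-library xml.etree.ElementTree, not XSLT; the a/b/c/x document
is made up for illustration). Both lookups return the same x elements,
but the descendant search has to consider every element in the tree,
while the explicit path only looks at the children along that path.
How a particular XSLT processor evaluates //x is up to its optimizer.

```python
import xml.etree.ElementTree as ET

# Hypothetical document, invented for illustration: /a/b/c/x plus unrelated bulk.
doc = ET.fromstring(
    "<a>"
    "<b><c><x>1</x><x>2</x></c></b>"
    "<bulk>" + "<item><y/></item>" * 1000 + "</bulk>"
    "</a>"
)

# Rough equivalent of //x: searches all descendants of the root.
by_descendant = doc.findall(".//x")

# Number of elements a full descendant scan has to visit.
elements_in_tree = sum(1 for _ in doc.iter())

# Rough equivalent of /a/b/c/x: follows only the named child steps from <a>.
by_path = doc.findall("b/c/x")

# Same result either way; only the amount of tree walked differs.
assert [e.text for e in by_descendant] == [e.text for e in by_path] == ["1", "2"]
print(len(by_path), "matches;", elements_in_tree, "elements in the tree")
```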

On Fri, Oct 31, 2008 at 7:57 PM, Lars Huttar <huttarl@xxxxxxxxx> wrote:
> Hello,
>
> I was recently trying to solve performance problems in an XSLT-heavy web
> application, and came up against results that puzzled me with regard to
> XSLT optimization.
>
> We have a Cocoon pipeline in which about 5MB of XML data is being fed
> through a particular XSLT stylesheet (one in a series). And I thought
> that this stylesheet was the reason for the pipeline taking forever to
> run. I looked in it and found several uses of XPaths containing an
> initial double-slash, e.g. select="//foo", some of them being invoked
> multiple times.
>
> I figured that for a simple XSLT processor, each "//foo" expression
> could mean traversing the whole input DOM again, which would be
> expensive for a big input.
>
> So I went through and converted the "//foo" expressions to use keys.
> Excited at how much faster I expected the stylesheet to run, I ran some
> tests ... pretty fast. The process completed in just under 2 seconds.
> But then I ran an apples-to-apples test on the old version of the
> stylesheet, the one with lots of "//foo" in it. And to my surprise, the
> old version ran just as fast. After several test runs I could see no
> appreciable difference in speed.
>
> Obviously the performance problem was elsewhere. But the question I
> wanted to ask here is, what does this imply regarding good practices for
> writing efficient stylesheets?
>
> Saxon of course is not a dumb XSLT processor. Maybe it compiles the
> "//foo"-like XPath expressions into something like keys without being
> told to... e.g. it indexes the DOM tree by element name... and so you
> get good performance with those expressions even on large inputs.
>
> If so, does that optimization rely on the name of the element, so that
> it would not apply to expressions like "//*[...]"? That would suggest
> that for "//foo"-like expressions, you're in good shape, but for
> expressions like "//*" you should use a key for efficiency.
>
> Thanks for any help and advice.
>
> Lars
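
For readers who have not used xsl:key: the idea is to scan the document
once, build a lookup table, and answer later lookups from the table. A
rough Python analogue follows (standard library only; the element names
and the index_by_tag helper are hypothetical, and this is not a
description of what Saxon does internally).

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

def index_by_tag(root):
    """Build a tag-name -> elements table in one pass over the tree."""
    index = defaultdict(list)
    for elem in root.iter():          # document order, root included
        index[elem.tag].append(elem)
    return index

# Made-up input for illustration.
root = ET.fromstring("<doc><foo id='1'/><sec><foo id='2'/><bar/></sec></doc>")

index = index_by_tag(root)            # one traversal up front
foos = index["foo"]                   # later lookups are dictionary hits

assert [f.get("id") for f in foos] == ["1", "2"]
assert foos == root.findall(".//foo") # same elements as the descendant search
```

A table like this only pays off when the same lookup is repeated; for a
single //foo, the scan and the index build cost about the same.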


-- 
Regards,
Mukul Gandhi
