[xsl] XPath "//", speed, and Saxon

Subject: [xsl] XPath "//", speed, and Saxon
From: Lars Huttar <huttarl@xxxxxxxxx>
Date: Fri, 31 Oct 2008 09:27:48 -0500

I was recently trying to solve performance problems in an XSLT-heavy web
application, and came up against results that puzzled me with regard to
XSLT optimization.

We have a Cocoon pipeline in which about 5MB of XML data is fed through
a particular XSLT stylesheet (one in a series), and I thought that this
stylesheet was the reason the pipeline was taking forever to run. I
looked in it and found several XPath expressions with an initial
double-slash, e.g. select="//foo", some of them evaluated multiple
times.

I figured that for a simple XSLT processor, each "//foo" expression
could mean traversing the whole input DOM again, which would be
expensive for a big input.

So I went through and converted the "//foo" expressions to use keys.
Expecting the stylesheet to run much faster, I ran some tests, and the
result was pretty fast: the process completed in just under 2 seconds.
But then I ran an apples-to-apples test on the old version of the
stylesheet, the one with lots of "//foo" in it. And to my surprise, the
old version ran just as fast. After several test runs I could see no
appreciable difference in speed.
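
For concreteness, the kind of conversion I mean looks roughly like the
sketch below. The element names, the id/ref attributes, and the key name
are made up for illustration and are not taken from our actual
stylesheet.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical key: indexes every foo element by its id attribute. -->
  <xsl:key name="foo-by-id" match="foo" use="@id"/>

  <xsl:template match="bar">
    <!-- Old form, a search from the document root on every call:
         select="//foo[@id = current()/@ref]" -->
    <!-- Key form: looks up the same foo elements through the key. -->
    <xsl:apply-templates select="key('foo-by-id', @ref)"/>
  </xsl:template>

</xsl:stylesheet>
```

Both forms select the same nodes; the only intended difference is how
the processor finds them.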

Obviously the performance problem was elsewhere. But the question I
wanted to ask here is, what does this imply regarding good practices for
writing efficient stylesheets?

Saxon of course is not a dumb XSLT processor. Maybe it compiles the
"//foo"-like XPath expressions into something like keys without being
told to... e.g. it indexes the DOM tree by element name... and so you
get good performance with those expressions even on large inputs.

If so, does that optimization rely on the name of the element, so that
it would not apply to expressions like "//*[...]"? That would suggest
that for "//foo"-like expressions, you're in good shape, but for
expressions like "//*" you should use a key for efficiency.
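
To illustrate what I mean by using a key in the wildcard case, here is a
minimal sketch. It assumes the predicate tests an id attribute; the
attribute, the key name, and the "wanted" parameter are all hypothetical.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical parameter holding the id value to look up. -->
  <xsl:param name="wanted" select="'x1'"/>

  <!-- Key over all elements, whatever their name, indexed by @id. -->
  <xsl:key name="elem-by-id" match="*" use="@id"/>

  <xsl:template match="/">
    <result>
      <!-- Stands in for: select="//*[@id = $wanted]" -->
      <xsl:copy-of select="key('elem-by-id', $wanted)"/>
    </result>
  </xsl:template>

</xsl:stylesheet>
```

Whether this actually beats the plain "//*[...]" form in Saxon is
exactly what I am unsure about.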

Thanks for any help and advice.

