Re: [xsl] bad programming for speedup?

Subject: Re: [xsl] bad programming for speedup?
From: Abel Braaksma <abel.online@xxxxxxxxx>
Date: Tue, 24 Jul 2007 13:49:16 +0200
Justin Johansson wrote:
Abel,

ouch, this hurts my eyes ;)

It hurts my eyes too but, IMHO, there is no such thing as a silly question
as the list has now received several well-informed responses, including
yours, that make for very interesting discussion (not that I am suggesting
that you were berating the question in any way whatsoever).

I was under the impression that the OP already found that his style was "bad" and I just meant to emphasize that fact, not that he/she was asking a silly question, because, indeed, there is no such thing. I'm a part-time Dutch teacher (teaching Dutch to foreigners), and as such you quickly realize that what may look silly to someone who knows the language is damn serious (and intricate) for someone who does not. The same, obviously, applies to computer languages.


Anyway, since you proposed what looks like an excellent solution, I'd
like to ask

1.  If anything would be achieved performance-wise by
splitting that copy-of-union into two instructions, viz.,

	<xsl:copy-of select="." />
	<xsl:copy-of select="following-sibling::row" />

and thereby saving the processor the potential overhead of
having to do doc-order comparison on nodes in calculating
the union.

From previous discussions that I vaguely remember (don't ask for the link), I have learned (correctly or incorrectly?) that the union is very efficient when the nodes are already in document order. Also, it may very well be that internally, after tokenization or whatever process takes place behind the scenes, a processor is free to optimize the above as a union, or the union as several copy-of instructions. In that case, both are likely to perform equally well.


Of course, it all depends on the processor used.
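
For reference, the two variants being compared can be sketched side by side. The `row` match pattern, the two mode names and the `group` wrapper element are assumptions for illustration only and are not taken from the OP's stylesheet. Both templates should produce identical output, because the context node always precedes its following siblings in document order; which one is faster, if either, is something to measure on the processor at hand.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Variant A: one instruction, union of the context node and its following siblings -->
  <xsl:template match="row" mode="union">
    <group>
      <xsl:copy-of select=". | following-sibling::row" />
    </group>
  </xsl:template>

  <!-- Variant B: two instructions, copying the same nodes in the same order -->
  <xsl:template match="row" mode="split">
    <group>
      <xsl:copy-of select="." />
      <xsl:copy-of select="following-sibling::row" />
    </group>
  </xsl:template>

</xsl:stylesheet>
```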

Further in the general case of

<xsl:copy-of select="$a | $b | $c" />

when one already knows that $a nodes precede $b nodes
which, in turn, precede $c nodes in document order, is it more
efficient to split these into individual copy operations, viz.

<xsl:copy-of select="$a" />
<xsl:copy-of select="$b" />
<xsl:copy-of select="$c" />

See my statement above. Furthermore, in my own tests, I found significant performance problems only with complex micro-pipelining, where the internal "micro trees" became rather large and apparently could not easily be optimized.


Come to think of it: the other situation that is performance critical is deep recursion that is not tail recursion and as such cannot be optimized (meaning: the stack will grow). But this is more a problem for XSLT 1.0 than it is for 2.0.
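
To make the distinction concrete, here is a minimal XSLT 1.0 sketch of the same computation written both ways. The template names `sum-naive` and `sum-tail` and the parameter names are hypothetical. Whether the second form really runs in constant stack space depends on the processor implementing tail-call optimization, which should not be taken for granted.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Not tail-recursive: the addition happens after the recursive call
       returns, so every pending call has to stay on the stack. -->
  <xsl:template name="sum-naive">
    <xsl:param name="nodes" />
    <xsl:choose>
      <xsl:when test="$nodes">
        <xsl:variable name="rest">
          <xsl:call-template name="sum-naive">
            <xsl:with-param name="nodes" select="$nodes[position() &gt; 1]" />
          </xsl:call-template>
        </xsl:variable>
        <xsl:value-of select="number($nodes[1]) + number($rest)" />
      </xsl:when>
      <xsl:otherwise>0</xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Tail-recursive: the running total travels in a parameter and the
       recursive call is the last thing the template does. -->
  <xsl:template name="sum-tail">
    <xsl:param name="nodes" />
    <xsl:param name="total" select="0" />
    <xsl:choose>
      <xsl:when test="$nodes">
        <xsl:call-template name="sum-tail">
          <xsl:with-param name="nodes" select="$nodes[position() &gt; 1]" />
          <xsl:with-param name="total" select="$total + number($nodes[1])" />
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$total" />
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

Either template would be called with a node-set in `nodes`, for example `<xsl:with-param name="nodes" select="row/@amount" />`, where `row/@amount` is again only an illustrative path.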

2.  Apart from the fact that xsl:copy-of has some additional special-purpose
attributes over and above xsl:sequence, is there any discernible
difference between xsl:copy-of and xsl:sequence?  Accordingly, in your
solution, would having the following instead have achieved exactly the
same result and with exactly the same performance?

<xsl:sequence select=". | following-sibling::row" />

I was under the impression that the OP used XSLT 1.0 (but he/she didn't state so specifically), in which case the obvious answer is: you can only use xsl:copy-of, because xsl:sequence does not exist in XSLT 1.0.


I think that Michael Kay has made several statements in favor of xsl:sequence (but mostly in preference to xsl:value-of, where the difference is more clear-cut). When all you are doing is copying source nodes to the result tree, I think there is not much difference between the two (definitely not in terms of speed). If you want to validate (with a schema-aware XSLT processor), it is probably easier to use xsl:copy-of with the validation attribute.

If you want to remove the namespace nodes in one call, the only option is xsl:copy-of (with copy-namespaces="no"). But I am under the impression that all these differences are largely decorative and that it is a matter of taste which approach you choose.
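
As an illustration of those special-purpose attributes, a minimal XSLT 2.0 sketch of a namespace-stripping copy might look like this. The `row` match pattern is an assumption carried over from the earlier examples in the thread.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="row">
    <!-- Deep copy that leaves out the namespace nodes of the source elements.
         Namespaces actually used in element or attribute names are still
         declared in the output. -->
    <xsl:copy-of select="." copy-namespaces="no" />
  </xsl:template>

</xsl:stylesheet>
```

On a schema-aware processor, the same instruction also accepts a validation attribute (for example validation="strict"); xsl:sequence has no equivalent of either attribute.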

Of course, the biggest difference is the mutability: xsl:sequence can contain a body (sequence constructor, to be precise), whereas xsl:copy-of cannot (it is an atomic operation and therefore possibly faster in some circumstances). This translates in practice to situations where you find yourself refactoring code and replacing xsl:copy-of with xsl:copy and the modified copy template and/or xsl:sequence instructions. In larger projects, or anything that *may* change (and we talk *soft*ware here, people tend to forget the *soft*ness of their ware ;) I usually recommend against using xsl:copy-of and favor the copy idiom.
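
The copy idiom referred to here is usually written as the identity template plus overrides. A minimal sketch follows; the `row/@obsolete` attribute is a hypothetical example of a later requirement, not something from the OP's data.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copies every node and attribute, one level at a time -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()" />
    </xsl:copy>
  </xsl:template>

  <!-- Hypothetical later change: drop one attribute while copying everything else -->
  <xsl:template match="row/@obsolete" />

</xsl:stylesheet>
```

With xsl:copy-of, a change like this would mean rewriting the instruction; with the identity template, it is one extra template rule.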

Cheers,
-- Abel Braaksma
