RE: [xsl] Sibling axis: All of kind A up to first of kind B

Subject: RE: [xsl] Sibling axis: All of kind A up to first of kind B
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Wed, 26 Mar 2008 10:10:08 -0000
>        <xsl:apply-templates select="following-sibling::h2[
>          generate-id(preceding-sibling::h1[1]) =
>          generate-id(current()) ]"/>
> 
> This works fine. But is it efficient? Or is there a better 
> way to do this in XSL 1.0? If there were zillions of those 
> headers, would the processor then have to backtrack at each 
> h2 in order to find the first h1 in reverse document order 
> and do the comparison? Or are implementations required to be 
> smarter than that so the user doesn't have to care about 
> these details?

No, implementations are not required to be smart. I think that any
respectable implementation(*) is likely to implement the predicate [1] with
a modicum of intelligence, that is, to stop scanning when the first node is
reached. Anything else is going to vary widely from one product to another.

(*) Actually, implementations are not required to be respectable either.

So this code is likely, for each element, to (a) scan forwards to all the
following siblings, and (b) for each following-sibling::h2, to scan
backwards to the most recent h1.

> 
> How could I instruct the processor most efficiently to process all
> h2 headers but only so long as there is no h1 header 
> encountered on the way, the way being the following-sibling axis?

Using XSLT 2.0: <xsl:for-each-group group-starting-with>!

But in XSLT 1.0 it requires sibling recursion, see below
> 
> The other day, "sibling recursion" was mentioned on this list.
> Does the following qualify as sibling recursion - and is this 
> likely to be more efficient for large input?
> 
>        <xsl:apply-templates mode="toc"
>          select="(following-sibling::h1 | 
> following-sibling::h2)[ 1 ]"/>

Might be safer to do following-sibling::*[self::h1 or self::h2][1] - but
depends on the processor.

I think this is likely to be more efficient. But with large input, it will
run out of stack space (typically after 500 or so calls) unless the
processor implements tail-call optimization, which not all do.

Michael Kay
http://www.saxonica.com/

Current Thread