RE: [xsl] xsl:for-each-group: start groups depending on number of group members?

From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Mon, 30 Apr 2007 14:28:03 +0100

It's such a high-level description of the problem that it's hard to be
specific about how to tune the performance, but instinctively my reaction
would be to look for a multi-pass approach: preprocess the data to compute
properties of each node that will make the subsequent grouping operation
simpler and more efficient.
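
As a rough illustration of what such a preprocessing pass might look like
(a sketch under assumptions, not code from this thread): suppose the
grouping candidates are simply the children of the document element, and
that the placeholder tests @level = '1' and @type = 'x' stand in for
$condition1 and $condition2, neither of which depends on the item being
examined. The first pass copies the candidates into a temporary tree and
flags every element that is allowed to start a group; the second pass
groups on that flag alone. The element names "result" and "group" and the
attribute name "starts" are made up for the example.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/*">
    <!-- Evaluated once: the first A child meeting the stand-in for
         $condition2 (hypothetical test). Empty if there is none. -->
    <xsl:variable name="first-blocker" select="A[@type = 'x'][1]"/>

    <!-- Pass 1: shallow-annotate each child in a temporary tree. -->
    <xsl:variable name="annotated">
      <xsl:for-each select="*">
        <xsl:copy>
          <xsl:copy-of select="@*"/>
          <!-- A B always starts a group. A sub starts one only if it
               meets the stand-in for $condition1 and does not follow
               the first blocking A. When $first-blocker is empty the
               comparison yields the empty sequence, so not() is true. -->
          <xsl:if test="self::B or
                        (self::sub[@level = '1'] and
                         not(. &gt;&gt; $first-blocker))">
            <xsl:attribute name="starts">yes</xsl:attribute>
          </xsl:if>
          <xsl:copy-of select="node()"/>
        </xsl:copy>
      </xsl:for-each>
    </xsl:variable>

    <!-- Pass 2: the grouping pattern is now a cheap attribute test. -->
    <result>
      <xsl:for-each-group select="$annotated/*"
                          group-starting-with="*[@starts]">
        <group>
          <xsl:copy-of select="current-group()"/>
        </group>
      </xsl:for-each-group>
    </result>
  </xsl:template>

</xsl:stylesheet>
```

Because the candidates are siblings in document order, "some qualifying A
precedes this sub" is equivalent to "this sub follows the first qualifying
A", so the expensive test is evaluated once rather than once per candidate.
In a real stylesheet the children of the document element would be replaced
by the members of the outer group.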

Michael Kay
http://www.saxonica.com/ 

> -----Original Message-----
> From: Yves Forkl [mailto:Y.Forkl@xxxxxx] 
> Sent: 30 April 2007 14:13
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: Re: [xsl] xsl:for-each-group: start groups depending 
> on number of group members?
> 
> Wendell,
> 
> you wrote:
> 
> > While you can't restrict preceding-sibling to look only at members of
> > the current group, you might be able to get somewhere with either of
> > these approaches:
> > 
> > * The XPath 2.0 "intersect" operator can return those members common
> > to two sequences of nodes, so (preceding-sibling::node() intersect
> > current-group()) will return just those members of the current group
> > that are on the preceding-sibling axis relative to the context.
> 
> Thank you very much for this hint! The intersection of the 
> group members and those not having a preceding sibling of a 
> specific sort is what I was looking for. This makes my demo 
> template look like:
> 
> <xsl:template match="B" mode="groups_at_root_level">
>    <B_new>
>      <xsl:variable name="this_group" select="current-group()"/>
>      <xsl:for-each-group
>        select="$this_group"
>        group-starting-with="
>          B|sub[not($this_group intersect preceding-sibling::A)]">
>        <xsl:apply-templates select="current-group()"/>
>      </xsl:for-each-group>
>    </B_new>
> </xsl:template>
> 
> 
> > * If, rather than using grouping constructs to select from the nodes
> > in the source, you processed them into temporary trees, you could
> > construct those trees exactly the way you wanted, including nesting
> > elements in such a way that preceding-sibling would be useful. Such as:
> > 
> > <xsl:variable name="intermediate">
> >   <xsl:for-each-group select="*" group-by=".">
> >     <group>
> >       <xsl:copy-of select="current-group()"/>
> >     </group>
> >   </xsl:for-each-group>
> > </xsl:variable>
> > 
> > <xsl:for-each select="$intermediate/group">
> >   ... inside each group element, members of the group appear as 
> > siblings ...
> > </xsl:for-each>
> 
> That seems to be a neat approach, too, at least from a 
> general point of view. However, in my case, the existence of 
> preceding siblings is important for determining whether an 
> item is allowed to start a group or not. So "unconditionally" 
> starting a group on any instance of an element would yield a 
> number of groups that would have to be resolved afterwards 
> into members of other groups, because only looking at the 
> siblings of the group starter will reveal that in fact it 
> should not have fulfilled this role. xsl:for-each on "group" 
> instances would then be quite difficult: you can't process any 
> group until you have examined them all, because you need to make 
> sure that you don't miss any "late" member from a group that had 
> to be resolved...
> 
> Unless you have "unstable" groups, this approach is 
> definitely very interesting.
> 
> 
> > But I'm not sure either of these is actually necessary here. You have
> > only presented your problem in fragmentary form, so it's hard to say;
> > but to get the result you say you want, I'd do something much simpler:
> > 
> > [snip]
> 
> Thank you (as well as Andrew) for proposing simple and elegant solutions
> that accomplish the basic grouping task. Unfortunately, I can't use them
> because the grouping I'm doing is far more complicated (e.g. repeated
> grouping based on the same element; grouping that depends heavily on
> preceding instances; dynamic creation of multiple group containers,
> etc.). Trying to leave out the less relevant details, I crafted a demo
> that would just show my minimal requirements, however strange they might
> seem. Sorry for the confusion.
> 
> Let me elaborate on my grouping criteria. Rather than just matching an
> element, I always need it to meet some condition, so instead of:
> 
> group-starting-with="
>    B|sub[not($this_group intersect preceding-sibling::A)]"
> 
> I actually have something more like:
> 
> group-starting-with="
>    B|sub[$condition1 and
>          not($this_group intersect
>              preceding-sibling::A[$condition2])]"
> 
> What I am curious about is how I could optimize my stylesheet's runtime
> behaviour (I'm using Saxon 8.8) by computing some values only once, e.g.
> using a variable declared before xsl:for-each-group, given that:
> 
> - the negated expression appears several times within the attribute 
> value (think of it like duplicating the above code for "sub"), while 
> $condition1 is rather singular
> 
> - the number of instances matching unconstrained 
> preceding-sibling::A[$condition2] is rather large, whereas within the 
> grouping candidates it is small or zero
> 
> - the value of preceding-sibling::A[$condition2] depends, as far as I
> have understood, on the item that xsl:for-each-group currently examines,
> so it can't sensibly be evaluated beforehand
> 
> Any ideas on this?
> 
>    Yves
