Re: [xsl] Recursive grouping - simple, XSLT 1.0, fast non-Muenchian grouping method

Subject: Re: [xsl] Recursive grouping - simple, XSLT 1.0, fast non-Muenchian grouping method
From: Dimtre Novatchev <dnovatchev@xxxxxxxxx>
Date: Tue, 21 Dec 2004 06:30:10 +1100
Hi Sergiu,

Yes, I'd very much appreciate it if you send me the source xml files
you're using in the experiments.

However, please, note that having only two levels of grouping is not
representative enough for benchmarking this algorithm.
  Why not add at least 2-3 levels more, such as publisher and translations?

Another observation is that any challenging source xml document will
not have all possible key combinations appear at the very start of the
document. The most useful for benchmarking will be a document where at
least some (if not all) of the key values/combination appear at the
very end of the document.


Cheers,
Dimitre.


On Mon, 20 Dec 2004 19:47:07 +0200, Sergiu Ignat <sergiu.ignat@xxxxxxxx> wrote:
> Hello Dimitre,
> 
> The presented algorithm works much faster than Muenchian grouping without
> xsl:key for a large xml of about 720 elements that sould be grouped in
> groups of 1 to 5 elements per group, up to 5 times faster in that particular
> case.(If you want I can send you the input samples, they are too large to be
> sent to the group). It works slower than Muenchian method with xsl:key and
> generate-id(). The same recursive grouping can be implemented using xsl:key.
> It is much faster than the previous version but it is still a bit slower
> than Muenchian method with xsl:key and generate-id().
> 
> The advantage of recursive grouping is that it is possible to define
> grouping rules that are much more complex than one accepted by the use
> attribute of xsl:key or by generate-id() function. For example if in a list
> of items that have attributes price and quantity the grouping should be done
> by price*quantity.
> 
> Using any of this 2 methods it is also possible to generate a sorted output.
> In that case instead of selecting the grouping identifier of the first
> element in the list it is necessary to select the min or max grouping
> identifier value.
> 
> We are using XSL for reporting, and sometimes grouping rules are very
> complex.
> 
> A recoursive grouping with xsl:key is presented below.
> 
> <?xml version="1.0" encoding="UTF-8" ?>
> <xsl:stylesheet version="1.0"
> xmlns:xsl="http://www.w3.org/1999/XSL/Transform";>
> <xsl:key name="title-search" match="book" use="@author" />
> <xsl:template match="/booklist">
> 
>  <authors>
>   <xsl:call-template name="RecursiveGrouping">
>    <xsl:with-param name="list" select="book"/>
>   </xsl:call-template>
>  </authors>
> </xsl:template>
> 
> <xsl:template name="RecursiveGrouping">
>  <xsl:param name="list"/>
> 
>  <!-- Selecting the first author name as group identifier and the group
> itself-->
>  <xsl:variable name="group-identifier" select="$list[1]/@author"/>
>  <xsl:variable name="group"
> select="key('title-search',$group-identifier)"/>
> 
>  <!-- Do some work for the group -->
>  <author name="{$group-identifier}" number-of-books="{count($group)}"/>
> 
>  <!-- If there are other groups left, calls itself -->
>  <xsl:if test="count($list)>count($group)">
>  <xsl:call-template name="RecursiveGrouping">
>    <xsl:with-param name="list"
> select="$list[not(@author=$group-identifier)]"/>
>  </xsl:call-template>
>  </xsl:if>
> </xsl:template>
> </xsl:stylesheet>
> 
> the xml input is:
> 
> <booklist>
>   <book author="Frank Herbert" title="Dune"/>
>   <book author="Roberto Quaglia" title="Bread, Butter and Paradoxine"/>
>   <book author="Kate Wilhelm" title="Where Late the Sweet Bird Sang"/>
>   <book author="Anthony Burgess" title="A Clockwork Orange"/>
>   <book author="Frank Herbert" title="Dragon in the Sea"/>
>   <book author="Anthony Burgess" title="Earthly Powers"/>
>   <book author="Isaak Asimov" title="The Foundation Trilogy"/>
>   <book author="Frank Herbert" title="Children of Dune"/>
>   <book author="Isaak Asimov" title="The Caves of Steel"/>
> </booklist>
> 
> ----- Original Message -----
> From: "Dimtre Novatchev" <dnovatchev@xxxxxxxxx>
> To: <xsl-list@xxxxxxxxxxxxxxxxxxxxxx>
> Cc: "Sergiu Ignat (dyve)" <sergiu.ignat@xxxxxxxx>
> Sent: Saturday, December 18, 2004 12:49 AM
> Subject: Re: [xsl] Recursive grouping - simple, XSLT 1.0, fast non-Muenchian
> grouping method
> 
> On Thu, 16 Dec 2004 19:27:46 +0200, Sergiu Ignat (dyve)
> <sergiu.ignat@xxxxxxxx> wrote:
> > Hello everybody.
> >
> > I would like to present you a simple, XSLT 1.0, fast grouping method with
> a
> > O(N*log(N)) complexity, the same as sorting. The only grouping method I
> knew
> > so far is Muenchian that has O(N^2) complexity.
> 
> The complexity of the Muenchian method is much better than O(N^2) --
> can you provide an example with representative data, where the
> Muenchian method behaves O(N^2)?

Current Thread