Re: [xsl] Grouping (yet again)

Subject: Re: [xsl] Grouping (yet again)
From: Jeni Tennison <jeni@xxxxxxxxxxxxxxxx>
Date: Thu, 21 Feb 2002 09:49:52 +0000

Hi Steven,

> ps: question aside: does anyone have an idea of the efficiency of
> this multi-level grouping thing? Both in terms of memory usage
> building up these three massive keys, and for-eaching across all of
> them. There's a good chance this transformation will be called
> (inside a Cocoon pipeline) quite often, and while there's caching
> inside Cocoon, a performant stylesheet is a good start.

Grouping using XSLT is never going to be as efficient as starting off
with pre-grouped XML and just processing that, which is why if you
*can* do the grouping prior to the transformation, so much the better.
If you're grouping the same XML over and over again, you might want to
think about separating the stylesheet into two - one doing the
grouping and one doing the formatting of the groups. That way Cocoon
can cache the result of the grouping, and run different formatting on
those groups on demand (but then, I think you're doing that already).
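
To give a feel for how little the formatting stylesheet has to do once
the grouping is out of the way, here's a rough sketch. I'm guessing at
the shape of the grouped XML: the groups/group/week/dir/record
elements and the name and id attributes are made up for illustration,
so adjust them to whatever your grouping step actually produces.

```xml
<?xml version="1.0"?>
<!-- format.xsl: hypothetical second-stage stylesheet.
     Assumes the grouping has already been done and the input looks like:
       <groups>
         <group name="...">
           <week id="...">
             <dir name="...">
               <record>...</record>
     All element and attribute names are illustrative assumptions. -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html"/>

  <xsl:template match="/groups">
    <html>
      <body>
        <xsl:apply-templates select="group"/>
      </body>
    </html>
  </xsl:template>

  <!-- One heading per level; no keys or sibling tests are needed
       because the nesting already encodes the groups. -->
  <xsl:template match="group">
    <h1><xsl:value-of select="@name"/></h1>
    <xsl:apply-templates select="week"/>
  </xsl:template>

  <xsl:template match="week">
    <h2><xsl:value-of select="@id"/></h2>
    <xsl:apply-templates select="dir"/>
  </xsl:template>

  <xsl:template match="dir">
    <h3><xsl:value-of select="@name"/></h3>
    <ul>
      <xsl:apply-templates select="record"/>
    </ul>
  </xsl:template>

  <xsl:template match="record">
    <li><xsl:value-of select="."/></li>
  </xsl:template>

</xsl:stylesheet>
```

Every template here just walks down one level of the tree, which is
about as cheap as XSLT processing gets.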

Grouping using the Muenchian Method is likely to be a lot more
efficient than using preceding-sibling::, especially with the masses
of data that you're talking about, since a key lets the processor
look up each group directly rather than rescanning the earlier
siblings of every node. The other alternative that you could try is
splitting the process into two steps: sorting (by group, year & week,
and dir) and then grouping. With sorted data you can just test the
immediately preceding sibling to see whether you need to start a new
group or not, which should lead to fewer nodes being visited. (But
then the sorting is not particularly efficient in itself, unless you
can cache the result.)
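
In case it helps to compare the two approaches side by side, here are
two rough sketches. Both assume a flat input of the form
<records><record group="..." week="..." dir="..."/>...</records>;
those element and attribute names are my invention, not taken from
your data, and the '|' separator in the second key assumes that
character never appears in a group name.

The first is a two-level Muenchian grouping (group, then week within
group):

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical keys: one per grouping level. The second key
       combines both values so weeks are unique within a group. -->
  <xsl:key name="by-group" match="record" use="@group"/>
  <xsl:key name="by-group-week" match="record"
           use="concat(@group, '|', @week)"/>

  <xsl:template match="/records">
    <groups>
      <!-- Pick the first record for each distinct @group. -->
      <xsl:for-each select="record[generate-id() =
                              generate-id(key('by-group', @group)[1])]">
        <group name="{@group}">
          <!-- Within this group, pick the first record for each
               distinct @week. -->
          <xsl:for-each select="key('by-group', @group)
                                  [generate-id() =
                                   generate-id(key('by-group-week',
                                     concat(@group, '|', @week))[1])]">
            <week id="{@week}">
              <!-- Copy every record that shares this group and week. -->
              <xsl:copy-of select="key('by-group-week',
                                       concat(@group, '|', @week))"/>
            </week>
          </xsl:for-each>
        </group>
      </xsl:for-each>
    </groups>
  </xsl:template>

</xsl:stylesheet>
```

A third level (dir) would follow the same pattern with a third key
that concatenates all three values.

The second sketch assumes the records have already been sorted by
group in an earlier pass, so each record only needs to look at the one
record immediately before it:

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html"/>

  <!-- Assumes the input records are ALREADY sorted by @group. -->
  <xsl:template match="/records">
    <html>
      <body>
        <xsl:apply-templates select="record"/>
      </body>
    </html>
  </xsl:template>

  <xsl:template match="record">
    <!-- Start a new group when the nearest preceding record has a
         different @group, or when there is no preceding record. -->
    <xsl:if test="not(@group = preceding-sibling::record[1]/@group)">
      <h1><xsl:value-of select="@group"/></h1>
    </xsl:if>
    <p><xsl:value-of select="@dir"/></p>
  </xsl:template>

</xsl:stylesheet>
```

This only emits a heading at each group boundary rather than building
nested elements, but it shows the cheap test that sorted input makes
possible.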

But really the answer to any performance question is 'try it and see'.
The efficiency of a stylesheet has a lot to do with the optimisations
built into the processor that you're using and the actual data set
that you're processing. So while it's possible to make general
assertions, the best advice is to test on your own system. For what
it's worth, I recently found CatchXSL! (http://www.xslprofiler.org/)
to be helpful for isolating the processor-intensive portions of code.

Cheers,

Jeni

---
Jeni Tennison
http://www.jenitennison.com/



