RE: [xsl] for-each-group and result-document splitting to less files.

From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Wed, 10 Sep 2008 18:27:31 +0100
> -- though makes me curious... how hard would it be to change 
> this so that the groupings were based on the count() of items 
> in the nodeset?
> i.e. make divisions that were basically equal regardless of the input?
>  So if there were almost no entries in the first half of the 
> alphabet, the first division would be a-r, then s-t (if a lot 
> there), then u-w (if few there), etc.  maybe passing how many 
> sets to split it into total?

If the sorted sequence is $in, and you want to split it into roughly $n
groups, then you can take the size of a group as ceiling(count($in) div $n).
The letters that start each group can then be found by sampling the title at
each group boundary, with 'a' always starting the first group:

$size := ceiling(count($in) div $n)

$initialLetters := 
distinct-values(('a', for $i in 1 to $n - 1 return
substring($in[$i * $size + 1]/@title, 1, 1)))
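As a rough check of the arithmetic, here is the same boundary calculation
sketched in Python rather than XPath. The function name and the sample titles
are made up for illustration. It assumes the titles are already sorted and
start with a lower-case letter, and it samples the first item of each group
after the first, which is one reading of the expression above.

```python
import math

def initial_letters(titles, n):
    """Return the letters that start each of roughly n equal-sized groups.

    titles is assumed to be sorted already; 'a' always starts the first group.
    """
    size = math.ceil(len(titles) / n)  # ceiling(count($in) div $n)
    letters = ["a"]
    for i in range(1, n):
        pos = i * size  # 0-based index of the first item in group i + 1
        if pos < len(titles):
            letter = titles[pos][0]
            if letter not in letters:  # same effect as distinct-values()
                letters.append(letter)
    return letters

# Hypothetical sample: most titles fall late in the alphabet.
titles = sorted(["apple", "bear", "salt", "sand", "seal",
                 "tea", "tin", "top", "urn", "wax"])
print(initial_letters(titles, 4))  # ['a', 's', 't', 'w']
```

With ten titles and four groups the group size is 3, so the boundaries fall
at the 4th, 7th and 10th titles, giving the ranges a-r, s, t-v and w-z.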

The next step is to construct the translation table such as
'aaaaaaffffffsssssvvvvv' by replacing each letter in a-z with the highest
letter from $initialLetters that is <= the letter in question, that is:

$alphabet :=
   for $i in 1 to 26 return substring('abcdefghijklmnopqrstuvwxyz', $i, 1)

$transTable :=
string-join(for $c in $alphabet return max($initialLetters[. le $c]), '')
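
The table construction can be checked the same way. This is a minimal Python
sketch, not the XPath itself; the function name and the sample group starts
are hypothetical, and it assumes 'a' is always among the initial letters so
that every letter has a candidate.

```python
import string

def translation_table(initial_letters):
    """Map each letter a-z to the highest group-start letter that is <= it."""
    table = []
    for c in string.ascii_lowercase:
        candidates = [s for s in initial_letters if s <= c]
        # 'a' is assumed to be in initial_letters, so candidates is never empty
        table.append(max(candidates))
    return "".join(table)

# Hypothetical group starts: a-e, f-r, s-u, v-z
table = translation_table(["a", "f", "s", "v"])
print(table)  # aaaaafffffffffffffsssvvvvv

# The table plays the role of the third argument to translate():
# look up the grouping key for a title's first letter.
key = table[string.ascii_lowercase.index("mango"[0])]
print(key)  # f
```

The resulting string is always 26 characters long, one per letter of the
alphabet, so it lines up position by position with 'abcdefghijklmnopqrstuvwxyz'.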
           

None of the XPath above is tested, of course.

Michael Kay
http://www.saxonica.com/
