Re: [xsl] for-each-group grouping accented versions of letters together

Subject: Re: [xsl] for-each-group grouping accented versions of letters together
From: Graydon <graydon@xxxxxxxxx>
Date: Sat, 21 Apr 2012 10:36:11 -0400
On Sat, Apr 21, 2012 at 03:02:22AM +0200, Imsieke, Gerrit, le-tex scripsit:
> You can strip the accents by unicode decomposition and then removing
> the diacritical marks:
> 
> <xsl:for-each-group select="index-0"
>   group-by="substring(
>               upper-case(
>                 replace(
>                   normalize-unicode(heading, 'NFKD'),
>                   '[&#x300;-&#x36f;]',
>                   ''
>                 )
>               ), 1, 1
>             )">
>   <xsl:sort select="current-grouping-key()"/>

Thank you!

I had tried decomposing, using replace with \p{Lm} and then recomposing
with NFKC, and that didn't work, but it was also fairly late on Friday
afternoon.
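
A likely reason the \p{Lm} attempt came up empty: \p{Lm} is the
"Letter, modifier" category, whereas the combining accents that
decomposition leaves behind (U+0300 to U+036F) are in "Mark,
nonspacing" (\p{Mn}, which \p{M} covers). Below is a minimal sketch of
the decompose, strip, recompose sequence using \p{M} instead. The
template name and the sample string are made up for illustration, and
\p{M} removes every combining mark, which is broader than the explicit
range quoted above.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Hypothetical entry point; run with an initial template of "demo" -->
  <xsl:template name="demo">
    <!-- &#xC9; is E WITH ACUTE. NFKD splits it into E + U+0301,
         \p{M} removes the U+0301, and NFKC recomposes whatever is left. -->
    <xsl:value-of select="normalize-unicode(
                            replace(
                              normalize-unicode('&#xC9;lan', 'NFKD'),
                              '\p{M}',
                              ''
                            ),
                            'NFKC'
                          )"/>
    <!-- writes: Elan -->
  </xsl:template>

</xsl:stylesheet>
```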

> When writing the group (= starting letter) to an output file further
> down in your template, you should sort it according to the
> upper-case(…) part as first sort key, then according to the actual
> heading as a second (tie-breaker) sort key.
> 
> So it's best to make a function (call it, e.g., my:sortkey) out of
> upper-case(…).

Yes.
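
For the archive, a rough sketch of what that might look like. It
assumes each index-0 has exactly one heading child; the my: namespace
URI, the "index" parent element, and the "group" output element are
placeholders rather than anything from the real data.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="http://example.com/my"
    exclude-result-prefixes="xs my">

  <!-- Hypothetical helper: the heading upper-cased, with combining
       diacritical marks removed after NFKD decomposition -->
  <xsl:function name="my:sortkey" as="xs:string">
    <xsl:param name="heading" as="xs:string?"/>
    <xsl:sequence select="upper-case(
                            replace(
                              normalize-unicode($heading, 'NFKD'),
                              '[&#x300;-&#x36f;]',
                              ''
                            )
                          )"/>
  </xsl:function>

  <!-- Assumed context: an element named index holding the index-0 entries -->
  <xsl:template match="index">
    <xsl:for-each-group select="index-0"
        group-by="substring(my:sortkey(heading), 1, 1)">
      <xsl:sort select="current-grouping-key()"/>
      <group letter="{current-grouping-key()}">
        <xsl:for-each select="current-group()">
          <!-- first key: the accent-free, upper-cased form -->
          <xsl:sort select="my:sortkey(heading)"/>
          <!-- second key: the actual heading, as tie-breaker -->
          <xsl:sort select="heading"/>
          <xsl:copy-of select="."/>
        </xsl:for-each>
      </group>
    </xsl:for-each-group>
  </xsl:template>

</xsl:stylesheet>
```

Keeping the key in one function means the grouping key and the first
sort key cannot drift apart.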

> In that function, you can also do other useful stuff, such as
> eliminating stop words or replacing all numbers with a zero, so that
> everything that starts with a number will be in the same group.

Fortunately these are very uncomplicated headings, so no stop words, but
the point about numbers is very well taken.
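
A sketch of how the number handling might be folded into the key
function, assuming "numbers" means runs of ASCII digits. The function
name my:sortkey-digits and the namespace URI are illustrative only.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="http://example.com/my"
    exclude-result-prefixes="xs my">

  <!-- Hypothetical variant of the sort key: strips accents, upper-cases,
       then collapses each run of ASCII digits to a single 0 -->
  <xsl:function name="my:sortkey-digits" as="xs:string">
    <xsl:param name="heading" as="xs:string?"/>
    <xsl:variable name="stripped" as="xs:string"
        select="upper-case(
                  replace(
                    normalize-unicode($heading, 'NFKD'),
                    '[&#x300;-&#x36f;]',
                    ''
                  )
                )"/>
    <!-- [0-9] is used rather than \d, which would also match
         non-ASCII decimal digits -->
    <xsl:sequence select="replace($stripped, '[0-9]+', '0')"/>
  </xsl:function>

</xsl:stylesheet>
```

With this key, substring(my:sortkey-digits(heading), 1, 1) is "0" for
any heading that starts with a digit, so those headings share one
group. The actual heading as second sort key still separates them
within the group.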

Thanks!
Graydon
