Subject: Re: [xsl] National Language Collating Sequences and Index Generation
From: Joerg Pietschmann <joerg.pietschmann@xxxxxx>
Date: Fri, 08 Feb 2002 10:39:03 +0100

"W. Eliot Kimber" <eliot@xxxxxxxxxx> wrote:
> I have to generate back-of-the-book indexes for many national languages,
> including Arabic, Hebrew, Thai, Simplified Chinese, Traditional Chinese,
> Korean, and Japanese. I've successfully adapted the Docbook index
> generation code to produce the basic index, but now I'm faced with the
> challenge of both doing correct sorting for these languages and
> generating the appropriate index groups.

That's an interesting topic and a real problem, already acknowledged
but in general not quite solved.

In XSLT 1.0, xsl:sort sorts strings lexically by Unicode code point
number, IIRC. Localized sorting by a single character should also be
relatively easy to implement if you can get hold of the collating
sequence:

  <xsl:stylesheet ... xmlns:coll="my.collating.sequence">
    <coll:sequence>
      <char char="A" number="1"/>
      <char char="B" number="2"/>
      ...
    </coll:sequence>
    <xsl:variable name="collseq" select="document('')/*/coll:sequence"/>
    ...
    <xsl:for-each select="$items">
      <!-- data-type="number" so that the keys compare as numbers, not text -->
      <xsl:sort data-type="number"
        select="$collseq/char[@char=substring(current()/name,1,1)]/@number"/>

You can try to add

      <xsl:sort data-type="number"
        select="$collseq/char[@char=substring(current()/name,2,1)]/@number"/>

and so on for more complete lexical sorting. It could be of some use
that you can define fractional numbers for the sorting keys:

  <char char="A" number="1"/>
  <char char="Ä" number="1.1"/> <!-- sorry for the entity :-) -->
  <char char="a" number="1.5"/>

The caveats are that you had better have a complete collating sequence,
and that you shouldn't expect great performance, especially if you
add a lot of sort clauses. There is also the possibility that you run
afoul of unexpected character normalisation issues: users could expect
that ä (one precomposed character) and ä (a followed by a combining
diaeresis) are interchangeable (at least I think so).

In XSLT/XPath 2.0, you can have named collating sequences, but you
shouldn't expect that the ones you need are provided by the runtime
system :-((((

HTH
J.Pietschmann

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
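
For reference, the fragments in the message can be assembled into one complete XSLT 1.0 stylesheet roughly as follows. This is only a sketch under assumptions: the input vocabulary (an `items` root with `item/name` children), the `sorted` wrapper element, and the five-entry table are invented for illustration, and a real collating sequence would be far larger. It sorts on the first two characters of each name by looking them up in the embedded table.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: items/item/name, the "sorted" wrapper and the tiny
     table are hypothetical and not taken from the original message. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:coll="my.collating.sequence"
    exclude-result-prefixes="coll">

  <xsl:output method="xml" indent="yes"/>

  <!-- Collating table, embedded as a top-level element in a non-XSLT
       namespace. Fractional numbers leave room between base letters. -->
  <coll:sequence>
    <char char="A" number="1"/>
    <char char="Ä" number="1.1"/>
    <char char="a" number="1.5"/>
    <char char="B" number="2"/>
    <char char="b" number="2.5"/>
  </coll:sequence>

  <!-- document('') reads the stylesheet itself as a source document. -->
  <xsl:variable name="collseq" select="document('')/*/coll:sequence"/>

  <xsl:template match="/items">
    <sorted>
      <xsl:for-each select="item">
        <!-- Primary key: table number of the first character. -->
        <xsl:sort data-type="number"
            select="$collseq/char[@char = substring(current()/name, 1, 1)]/@number"/>
        <!-- Secondary key: table number of the second character. -->
        <xsl:sort data-type="number"
            select="$collseq/char[@char = substring(current()/name, 2, 1)]/@number"/>
        <xsl:copy-of select="name"/>
      </xsl:for-each>
    </sorted>
  </xsl:template>

</xsl:stylesheet>
```

A character that is missing from the table, or a name too short to have a second character, yields an empty node-set and therefore NaN as its sort key. Where NaN lands in the order is worth checking on your processor, which is one practical reason the table had better be complete. Each further character position costs another xsl:sort with its own table lookup per item, so the performance caveat applies quickly.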