Re: [xsl] xsl:sort with msxml english language, danish characters, weird results

Subject: Re: [xsl] xsl:sort with msxml english language, danish characters, weird results
From: "W. Eliot Kimber" <ekimber@xxxxxxxxxxxxxxxxxxx>
Date: Mon, 25 Oct 2004 13:38:27 -0500
Michael Kay wrote:

>> I'm not sure I'm following here--at least using Java RuleBasedCollator you should be able to achieve any collation sequence whatsoever.
>>
>> But I'm not sure what you mean by sorting 646 before 10646.
>
> A possible algorithm is that any sequence of digits counts as a single
> collation unit, which is collated before the first collation unit derived
> from non-digit characters, and has a collation value equal to its decimal
> value.
>
> I don't believe you can achieve this with a RuleBasedCollator.

Ah, I understand now--I misunderstood your comment as being about the standards, not the strings "646" and "10646".


I think you are correct, although I'll have to test it.

Of course, this type of rule can be implemented with a custom Comparator that applies whatever rule you want and delegates the character-level comparison to a rule-based collator. I don't think there's any way for a purely declarative mechanism, which is what I understand the UCA to define (and what RuleBasedCollator implements), to handle all cases.
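As a rough illustration of the Comparator approach, here is a minimal, untested-in-production sketch. The class name DigitRunComparator and the helper runs() are hypothetical names chosen for this example. It assumes that "digits" means ASCII 0-9, that each non-digit run can be handed to the collator on its own (a simplification, since a collator normally weighs the whole string across several levels), and that strings with equal leading runs sort shorter-first.

```java
import java.math.BigInteger;
import java.text.Collator;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical example class: digit runs compare by decimal value and sort before text. */
final class DigitRunComparator implements Comparator<String> {
    private final Collator collator;

    DigitRunComparator(Collator collator) {
        this.collator = collator;
    }

    /** Assumption for this sketch: only ASCII 0-9 count as digits. */
    private static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    /** Splits a string into alternating runs of digits and non-digits. */
    static List<String> runs(String s) {
        List<String> out = new ArrayList<>();
        int start = 0;
        for (int i = 1; i <= s.length(); i++) {
            // Close the current run at end of string or when the character class changes.
            if (i == s.length() || isDigit(s.charAt(i)) != isDigit(s.charAt(start))) {
                out.add(s.substring(start, i));
                start = i;
            }
        }
        return out;
    }

    @Override
    public int compare(String a, String b) {
        List<String> ra = runs(a);
        List<String> rb = runs(b);
        int n = Math.min(ra.size(), rb.size());
        for (int i = 0; i < n; i++) {
            String x = ra.get(i);
            String y = rb.get(i);
            boolean dx = isDigit(x.charAt(0));
            boolean dy = isDigit(y.charAt(0));
            int c;
            if (dx && dy) {
                // Both runs are numbers: compare by decimal value (BigInteger avoids overflow).
                c = new BigInteger(x).compareTo(new BigInteger(y));
            } else if (dx != dy) {
                // A digit run collates before any non-digit run.
                c = dx ? -1 : 1;
            } else {
                // Both runs are text: delegate to the supplied collator.
                c = collator.compare(x, y);
            }
            if (c != 0) {
                return c;
            }
        }
        // All shared runs are equal: the string with fewer runs sorts first.
        return Integer.compare(ra.size(), rb.size());
    }
}
```

Constructed with, say, Collator.getInstance(Locale.ENGLISH), this should place "ISO 646" before "ISO 10646", whereas the collator alone compares the strings character by character and puts "10646" first. Note that runs differing only in leading zeros ("007" and "7") compare as equal here; a real implementation would probably want a tie-breaker.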

Cheers,

E.
--
W. Eliot Kimber
Professional Services
Innodata Isogen
9390 Research Blvd, #410
Austin, TX 78759
(512) 372-8122

eliot@xxxxxxxxxxxxxxxxxxx
www.innodata-isogen.com
