Subject: RE: [xsl] xsl:sort with msxml english language, danish characters, weird results
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Mon, 25 Oct 2004 13:26:07 +0100

> What are the rules for accenting? I suppose that most people if you
> asked them what ø was would say that's an o with a slash through it,
> and an æ was an a and an e stuck really close together, hence mnemonic
> entities, but is that the rule for determining what is an accented
> character? We asked 100 people and 90 gave the following answer?

I used the term "accent" very loosely. For the full gory detail, see the Unicode Collation Algorithm [1]. I don't know if Microsoft follow this precisely, but they are probably using the same principles.

As for how they collected the data - yes, they probably asked a few non-randomly selected people, and they looked in some (possibly out of date) textbooks, and when they got it badly wrong people complained and they sometimes fixed it. There isn't a single right answer - different publishers sort their dictionaries and indexes and phone books in different ways, and none of them is wrong. The UCA is written as if there is a single correct answer, but there isn't.

Michael Kay
http://www.saxonica.com/

[1] http://www.unicode.org/reports/tr10/
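
To make the "no single right answer" point concrete, here is a minimal Python sketch of two hand-written orderings applied to the same words. It is only an illustration: the names `DANISH_ALPHABET`, `FOLD`, `danish_key` and `english_key` are hypothetical, the word list is made up, and neither ordering is claimed to be what MSXML or the UCA actually does.

```python
# Two illustrative sort orders for the same words.
# Assumption: every word uses only the 29 lowercase letters listed below.

# Convention 1: treat ae-ligature, o-slash and a-ring as separate
# letters that come after z (the usual Danish dictionary order).
DANISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzæøå"


def danish_key(word):
    # Each letter is ranked by its position in the Danish alphabet.
    return [DANISH_ALPHABET.index(ch) for ch in word.lower()]


# Convention 2: fold the same letters onto unaccented ones,
# roughly what an English-language index might do.
FOLD = {"æ": "ae", "ø": "o", "å": "a"}


def english_key(word):
    folded = "".join(FOLD.get(ch, ch) for ch in word.lower())
    # The original word is a tie-breaker so the order is deterministic.
    return (folded, word)


words = ["øl", "zebra", "ost", "æble", "abe", "år"]

by_danish = sorted(words, key=danish_key)
by_english = sorted(words, key=english_key)

print(by_danish)   # ['abe', 'ost', 'zebra', 'æble', 'øl', 'år']
print(by_english)  # ['abe', 'æble', 'år', 'øl', 'ost', 'zebra']
```

Both results are defensible for some audience, which is why a processor asked to sort Danish text under an English language setting can produce an order that looks wrong to a Danish reader.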