Subject: RE: [xsl] How to read the encoding of an XML document
From: "Diamond, Jason" <Jason.Diamond@xxxxxxx>
Date: Thu, 25 Oct 2001 18:03:44 -0600

> > while UTF-16 uses 2 bytes for most characters.
>
> since it's gone midnight and I no longer need to be helpful in this
> thread I could query the definition of most here, xFFFF not being most
> of x10FFFF by some definitions of most. (Although depending whether you
> view an unallocated unicode slot as a character, the numbers might be
> different)

If the Unicode scalar value is 0xFFFF or less, UTF-16 needs only two
bytes to encode it. If the value is greater than 0xFFFF, UTF-16
represents it using a "surrogate pair", which is four bytes in total.
Since most characters in common use have a value of 0xFFFF or less,
most characters will only require two bytes to encode.

UTF-16 can encode all characters in the 0 to 0x10FFFF range, and so can
UTF-8 and UTF-32. UCS-2, however, cannot encode characters above 0xFFFF.

Jason.

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
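
For anyone who wants to check the byte counts, here is a minimal Python sketch of the surrogate-pair arithmetic. The function names utf16_code_units and encoded_sizes are made up for this illustration and are not part of any library. The second function just asks Python's built-in codecs for the encoded length, so it can be used to cross-check the first.

```python
def utf16_code_units(code_point):
    """Return the 16-bit UTF-16 code units for one Unicode scalar value.

    Hypothetical helper for illustration only.
    """
    if not 0 <= code_point <= 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("not a Unicode scalar value")
    if code_point <= 0xFFFF:
        # Fits in a single 16-bit unit: two bytes.
        return [code_point]
    # Above 0xFFFF: split the 20-bit offset into a surrogate pair (four bytes).
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits
    return [high, low]


def encoded_sizes(code_point):
    """Byte length of one character in UTF-8, UTF-16 and UTF-32 (no BOM)."""
    ch = chr(code_point)
    return {
        "utf-8": len(ch.encode("utf-8")),
        "utf-16": len(ch.encode("utf-16-be")),
        "utf-32": len(ch.encode("utf-32-be")),
    }


if __name__ == "__main__":
    for cp in (0x41, 0x20AC, 0xFFFF, 0x10000, 0x1D11E):
        units = [hex(u) for u in utf16_code_units(cp)]
        print(hex(cp), units, encoded_sizes(cp))
```

The explicit "-be" codec names are used so that Python does not prepend a byte order mark, which would otherwise inflate the UTF-16 and UTF-32 counts.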