Subject: Re: [xsl] How to read the encoding of an XML document
From: David Carlisle <davidc@xxxxxxxxx>
Date: Thu, 25 Oct 2001 16:53:54 +0100
> When you say Unicode, does that equate to UTF-8, UTF-16, UTF-32 or
> something else?

No. Unicode is essentially an abstract collection of characters, numbered 0 to x10FFFF (most of which slots are empty). An XML notation of &#333; refers to that abstract character number 333.

However, to store Unicode strings in files (and other places) you need some encoding that maps bytes in the file to these characters. The UTF-x encodings are some of those encodings (all UTF encodings have the property that they can encode the whole Unicode range). Other encodings such as ASCII or Latin-1 are similar, but can't encode the whole range of characters.

> Or does the answer depend upon the XML parser you are
> using, which in my case is MSXML3.0?

No. Internally the parser obviously has to use some encoding to store things (often this is UTF-16, and it is in the case of MSXML). In some programming APIs you need to know this, as you get handed the string, but in XSLT you never need to know what happens internally.

Your XSLT stylesheet is an XML document, so it goes through the same process. Character data in the stylesheet is mapped to abstract Unicode characters (using the encoding specified in the stylesheet) and the same happens for the source document. It is these abstract characters that are compared. So by then you don't need to know (and can't find out) what encoding the original files contained.

So your source might be in Latin-2 and your stylesheet might be in Latin-1, but by the time they have both been parsed everything is in abstract Unicode characters, and it is these that are compared in any XSLT query. (In fact MSXML3 uses UTF-16, but this is an internal detail that has no effect on the stylesheet.)

David
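As a rough illustration of the point above, here is a minimal Python sketch (not XSLT, and not anything MSXML-specific) using the standard library's xml.etree.ElementTree parser. The two documents, the element names and the sample word are hypothetical, chosen only so that one file is stored as Latin-2 and the other as Latin-1. After parsing, both yield the same abstract characters, and the original encodings are no longer visible.

```python
import xml.etree.ElementTree as ET

# Hypothetical "source" document stored as Latin-2 (ISO-8859-2) bytes.
# U+0151 (decimal 337) is a single byte, 0xF5, in that encoding.
source_bytes = (
    '<?xml version="1.0" encoding="iso-8859-2"?>'
    '<name>Erd\u0151s</name>'
).encode("iso-8859-2")

# Hypothetical "stylesheet" fragment stored as Latin-1 (ISO-8859-1) bytes.
# Latin-1 cannot encode U+0151, so it is written as a character reference.
sheet_bytes = (
    '<?xml version="1.0" encoding="iso-8859-1"?>'
    '<match>Erd&#337;s</match>'
).encode("iso-8859-1")

# The parser maps each file's bytes to abstract Unicode characters.
source_text = ET.fromstring(source_bytes).text
sheet_text = ET.fromstring(sheet_bytes).text

# The comparison is made on characters, not on the bytes of either file.
same_characters = source_text == sheet_text
code_points = [ord(ch) for ch in source_text]

# The same characters occupy different numbers of bytes per encoding.
latin2_len = len(source_text.encode("iso-8859-2"))
utf8_len = len(source_text.encode("utf-8"))
utf16_len = len(source_text.encode("utf-16-le"))

# A numeric character reference names the code point directly.
ref_char = ET.fromstring("<c>&#333;</c>").text

print(same_characters, code_points)
print(latin2_len, utf8_len, utf16_len)
print(ord(ref_char))
```

Nothing in the parsed strings records which encoding each file used, which is why a stylesheet has no way to ask for it.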
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list