Subject: Re: [xsl] encoding and XSL Transformation
From: Tony Graham <Tony.Graham@xxxxxxx>
Date: Tue, 10 Sep 2002 19:31:51 +0100

Chuck White wrote at 10 Sep 2002 07:19:37 -0700:
> Windows encodings within the range of 128-159 map out to a variety of
> control characters in Unicode, so your problem begins with your source
> document, not Xalan.

Don't automatically equate byte values with character numbers (i.e., code points).

Bytes in the range 128-159, when read as, say, ISO-8859-1, map to a variety of control characters.

Data in ISO-8859-1, when read as UTF-8, maps to a lot of junk, usually with a lot of illegal byte sequences.

UTF-8 data read as UTF-16 undoubtedly reads as a lot of junk too.

Data in a Windows code page, when read as that same Windows code page (in an XML context, when the encoding declaration specifies the right encoding), reads as a variety of characters whose Unicode code points do not have a 1:1 correspondence with the numeric values of the bytes used to represent the characters.

Regards,

Tony Graham
------------------------------------------------------------------------
XML Technology Center - Dublin            mailto:tony.graham@xxxxxxx
Sun Microsystems Ireland Ltd              Phone: +353 1 8199708
Hamilton House, East Point Business Park, Dublin 3        x(70)19708

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
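
The distinction between byte values and code points can be checked with a few lines of Python. This is only a minimal sketch: the sample bytes and strings are arbitrary choices for illustration, not data from the original poster's document, and "cp1252" stands in for "a Windows code page".

```python
# Three bytes in the 128-159 range (sample values chosen for illustration).
raw = bytes([0x80, 0x93, 0x94])

# Read as ISO-8859-1, each byte maps to the C1 control character with
# the same number, so code point == byte value.
as_latin1 = [ord(c) for c in raw.decode("iso-8859-1")]

# Read as Windows-1252, the same bytes are the euro sign and the curly
# double quotes, whose code points differ from the byte values.
as_cp1252 = [ord(c) for c in raw.decode("cp1252")]

print([hex(n) for n in as_latin1])
print([hex(n) for n in as_cp1252])

# ISO-8859-1 data read as UTF-8: 0xE9 (e-acute in ISO-8859-1) starts a
# multi-byte UTF-8 sequence that is never completed, so decoding fails.
latin1_bytes = "caf\u00e9".encode("iso-8859-1")
try:
    latin1_bytes.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False
print("ISO-8859-1 bytes valid as UTF-8:", utf8_ok)

# UTF-8 data read as UTF-16: pairs of bytes are fused into unrelated
# characters, so no error is raised but the text is junk.
junk = "abcd".encode("utf-8").decode("utf-16-le")
print(len(junk), [hex(ord(c)) for c in junk])
```

Only the Windows-1252 reading gives printable characters for those three bytes, and none of the resulting code points equals the byte that encoded it.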