Subject: Re: [xsl] 8bit ascii encoding
From: David Carlisle <davidc@xxxxxxxxx>
Date: Fri, 23 Aug 2002 12:33:04 +0100

> Yeah... anywhere nice?

I would say that it was suitably far from computers, but it seems that
even 3000m up a Swiss mountain you still expect to find an internet cafe
these days (I resisted the urge to log in and answer any xsl-list
messages though :-)

> ha.. nice. After some testing it seems that char references display
> fine, while characters themselves do not

Well, presumably they would if you wrote the characters in the right
encoding. At a guess, it sounds like you are writing bytes that
correspond to iso-8859-1 characters into a utf-8 encoded stream. If so
you'll get the wrong characters (or more often an error) except for that
part of utf-8 that happens to use one byte per character (the ASCII
range).

> I think the reason IE isn't picking up that each char is two
> bytes (utf-8)

If each char (in Unicode 2) is in 2 bytes you are using utf-16, not
utf-8. (Unicode 3 requires more than 2 bytes for some characters even in
utf-16, the so-called surrogate pairs.) utf-8 requires 1 to 4 bytes for
any Unicode character, depending on the character.

> So I guess I have two options...
>
> 1. persevere trying to get IE to treat the output as two byte chars

I think your problem is using the phrase "two byte chars", which leads
to confusion. Characters have a Unicode number but do not correspond
directly to any number of bytes. Different encodings map subsets of the
Unicode character set into particular byte combinations.

> 2. pass through all char refs to the output un-escaped, and let IE
> escape them...

All character references are replaced by the referenced character by an
XML parser, so there is no way to "pass through" references unchanged.
The XSLT system cannot tell whether a reference or a character was in
the original data.

> Is this the best option?

It is still not clear what you are trying to do, but there should be no
real reason why your C part cannot handle whatever encoding is coming
out of the XSLT. It isn't clear from your description whether this is
utf-8 or utf-16.
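
To make the byte counts concrete, here is a small sketch in Python (my choice for illustration only; nothing in your setup is assumed to use it). The sample characters and the helper name byte_lengths are made up for the example. It shows how many bytes the same character takes in utf-8 and in utf-16, and what can happen when iso-8859-1 bytes are read as if they were utf-8.

```python
# Illustration only: one character, different byte sequences per encoding.
# The sample characters below are arbitrary picks, not taken from your data.
samples = ["A", "\u00e9", "\u20ac", "\U00010400"]


def byte_lengths(ch):
    """Return (utf-8 length, utf-16 length) in bytes for one character."""
    return len(ch.encode("utf-8")), len(ch.encode("utf-16-be"))


for ch in samples:
    u8, u16 = byte_lengths(ch)
    print(f"U+{ord(ch):04X}: utf-8 {u8} byte(s), utf-16 {u16} byte(s)")

# In the ASCII range, iso-8859-1 and utf-8 produce identical bytes.
ascii_same = "abc".encode("iso-8859-1") == "abc".encode("utf-8")

# Outside that range, iso-8859-1 bytes read as utf-8 often fail outright:
# 0xE9 announces a multi-byte sequence that never arrives.
latin1_bytes = "caf\u00e9".encode("iso-8859-1")
try:
    latin1_bytes.decode("utf-8")
    mismatch_detected = False
except UnicodeDecodeError:
    mismatch_detected = True

# Occasionally the bytes happen to form valid utf-8, and you silently get
# a different character: two iso-8859-1 characters collapse into one.
misread = "\u00c3\u00a9".encode("iso-8859-1").decode("utf-8")
```

The point of the last two cases is that the bytes themselves carry no label: the same bytes are an error, or a different character, depending on which encoding the reader assumes.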

You may find it easier if you specify encoding="iso-8859-1" and use
latin-1 in the C part.

David

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
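
P.S. On the character reference point, a rough sketch, again in Python, using the standard library's xml.etree.ElementTree as a stand-in for the XML parser and serializer (an assumption for illustration; your XSLT processor will differ in detail, and the sample document is made up). It shows that a reference and the literal character are indistinguishable after parsing, and that it is the output encoding that decides whether a reference is written.

```python
import xml.etree.ElementTree as ET

# The same text, once with a character reference and once with the
# literal character (U+00E9). The <p> element is just a made-up sample.
with_reference = ET.fromstring("<p>caf&#233;</p>")
with_character = ET.fromstring("<p>caf\u00e9</p>")

# After parsing, nothing records which form the source used.
same_text = with_reference.text == with_character.text

# Serialising as iso-8859-1: the character fits, so it is written as the
# single byte 0xE9.
latin1_output = ET.tostring(with_reference, encoding="iso-8859-1")

# Serialising as us-ascii: the character does not fit, so the serializer
# falls back to a character reference on its own.
ascii_output = ET.tostring(with_character, encoding="us-ascii")

print(same_text)
print(latin1_output)
print(ascii_output)
```

Note that the reference in the us-ascii output comes from the literal-character input, so what appears in the result depends only on the output encoding, not on what was typed in the source.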