Re: [xsl] Output: XML to XML scrambling unicode characters

Subject: Re: [xsl] Output: XML to XML scrambling unicode characters
From: David Carlisle <davidc@xxxxxxxxx>
Date: Mon, 4 Mar 2002 18:26:11 GMT
> For example: I know that the "&eacute;"
> in utf-8 should be "&#233;"

No, that isn't the utf-8 encoding for e acute; it is an XML numeric
character reference. The reference itself is written in ascii (and so
is the same in utf8 or latin1). However, to write an eacute in utf8
takes two bytes (0xC3 0xA9, which show up as "Ã©" when viewed as
latin1), just as to write it in latin1 takes one byte (0xE9), é.
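
To make the byte counts concrete, here is a small Python sketch (Python
is used only for illustration and is not part of the original thread;
the variable names are my own). It encodes e acute both ways and shows
that the character reference is plain ascii:

```python
# Assumption: the character under discussion is U+00E9 (e acute).
e_acute = "\u00e9"

utf8_bytes = e_acute.encode("utf-8")      # two bytes: c3 a9
latin1_bytes = e_acute.encode("latin-1")  # one byte: e9

# The numeric character reference is six ascii characters, so its bytes
# are identical whether the file is labelled ascii, latin1 or utf8.
char_ref = "&#233;"
ref_bytes = char_ref.encode("ascii")

# Reading the two utf8 bytes as if they were latin1 gives two characters
# (A tilde, copyright sign), the usual sign of a mislabelled encoding.
misread = utf8_bytes.decode("latin-1")

print(utf8_bytes.hex(), latin1_bytes.hex(), ref_bytes, ascii(misread))
```

The number 233 in the reference is just the decimal code point of the
character, independent of whichever encoding the file uses.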

> When I
> changed my encoding to iso-8859-1 as you suggested, the letter
> appears in my editor as a single character é.

Yes, typically an XSL system will output characters that are in your
specified encoding as character data, and output any other characters as
XML character references. Note that in the text output method it cannot
do this, so it can only generate an error.

If you use utf8 then _every_ character is available as character data
and the system need never use the &# notation. (That's what the U is
for in UTF)
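
The behaviour described above can be imitated with Python's codec error
handlers. This is only a rough analogy, not how any particular XSLT
processor is implemented: serialize() is a hypothetical helper of my
own, and it does none of the other escaping (such as & or <) that a
real XML serializer performs.

```python
def serialize(text: str, encoding: str, method: str = "xml") -> bytes:
    """Hypothetical helper imitating the output behaviour described above."""
    if method == "xml":
        # Characters missing from the encoding become &#NNN; references.
        return text.encode(encoding, errors="xmlcharrefreplace")
    # The text method has no reference syntax, so encoding has to fail.
    return text.encode(encoding, errors="strict")


# e acute exists in latin1; the euro sign (U+20AC) does not.
sample = "caf\u00e9 \u20ac5"

as_latin1 = serialize(sample, "iso-8859-1")  # euro becomes &#8364;
as_utf8 = serialize(sample, "utf-8")         # no references needed

try:
    serialize(sample, "iso-8859-1", method="text")
    text_error = None
except UnicodeEncodeError as exc:
    text_error = exc

print(as_latin1)
print(as_utf8)
print(type(text_error).__name__)
```

With latin1 only the euro sign is turned into a reference, with utf8
nothing is, and the text-method call raises an error because there is
no way to represent the euro sign.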

David


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

