Re: [xsl] Unicode usage

Subject: Re: [xsl] Unicode usage
From: "Thomas B. Passin" <tpassin@xxxxxxxxxxxx>
Date: Fri, 25 Jan 2002 11:21:11 -0500
[Julian Reschke]

> It would depend on the User Agent, not the platform. If this is actually
> true for any "recent" version of IE (let's say, since 4.0), I'd like to
> see some evidence before I believe it :-)

I just did an experiment that verified what each of us said. I created an
XML file on my Windows 2000 machine with a &#174; in it. I transformed it
with an identity transform twice, first with encoding='utf-8' and second with
encoding='iso-8859-1'.

Looking at the hex bytes, the iso results contained a single hex AE byte,
which is correct for character 174. The utf-8 results contained the two
bytes C2 AE, which is the right utf-8 encoding of that character. Both
results displayed the registered trademark symbol, the one with the r in a
circle.
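
If you don't have a hex viewer handy, the expected bytes can be checked
with a few lines of Python. This is only a sketch of the encoding step, not
of the transform itself; the names REGISTERED and as_hex are mine.

```python
# Character 174 (U+00AE, the registered sign) encoded two ways.
REGISTERED = "\u00ae"


def as_hex(data: bytes) -> str:
    """Format bytes as space-separated upper-case hex pairs."""
    return " ".join(f"{b:02X}" for b in data)


latin1_bytes = REGISTERED.encode("iso-8859-1")  # one byte per character
utf8_bytes = REGISTERED.encode("utf-8")         # two bytes for this character

print("iso-8859-1:", as_hex(latin1_bytes))  # iso-8859-1: AE
print("utf-8:     ", as_hex(utf8_bytes))    # utf-8:      C2 AE
```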

I copied the results to a floppy and took it over to my Win95/SP2 computer,
then displayed the results in IE 5.5.  Both files displayed the same,
showing the right symbol.  This is what you said would happen.

I also loaded each result into Notepad on Win95. Notepad displayed the iso
file correctly, but not the utf-8 result: it showed that "A" character with
a little circle above it ahead of the trademark symbol. This is what I
was suggesting would happen. BTW, Notepad on the Win2000 computer did
display both results correctly.
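
The stray character is what you would expect if the viewer reads the two
utf-8 bytes as two single-byte characters. Here is a small Python sketch of
that misreading. I am assuming the viewer uses the Windows-1252 code page
(cp1252 in Python); I have not confirmed which code page Win95 Notepad
actually applied.

```python
import unicodedata

# The two bytes the utf-8 output contains for character 174.
utf8_bytes = "\u00ae".encode("utf-8")

# Assumption: the viewer treats every byte as one Windows-1252 character.
misread = utf8_bytes.decode("cp1252")

# Print code points and names rather than the glyphs themselves, so the
# output does not depend on the console encoding.
for ch in misread:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))
```

Under that assumption the C2 byte comes out as an accented capital A and the
AE byte still comes out as the registered sign, so the symbol looks right
but has an extra character in front of it.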

Summarizing, what you will see displayed for high-order characters can
depend on the encoding, the OS, and the viewing program. On older versions
of Windows, at least, non-browsers are likely to display utf-8 output wrong.

In fact, even on my Win2000 machine, using XML Cooktop to run and display
the transformation gave an incorrect display (and it uses the IE ActiveX
control to display the results!), so you can't be sure even on Win2000 that
high-order characters will display the intended way, depending on the app.

Try it yourself on your system.  Here are the files:

-----------------------------------------------------------------------
<?xml version='1.0' encoding='utf-8'?>
<data>Here is a ==&#174;== character</data>

-----------------------------------------------------------------------
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output encoding='utf-8'/><!--Or change to iso-8859-1-->

<!-- Identity transformation template -->
<xsl:template match='*|@*'>
 <xsl:copy>
  <xsl:apply-templates select="@*|node()"/>
 </xsl:copy>
</xsl:template>

</xsl:stylesheet>
-----------------------------------------------------------------------
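
If you have no XSLT processor installed, a rough stand-in for the experiment
is to parse the data file and re-serialize it in each encoding with Python's
standard library. This does not run the stylesheet; it only assumes that
re-serializing the parsed document gives the same bytes for the character as
the identity transform would. SOURCE and serialize are my own names.

```python
import xml.etree.ElementTree as ET

# The data file from above, held in memory instead of on disk.
SOURCE = (
    b"<?xml version='1.0' encoding='utf-8'?>\n"
    b"<data>Here is a ==&#174;== character</data>\n"
)


def serialize(encoding: str) -> bytes:
    """Parse the sample document and write it back out in `encoding`."""
    root = ET.fromstring(SOURCE)
    return ET.tostring(root, encoding=encoding)


utf8_out = serialize("utf-8")
latin1_out = serialize("iso-8859-1")

# Report whether each output holds the two-byte or the one-byte form.
print("utf-8 output has C2 AE:     ", b"\xc2\xae" in utf8_out)
print("iso-8859-1 output has AE:   ", b"\xae" in latin1_out)
print("iso-8859-1 output has C2:   ", b"\xc2" in latin1_out)
```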
Cheers,

Tom P


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

