RE: [xsl] Recognized Unicode characters?

From: "Edward Bryant" <bryant_edward@xxxxxxxxxxx>
Date: Mon, 09 May 2005 08:08:47 -0500
Thanks for responding, but I think you guys lost me.
Here is the xslt header info I used:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html"/>


I set the output method to HTML because that is the output I am creating. (Isn't that right?)

As for the encoding, I have to admit I am confused. I picked UTF-8 mostly because the learning-XML books and websites generally recommend it (and because it is the default), but none that I have seen explain in any detail why, or why anyone might use something different. The special characters in my source XML file are all numeric character references to the Unicode numbers (&#___;, etc.).

As I understand it, shouldn't the XSLT processor know from the "encoding" attribute that the references are to Unicode numbers, and read them correctly as those characters? I also understand that the processor has some flexibility in how it outputs the text, but that it will often output special characters as entity references (e.g., the "&" symbol as "&amp;").

So I am still confused about why a character reference to #8212 won't output correctly. The output displays a square box both in the browser (IE6) and in the HTML source itself (viewed in Windows Notepad).

> > > Shouldn't that be <xsl:output encoding="US-ASCII"... for safety?
> >
> > Neither is completely safe of course,
>

The spec only requires support for UTF-8 and UTF-16; anything else is
optional.

I personally use "iso-646" as the name of this encoding. The differences are
immaterial (different names for some of the characters, I believe) but I
prefer international standards as a matter of principle.
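
For illustration, here is a minimal sketch of the output declaration suggested in the quoted text, dropped into a stylesheet like the one above. The template and its paragraph content are hypothetical, made up only to give the serializer an em dash to write. Since support for this encoding is optional, a given processor may reject the name or fall back to another encoding.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- encoding="US-ASCII" asks the serializer to write the result in ASCII,
       so a character outside that range (such as U+2014, the em dash) has to
       be written as a character or entity reference instead of raw bytes.
       Assumption: the processor in use supports this optional encoding. -->
  <xsl:output method="html" encoding="US-ASCII"/>

  <!-- Hypothetical template, for illustration only: emits one paragraph
       containing the character referenced as #8212 in the source. -->
  <xsl:template match="/">
    <html>
      <body>
        <p>before&#8212;after</p>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

With a processor that honours the declaration, the dash should appear in the result file as a reference (for example &#8212; or &mdash;) rather than as a multi-byte sequence, so how it displays no longer depends on the viewer guessing the file's encoding.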


Michael Kay
http://www.saxonica.com/
