RE: [xsl] SAXON and UTF-8

Subject: RE: [xsl] SAXON and UTF-8
From: Tony Graham <Tony.Graham@xxxxxxxxxxxxxxx>
Date: Fri, 28 Sep 2001 10:20:52 +0100
Michael Kay wrote at 28 Sep 2001 09:25:29 +0100:
 > When I need to check what the XML spec says, I usually turn to Bob
 > DuCharme's book. Unfortunately this means I sometimes miss things that
 > changed in the second edition.

It's always been possible to use &#xFEFF; with UTF-8 in XML.  It just
wasn't mentioned in the first edition of the XML Recommendation (and
the second edition still isn't all that explicit about it).

ISO/IEC 10646-1:1993 has "always" supported use of ZERO WIDTH NO-BREAK
SPACE (&#xFEFF;) as an encoding signature for UTF-8 (where "always"
probably means "since UTF-8 was added to ISO/IEC 10646-1:1993 as
Amendment 2 some time before there was a Unicode 2.0").
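
To make the signature concrete: &#xFEFF; encoded as UTF-8 is the three
bytes EF BB BF at the very start of the byte stream.  A minimal Python
sketch of checking for it follows; the helper name
strip_utf8_signature is hypothetical and only for illustration, not
something SAXON or any XML parser exposes.

```python
import codecs
from typing import Tuple


def strip_utf8_signature(data: bytes) -> Tuple[bool, bytes]:
    """Return (had_signature, remaining_bytes) for a raw byte stream.

    Hypothetical helper: it only looks for the UTF-8 form of U+FEFF
    (EF BB BF) at the start of the data and removes it if present.
    """
    if data.startswith(codecs.BOM_UTF8):
        return True, data[len(codecs.BOM_UTF8):]
    return False, data


# U+FEFF encoded as UTF-8 is exactly the signature bytes.
signature = "\ufeff".encode("utf-8")

had_sig, rest = strip_utf8_signature(signature + b"<doc/>")
```

Python's built-in "utf-8-sig" codec does much the same thing when
decoding, so in practice the standard library could be used instead of
a hand-written check.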

The Unicode side of the Unicode==ISO/IEC 10646 equation was ambivalent
(at best) about &#xFEFF; as an encoding signature for UTF-8 for quite
a long time after ISO/IEC 10646 blessed the idea, but the signature is
now listed as such in Section 13.6, Specials, of the Unicode Standard,
Version 3.0.

Regards,


Tony Graham
------------------------------------------------------------------------
XML Technology Center - Dublin        mailto:tony.graham@xxxxxxxxxxxxxxx
Sun Microsystems Ireland Ltd                       Phone: +353 1 8199708
Hamilton House, East Point Business Park, Dublin 3            x(70)19708

 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

