Re: Entity Reference Question

Subject: Re: Entity Reference Question
From: Mike Brown <mike@xxxxxxxx>
Date: Fri, 27 Oct 2000 12:20:56 -0600 (MDT)

Lee Goddard wrote:
> XML is by default UTF-8, must support UTF-16,

Clarification:

UTF-8 will be assumed in the absence of both an encoding declaration and a
byte order mark at the beginning of the document. It is possible to omit
the encoding declaration and still have the document be UTF-16, as long as
it begins with a byte order mark.
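
To make the rule concrete, here is a minimal Python sketch of how a reader
might pick an encoding for a document that has no encoding declaration. The
function name guess_xml_encoding is made up for illustration, and the sketch
deliberately ignores UTF-32 and the rest of the autodetection a real parser
would do.

```python
import codecs


def guess_xml_encoding(data: bytes) -> str:
    """Guess the encoding of an XML document with no encoding declaration.

    Hypothetical helper: it only looks at the byte order mark and does not
    try to handle UTF-32 or any other encoding.
    """
    # A UTF-16 document announces itself with a BOM in either byte order.
    if data.startswith(codecs.BOM_UTF16_BE) or data.startswith(codecs.BOM_UTF16_LE):
        return "utf-16"
    # UTF-8 may carry an optional BOM of its own; "utf-8-sig" strips it.
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    # No BOM and no declaration: assume UTF-8.
    return "utf-8"


doc = "<a>\u00e9</a>"
as_utf16 = doc.encode("utf-16")   # Python writes a BOM first
as_utf8 = doc.encode("utf-8")     # no BOM

round_trip_16 = as_utf16.decode(guess_xml_encoding(as_utf16))
round_trip_8 = as_utf8.decode(guess_xml_encoding(as_utf8))
```

Both round trips should give back the original string, even though neither
byte sequence says what its encoding is.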

Re: how to know what the relationship is between characters,
their numbers, and entities...

XSL is XML, and XML, like HTML, uses the ISO/IEC 10646-1:1993 'Universal
Character Set' (UCS). The I in ISO stands for Ivory Tower, so you can't
actually look at the standard online. Instead, you have to look at the
parallel Unicode Standard, the character code charts of which are
available for free at the unicode.org site. 

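Numeric character references like &#1234; in XML and HTML refer to UCS
scalar values, which for the x0000-xFFFF range are the same as the Unicode
values you'll see in the charts. Note that &#1234; is decimal, while the
charts are in hexadecimal; the hexadecimal form of the same reference is
&#x4D2;.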
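
If you want to check the relationship between a reference, its number, and
the character, a quick Python sketch like the following may help. It uses
only the standard library; the variable names are just for illustration.

```python
import html

# A decimal and a hexadecimal reference to the same scalar value.
decimal_ref = "&#1234;"
hex_ref = "&#x4D2;"

# The character whose scalar value is 1234 (decimal).
ch = chr(1234)

# The same value written the way the Unicode charts label it.
code_point_label = f"U+{ord(ch):04X}"

# html.unescape resolves numeric character references in a string.
decoded_decimal = html.unescape(decimal_ref)
decoded_hex = html.unescape(hex_ref)

print(code_point_label, decoded_decimal == decoded_hex == ch)
```

The label it prints is the number to look up in the code charts.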

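HTML goes a step further and predefines a standard set of named SGML
entities, such as &eacute; and &nbsp;, for a number of special characters.
This is covered in the HTML specs and DTDs. XML itself predefines only
&lt;, &gt;, &amp;, &apos; and &quot;, so any others must be declared in
your DTD. A handy set of declarations you can use can be found at
http://www.oasis-open.org/cover/xml-ISOents.txt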
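
As a rough illustration of what such declarations boil down to, Python's
standard library carries a table of the HTML 4 entity names and their code
points. The sketch below builds a declaration from that table. The helper
entity_declaration is hypothetical, and its output is not claimed to match
the formatting of the file above.

```python
from html.entities import name2codepoint


def entity_declaration(name: str) -> str:
    """Build an internal entity declaration for an HTML 4 entity name.

    Hypothetical helper for illustration. It does not special-case amp and
    lt, which need a doubly escaped replacement text in a real DTD.
    """
    return f'<!ENTITY {name} "&#{name2codepoint[name]};">'


for name in ("eacute", "nbsp", "copy"):
    print(entity_declaration(name))
```

Each entity name simply stands for a numeric character reference to one
scalar value.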

   - Mike
____________________________________________________________________
Mike J. Brown, software engineer at         My XML/XSL resources:
webb.net in Denver, Colorado, USA           http://www.skew.org/xml/


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

