Re: [xsl] Problems with entities

cknell@xxxxxxxxxx wrote:

If you are using the UTF-8 encoding, for example, the ó character is represented by ó

Actually, the encoding doesn't matter--what matters is the character set, which is always Unicode for XML.

That is, the character ó (Latin small letter o with acute) is that character in the Unicode character set regardless of the encoding.

Characters are an abstraction. A character set is nothing more than an arbitrary mapping of abstract characters to unique numbers by which those characters can be referenced. In Unicode, each character also has a unique name that can be used instead of the character code to refer to the character (although not all processors know how to resolve these names).
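To make the mapping concrete, here is a minimal Python sketch (Python is my choice for illustration, not something from the original question) that looks up the number and the name Unicode assigns to this character. It assumes the standard-library unicodedata module, which does resolve character names.

```python
import unicodedata

ch = "\u00f3"  # the character ó, written by its code point

# The character set maps the abstract character to a number...
code = ord(ch)

# ...and Unicode also gives it a unique name.
name = unicodedata.name(ch)

# The name can be used instead of the number to get the same character.
by_name = unicodedata.lookup("LATIN SMALL LETTER O WITH ACUTE")

print(code, hex(code), name, by_name == ch)
```

None of this involves an encoding: the number and the name belong to the character set, not to any byte representation.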

The encoding simply determines how the characters are written to disk as sequences of bytes. For example, in UTF-8 this character is written as two bytes, 0xC3 0xB3, because its code (243, or U+00F3) is above 127, the last code UTF-8 writes as a single byte; codes from 128 through 2047 take two bytes, and higher codes take three or four. In UTF-16 it is also written as two bytes, but different ones (0x00 0xF3 in big-endian order), because UTF-16 uses one two-byte unit for each character in the Unicode Basic Multilingual Plane, the first 65,536 code points. In both cases the character (the abstraction) is the same: lowercase o with acute.
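The byte sequences are easy to inspect directly. A small Python sketch, using only the built-in str.encode and bytes.decode (the "utf-16-be" codec name selects big-endian order with no byte-order mark, which is my choice to keep the output short):

```python
ch = "\u00f3"  # ó

# Same abstract character, two different byte sequences on disk.
utf8_bytes = ch.encode("utf-8")
utf16_bytes = ch.encode("utf-16-be")

print(utf8_bytes.hex(" "))   # bytes written for UTF-8
print(utf16_bytes.hex(" "))  # bytes written for UTF-16 (big-endian)

# Decoding each sequence with its own encoding gives back the same character.
round_trip_ok = (
    utf8_bytes.decode("utf-8") == ch
    and utf16_bytes.decode("utf-16-be") == ch
)
print(round_trip_ok)
```

On a conforming implementation the UTF-8 form is the two bytes c3 b3 and the UTF-16 form is 00 f3.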

To read an XML file, the XML processor must first read the sequence of bytes on disk and then interpret that byte sequence as a sequence of characters. Therefore, it must know the encoding because the same sequence of bytes may result in different characters (or be invalid) depending on the encoding it is interpreted as.
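As a rough illustration of why the processor has to know the encoding, this Python sketch interprets the same two bytes under two different encodings, and then shows a byte that is a complete character in ISO-8859-1 but invalid on its own in UTF-8. The variable names are mine and purely illustrative.

```python
raw = b"\xc3\xb3"  # two bytes as they might sit on disk

# Interpreted as UTF-8, the two bytes form one character.
as_utf8 = raw.decode("utf-8")

# Interpreted as ISO-8859-1, each byte is its own character.
as_latin1 = raw.decode("iso-8859-1")

print(len(as_utf8), len(as_latin1))

# The single byte 0xF3 is ó in ISO-8859-1 but an incomplete sequence in UTF-8.
single = b"\xf3"
latin1_char = single.decode("iso-8859-1")
try:
    single.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

print(latin1_char == "\u00f3", valid_utf8)
```

This is the usual source of "Ã³" showing up in output where "ó" was expected: UTF-8 bytes read back as if they were ISO-8859-1.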

Cheers,

Eliot
--
W. Eliot Kimber
Innodata Isogen
eliot@xxxxxxxxxx
www.isogen.com

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list

Current Thread
RE: [xsl] Problems with entities
  cknell - Wed, 19 Nov 2003 15:11:15 -0500
  W. Eliot Kimber - Wed, 19 Nov 2003 15:12:23 -0600 <=
  <Possible follow-ups>
  cknell - Wed, 19 Nov 2003 15:38:54 -0500
  Jaime A Stuardo Bahamondes - Wed, 19 Nov 2003 16:49:07 -0400
  Jaime A Stuardo Bahamondes - Wed, 19 Nov 2003 17:22:27 -0400
  Jaime A Stuardo Bahamondes - Wed, 19 Nov 2003 17:45:12 -0400
