Subject: Re: [xsl] International Characters in attributes
From: David Carlisle <davidc@xxxxxxxxx>
Date: Mon, 12 Feb 2001 10:47:35 GMT
> That's partly why people still use encodings other than utf-8. And
> once you do, the same numeric character references will mean different
> things in different encodings,

No, in XML and in HTML (4+) a numeric character reference always refers to the Unicode position. It does not refer to the position in the current encoding. This is why I say that if your markup only uses ASCII characters (as in XHTML) then you can encode your file in any encoding (e.g. us-ascii) without loss of information, as all characters are accessible via numeric references.

The rules for XML are rather different from the rules for HTML (and the rules for HTML changed over time as it migrated from Latin-1 to Unicode as its character repertoire). But for an XML system at least, it is clear that any system that claims to be XML conforming has to accept UTF-8.

(It isn't only Asian languages that you mention that use "long" UTF-8 byte sequences; Unicode 3.1 promises to add around a thousand mathematical alphanumeric symbols into plane 1, and these will be used by MathML systems.)

David

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
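
To illustrate the point about numeric character references, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree parser. The sample document, the BODY constant, and the helper name parse_with_declared_encoding are made up for this illustration and are not from the original thread. The same ASCII-only markup is parsed under several declared encodings, and the references should resolve to the same Unicode positions each time, whereas a raw byte is interpreted according to the encoding.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup: only ASCII characters, with two numeric
# character references. &#233; is U+00E9; &#x1D400; is U+1D400, a
# mathematical alphanumeric symbol in plane 1.
BODY = "<p>&#233;&#x1D400;</p>"


def parse_with_declared_encoding(encoding: str) -> str:
    """Parse BODY under the given encoding declaration and return its text."""
    doc = f'<?xml version="1.0" encoding="{encoding}"?>{BODY}'
    # The document is pure ASCII, so the same bytes are valid in each
    # of the encodings tried below.
    return ET.fromstring(doc.encode("ascii")).text


results = {
    enc: parse_with_declared_encoding(enc)
    for enc in ("us-ascii", "iso-8859-1", "iso-8859-5", "utf-8")
}

# The references name Unicode positions, independent of the declaration.
for enc, text in results.items():
    print(enc, [f"U+{ord(ch):04X}" for ch in text])

# By contrast, a raw byte 0xE9 does depend on the encoding in use.
latin1_char = bytes([0xE9]).decode("iso-8859-1")
cyrillic_char = bytes([0xE9]).decode("iso-8859-5")
print(f"U+{ord(latin1_char):04X}", f"U+{ord(cyrillic_char):04X}")

# A plane 1 character needs a "long" (four-byte) UTF-8 sequence.
print(len("\U0001D400".encode("utf-8")))
```

The choice of iso-8859-5 is only to have an encoding in which byte 0xE9 is not the same character as in Latin-1; any other single-byte encoding with that property would serve equally well.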