Subject: Re: [xsl] International Characters in attributes
From: Mike Brown <mike@xxxxxxxx>
Date: Sun, 11 Feb 2001 17:21:45 -0700 (MST)

> But once you get into the areas of the
> BMP where utf-8 starts producing the "transformations" that the "t"
> stands for, with 3 or even 5-byte sequences, none of the browsers I've
> looked at will behave 100% properly (and some XML parsers and XSLT
> engines can hiccup as well).

I would love to see a summary of your test results in this area!

> That's partly why people still use encodings other than utf-8. And
> once you do, the same numeric character references will mean different
> things in different encodings, (there aren't named entities in html
> for the 20,000+ Chinese characters) and so show differently in the
> browser.

That really *shouldn't* be the case, although I believe some of the old
pre-1996 browsers did exhibit this behavior. A numeric character reference
is by definition, at least in XML and HTML, a reference to a code point in
the ISO/IEC 10646 coded character set. It should never change with the
encoding of the document containing the reference. e.g., &#166; means the
BROKEN BAR character, always, even though code point 166 in, say,
ISO 8859-2 means LATIN CAPITAL LETTER S WITH ACUTE.

   - Mike
____________________________________________________________________
Mike J. Brown, software engineer at webb.net in Denver, Colorado, USA
My XML/XSL resources: http://skew.org/xml/

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
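
The point about numeric character references can be checked with a short script. This is only a sketch, assuming Python 3 and its standard library; the one-element sample document is hypothetical and not from the thread. It compares the reference &#166; with the raw byte 0xA6 (decimal 166) read under two encodings, then parses a document declared as ISO 8859-2 that contains both.

```python
import html
import unicodedata
import xml.etree.ElementTree as ET

# The numeric character reference, resolved without any document encoding.
ref_char = html.unescape("&#166;")

# The raw byte 166 (0xA6), interpreted under two different encodings.
byte_latin1 = bytes([0xA6]).decode("iso8859-1")
byte_latin2 = bytes([0xA6]).decode("iso8859-2")

# Hypothetical sample document, declared as ISO 8859-2, holding the
# reference followed by the raw byte 0xA6.
doc = b'<?xml version="1.0" encoding="ISO-8859-2"?><a>&#166;\xa6</a>'
text = ET.fromstring(doc).text

samples = [
    ("reference &#166;", ref_char),
    ("byte 0xA6 as ISO 8859-1", byte_latin1),
    ("byte 0xA6 as ISO 8859-2", byte_latin2),
    ("reference inside ISO 8859-2 XML", text[0]),
    ("raw byte inside ISO 8859-2 XML", text[1]),
]
for label, ch in samples:
    print(f"{label}: U+{ord(ch):04X} {unicodedata.name(ch)}")
```

If the parser follows the rule described above, the reference should come out as BROKEN BAR in every case, while only the raw byte changes meaning with the declared encoding.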