Re: [xsl] International Characters in attributes

Subject: Re: [xsl] International Characters in attributes
From: Mike Brown <mike@xxxxxxxx>
Date: Sun, 11 Feb 2001 17:21:45 -0700 (MST)
> But once you get into the areas of the
> BMP where utf-8 starts producing the "transformations" that the "t"
> stands for, with 3 or even 5-byte sequences, none of the browsers I've
> looked at will behave 100% properly (and some XML parsers and XSLT
> engines can hiccup as well).
I would love to see a summary of your test results in this area!
> That's partly why people still use encodings other than utf-8. And
> once you do, the same numeric character references will mean different
> things in different encodings, (there aren't named entities in html
> for the 20,000+ Chinese characters) and so show differently in the
> browser.
That really *shouldn't* be the case, although I believe some of the old
pre-1996 browsers did exhibit this behavior. A numeric character reference
is by definition, at least in XML and HTML, a reference to a code point in
the ISO/IEC 10646 coded character set. It should never change with the
encoding of the document containing the reference. For example, &#166; means
the BROKEN BAR character, always, even though code point 166 in, say, ISO
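
The &#166; example can be checked with a short Python sketch. It is only an illustration under stated assumptions: the helper names resolve_reference and decode_byte are made up for this message, Python's html.unescape stands in for what a conforming XML or HTML parser does with a numeric character reference, and ISO-8859-5 is an arbitrary pick of a legacy encoding in which byte 166 is not BROKEN BAR.

```python
import html
import unicodedata


def resolve_reference(ref: str) -> str:
    """Expand a character reference such as '&#166;' to the character it names.

    The number is read as an ISO/IEC 10646 (Unicode) code point; the encoding
    of the surrounding document plays no part in the lookup.
    """
    return html.unescape(ref)


def decode_byte(value: int, encoding: str) -> str:
    """Interpret one raw byte under a legacy encoding (hypothetical helper)."""
    return bytes([value]).decode(encoding)


if __name__ == "__main__":
    ch = resolve_reference("&#166;")
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))

    # The same character serializes to different bytes in different encodings,
    # but the reference itself still names the same code point.
    for enc in ("utf-8", "iso-8859-1", "utf-16-be"):
        print(enc, ch.encode(enc).hex())

    # A raw byte with the value 166 is a different matter: its meaning
    # depends entirely on the encoding used to read it.
    other = decode_byte(166, "iso-8859-5")
    print(f"U+{ord(other):04X}", unicodedata.name(other))
```

So the reference and the raw byte are separate things: only the byte depends on the encoding. A browser that reads &#166; through the document's encoding table is treating the reference as if it were the byte.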

   - Mike
Mike J. Brown, software engineer in Denver, Colorado, USA