RE: Special entity characters in Shift-JIS XSL.

Subject: RE: Special entity characters in Shift-JIS XSL.
From: Kay Michael <Michael.Kay@xxxxxxx>
Date: Fri, 17 Dec 1999 10:48:57 -0000
> SAXON kinda does it: it issues a message that US-ASCII encoding
> is not supported and threatens to switch to UTF-8, but still 
> prints all special characters as numeric entities. Thanks again Mike ;-).
> 
What SAXON actually does (and thanks to you Nikolai for your initial work in
this area) is as follows. There are two separate things driven by the
requested encoding.

(a) Saxon tries to create a Java writer using the requested encoding. If the
Java VM doesn't want to know, Saxon traps the exception (generally an
UnsupportedEncodingException, though there have been reports of a Java VM
that throws IllegalArgumentException instead) and tries again, this time
requesting UTF8, after putting a warning on System.err. (I haven't found any way of
discovering what set of encodings a Java VM will support. Can anyone help?)

(b) Saxon tries to load a subclass of com.icl.saxon.output.CharacterSet
corresponding to the named encoding. This is currently a fixed list of
character sets; it can only be changed by modifying and recompiling the
source of com.icl.saxon.output.CharacterSetFactory. The CharacterSet class
is used to decide which characters to pass to the Java writer for encoding,
and which characters to represent as numeric character references. If the
encoding is not one of those in the list, utf-8 is used. The character sets
currently on the list are ASCII, utf-8, iso-8859-1, and KOI8-R.
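
As a rough illustration of the decision in (b), here is a minimal sketch. It is
not Saxon's actual code: the class CharRefEscaper and its methods are
hypothetical names, each character set is modelled as a simple predicate, and
KOI8-R is left out because it would need a lookup table. It also ignores markup
escaping of characters such as < and &.

```java
import java.util.Locale;
import java.util.function.IntPredicate;

// Hypothetical sketch of step (b); none of these names come from Saxon.
final class CharRefEscaper {
    // Each "character set" answers one question: can this code point be
    // passed to the writer as-is?
    static final IntPredicate ASCII = c -> c < 0x80;
    static final IntPredicate ISO_8859_1 = c -> c < 0x100;
    static final IntPredicate UTF_8 = c -> true;

    // Fixed list of known encodings; anything else falls back to UTF-8.
    // KOI8-R is deliberately omitted from this sketch.
    static IntPredicate forEncoding(String name) {
        switch (name.toLowerCase(Locale.ROOT)) {
            case "ascii":
                return ASCII;
            case "iso-8859-1":
                return ISO_8859_1;
            default:
                return UTF_8;
        }
    }

    // Characters outside the set become decimal numeric character references.
    static String escape(String text, IntPredicate set) {
        StringBuilder sb = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (set.test(cp)) {
                sb.appendCodePoint(cp);
            } else {
                sb.append("&#").append(cp).append(';');
            }
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+00E9 is outside ASCII, so it is written as a character reference.
        System.out.println(escape("caf\u00e9", forEncoding("ASCII")));
    }
}
```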

So if you request encoding=ASCII Saxon will probably use a UTF-8 writer to
write the output, but will output all non-ASCII characters as character
references.
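
The writer fallback in (a) can be sketched in the same spirit. Again this is an
assumption-laden illustration rather than the Saxon source: FallbackWriter and
open() are hypothetical names, and the warning text is made up. The point it
shows is that once every non-ASCII character has been turned into a character
reference, the bytes written are the same whether the writer ends up being
ASCII or UTF-8.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UnsupportedEncodingException;
import java.io.Writer;

// Hypothetical sketch of step (a); not taken from Saxon.
final class FallbackWriter {
    // Try the requested encoding first; if the VM rejects it, warn on
    // System.err and retry with UTF8.
    static Writer open(OutputStream out, String requested)
            throws UnsupportedEncodingException {
        try {
            return new OutputStreamWriter(out, requested);
        } catch (UnsupportedEncodingException | IllegalArgumentException e) {
            System.err.println("Encoding " + requested
                    + " is not supported: using UTF8");
            return new OutputStreamWriter(out, "UTF8");
        }
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        // The text is assumed to have been escaped already, so it contains
        // only ASCII characters and encodes identically in ASCII and UTF-8.
        try (Writer w = open(buf, "x-no-such-encoding")) {
            w.write("caf&#233;");
        }
        System.out.println(buf.size() + " bytes written");
    }
}
```

On the question of discovering the supported encodings: Java releases from 1.4
onwards provide java.nio.charset.Charset.availableCharsets(), which was not
available when this was written.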

Mike


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

