RE: [xsl] encoding woes: ISO-8859-1 vs. UTF-8

Subject: RE: [xsl] encoding woes: ISO-8859-1 vs. UTF-8
From: "Michael Kay" <michael.h.kay@xxxxxxxxxxxx>
Date: Wed, 24 Jul 2002 09:05:31 +0100
> > ISO-8859-1 can only encode the characters in the
> > range 0-255.
> 
> That's what I thought as well.  How did Saxon
> convert those two control chars into the proper
> encoding for &#8220; and &#8221; even though the input
> XML was marked as encoded in ISO-8859-1?  I was fully 
> expecting the import to fail, but somehow it was successful.

I have no idea. This isn't done by Saxon; it's done by the XML parser.
If you were using the default parser (AElfred), I think that it actually
accepts bytes x80-x9F with encoding="iso-8859-1" and converts them into
characters x80-x9F.
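
A minimal sketch of that distinction, using Python's built-in codecs as a
stand-in (an assumption: this is not AElfred, and its behaviour is only
inferred from the description above). It decodes the bytes x93 and x94 once
as ISO-8859-1 and once as Windows-1252, the encoding in which those bytes
are the curly quotes &#8220; and &#8221;.

```python
# Bytes 0x93 and 0x94: C1 control characters in ISO-8859-1,
# curly quotation marks in Windows-1252 (cp1252).
raw = bytes([0x93, 0x94])

def code_points(data, encoding):
    """Decode data and return each character as a 'U+XXXX' string."""
    return ["U+{:04X}".format(ord(ch)) for ch in data.decode(encoding)]

# ISO-8859-1 maps every byte to the code point with the same number.
as_latin1 = code_points(raw, "iso-8859-1")

# Windows-1252 reassigns most of 0x80-0x9F to printable characters.
as_cp1252 = code_points(raw, "cp1252")

print(as_latin1)
print(as_cp1252)
```

Under ISO-8859-1 the two bytes stay at x93 and x94 as characters; only a
Windows-1252 reading turns them into characters 8220 and 8221.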

> 
> Good point.  For export output, I changed the encoding to
> UTF-8, and that seems to have resolved the problem; the
> export is now successful.  When I open the exported CSV in
> a hex editor, those two chars are shown as hex 93/94,
> respectively.
> 
Now I really am puzzled.
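
A short sketch of why hex 93/94 is surprising in a UTF-8 file, again using
Python's codecs as a stand-in for whatever serializer was involved. The
helper name utf8_hex is hypothetical. In UTF-8 neither the curly quotes nor
the control characters x93/x94 are written as a single byte, and a lone x93
byte is not valid UTF-8 at all.

```python
def utf8_hex(ch):
    """Return the UTF-8 bytes of one character as space-separated hex."""
    return " ".join("{:02x}".format(b) for b in ch.encode("utf-8"))

# The curly quotes (characters 8220 and 8221) take three bytes each.
left_quote = utf8_hex("\u201c")
right_quote = utf8_hex("\u201d")

# The C1 control characters x93 and x94 take two bytes each.
control_93 = utf8_hex("\u0093")
control_94 = utf8_hex("\u0094")

# A single 0x93 byte cannot be decoded as UTF-8.
try:
    bytes([0x93]).decode("utf-8")
    lone_byte_is_valid = True
except UnicodeDecodeError:
    lone_byte_is_valid = False

print(left_quote, "|", right_quote)
print(control_93, "|", control_94)
print(lone_byte_is_valid)
```

One possible reading, offered only as an assumption: if the hex editor showed
c2 93 and c2 94, the output contains the control characters rather than the
quotes; if it showed bare 93 and 94, the file is not actually UTF-8.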

Michael Kay
Software AG
home: Michael.H.Kay@xxxxxxxxxxxx
work: Michael.Kay@xxxxxxxxxxxxxx 


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

