RE: [xsl] Tranformation failed with Saxon for "Illegal HTML]

Subject: RE: [xsl] Tranformation failed with Saxon for "Illegal HTML]
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Fri, 28 Jul 2006 22:54:49 +0100
> In My XML file unfortunately come characters like the one in
> the tag below:
>
> <ns0:ContactNumberValue>Capitale sociale Cb, 4.005.358.876
> i.v.</ns0:ContactNumberValue>
>

Unfortunately this doesn't tell us what's in your XML file, it only tells us
what the characters look like after being copied-and-pasted into your email
application, sent across the wires, and then displayed in our own email
applications.

What we really need to know is (a) what the XML declaration at the start of
your XML file says, and (b) what are the hex values of the bytes (octets)
making up this element in the original file.
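
If it helps, here is a minimal Python sketch of one way to collect both pieces of information straight from the raw bytes, without any decoding step getting in the way. The function name describe_element is hypothetical, and the simple pattern match assumes the element has no nested child elements; it is an illustration, not a robust XML parser.

```python
import re


def describe_element(raw: bytes, local_name: bytes) -> dict:
    """Return the XML declaration and the hex bytes of the first matching element."""
    # The XML declaration, if present, sits at the very start of the file
    # (optionally preceded by a UTF-8 byte order mark).
    decl = re.match(rb'(?:\xef\xbb\xbf)?(<\?xml[^>]*\?>)', raw)

    # Match <prefix:name ...>content</prefix:name>, with or without a prefix.
    name = re.escape(local_name)
    pattern = (rb'<(?:[\w.-]+:)?' + name + rb'\b[^>]*>'
               rb'(.*?)'
               rb'</(?:[\w.-]+:)?' + name + rb'>')
    elem = re.search(pattern, raw, re.DOTALL)

    return {
        "declaration": decl.group(1).decode("ascii", "replace") if decl else None,
        # bytes.hex(" ") needs Python 3.8 or later.
        "hex": elem.group(1).hex(" ") if elem else None,
    }
```

You would call it with something like describe_element(open("input.xml", "rb").read(), b"ContactNumberValue"), where input.xml stands in for your actual file. Opening the file in binary mode is the important part, since that is what keeps the octets untouched.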

However, I think that the two symbols Cb, are the default Windows glyphs for
the two bytes (xC2 x80) which form the UTF-8 encoding of the Unicode character
with codepoint x80. So if your file contains the character x80, and is encoded in
UTF-8, and is then displayed using software that doesn't know it's in UTF-8,
then this is what you will see on screen. Hopefully this will help you trace
back to the point where the character was first misencoded (which is quite
likely the point at which the data was first written out as XML).
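
The effect described above can be reproduced in a few lines of Python. This is only a sketch: the first two functions follow directly from the encodings involved, while the third illustrates one plausible origin of the x80 character (a euro sign saved as Windows-1252 and then read as ISO-8859-1), which is an assumption on my part and not something the original file has confirmed. The function names are hypothetical.

```python
def utf8_of_x80() -> bytes:
    # The character with codepoint x80 takes two bytes in UTF-8.
    return "\u0080".encode("utf-8")


def shown_as_windows_1252(data: bytes) -> str:
    # Software that does not know the data is UTF-8, and assumes
    # Windows-1252, renders each byte as a separate glyph.
    return data.decode("cp1252")


def euro_misread_as_latin1() -> str:
    # ASSUMPTION: a euro sign written as Windows-1252 is the single byte x80;
    # reading that byte as ISO-8859-1 yields the character x80.
    return "\u20ac".encode("cp1252").decode("latin-1")
```

Here utf8_of_x80() gives the bytes C2 80, and passing them to shown_as_windows_1252() gives a capital A with circumflex followed by a euro sign, that is, two symbols where the file holds one character.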

Michael Kay
http://www.saxonica.com/
