Re: [xsl] msxsl encoding bug?

Subject: Re: [xsl] msxsl encoding bug?
From: Andrei Boyanov <andrei.boyanov@xxxxxxxxxxxxxxxxxx>
Date: Thu, 31 Mar 2005 14:04:05 +0300
Bryan Rasmussen wrote:

Has anyone encountered this before?
When I run the following XSLT over an XML file using the command-line msxsl.exe


<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">


<xsl:output method="xml" encoding="UTF-8" omit-xml-declaration="no"/>

<xsl:template match="/">
<xsl:copy-of select="/"/>
</xsl:template>

</xsl:stylesheet>

and the XML file contains only English characters, then the output has an XML
declaration of UTF-8 but the actual content of the document is ANSI.
If the XML file contains characters outside the English character set, then the
output has an XML declaration of UTF-8 and the actual content of the document is UTF-8.

Has anyone ever observed this bug before?


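This is a 'bug' of the UTF-8 encoding itself, not of msxsl. In UTF-8, English characters (the ASCII range) are each encoded as a single byte, and that byte is the same as in the ASCII encoding. This is why you cannot tell the difference between a UTF-8 encoded file containing only English letters and an ASCII file: the two are byte-for-byte identical, so a tool that guesses the encoding from the content will report the simpler one.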
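
The point above can be checked with a small Python sketch (Python is used here only for illustration and is not part of the original thread; the sample strings and the helper name same_bytes_as_ascii are made up for the example). It compares the UTF-8 bytes of a string with its ASCII bytes, when an ASCII encoding exists at all:

```python
# Illustrative sketch: ASCII-only text yields identical bytes in ASCII and UTF-8,
# so the two encodings cannot be told apart by looking at the file content.

def same_bytes_as_ascii(text):
    """Return True if the UTF-8 bytes of text are identical to its ASCII bytes."""
    utf8_bytes = text.encode("utf-8")
    try:
        return utf8_bytes == text.encode("ascii")
    except UnicodeEncodeError:
        # The text contains a character outside ASCII, so no ASCII form exists.
        return False

# Hypothetical sample documents, one English-only and one with accented letters.
english = "<doc>Hello, world</doc>"
accented = "<doc>Héllo, wörld</doc>"

english_is_ascii = same_bytes_as_ascii(english)    # True
accented_is_ascii = same_bytes_as_ascii(accented)  # False
e_acute_bytes = "é".encode("utf-8")                # two bytes: b'\xc3\xa9'

print(english_is_ascii, accented_is_ascii, e_acute_bytes)
```

Only the second document contains multi-byte sequences, which is what lets an editor recognise it as UTF-8 rather than a single-byte encoding.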


Rgds,

--
Andrei Boyanov
CEO of Active Solutions Ltd.
http://activesolutions.bg; http://andrei.activesolutions.bg
