[xsl] XSLT 2.0: Character Output Issue

Subject: [xsl] XSLT 2.0: Character Output Issue
From: "Sam Byland" <shbyland@xxxxxxxxxxx>
Date: Wed, 18 Jul 2007 11:14:56 -0400
Hello all,

We're in the process of upgrading from XSLT 1.x (Saxon6) to XSLT 2.0 (Saxon8). It's a fairly significant effort (perhaps similar in scope to David's "big switch" http://dpcarlisle.blogspot.com/2007/06/big-switch.html), and I'm finally at the point where I'm reviewing differences in the output files and trying to make those differences go away :-).

Many of the instance files we deal with use various math symbols throughout them. I'll use the "not equal" symbol &#x2260; in the example of the problem below.

XML file:

<?xml version="1.0" encoding="UTF-8"?>
<Test>"Hello" &#x2260; "World"</Test>

XSLT1.1 stylesheet:

<xsl:stylesheet version="1.1" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html"/>
<xsl:template match="Test">
<xsl:document href="test.html">
<p><xsl:value-of select="."/></p>
</xsl:document>
</xsl:template>
</xsl:stylesheet>

XSLT2.0 stylesheet:

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output name="OutHTML" method="html"/>
<xsl:template match="Test">
<xsl:result-document href="test.html" format="OutHTML">
<p><xsl:value-of select="."/></p>
</xsl:result-document>
</xsl:template>
</xsl:stylesheet>

When I transform the instance file above with the XSLT1.1 stylesheet using Saxon6, I get the following output:

       <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
       <p>"Hello" &#8800; "World"</p>

When I transform the instance file with the XSLT2.0 stylesheet using Saxon8, I get the following output (I hope the output isn't mangled by the usual email limitations):

       <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
       <p>"Hello" ≠ "World"</p>

In IE, I notice that these two display the same way as a web page. However, our output html files are eventually opened within MS Word, and only the Saxon6 output files show up correctly.

One thing I notice is that when I open the simplified example output files above, both display fine in MS Word; the problem only shows up with our large output files. If someone can advise where the problem might be coming from, and why it only keeps our large output files from displaying correctly in Word, that would be great!
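
For reference, here is a minimal sketch of one thing that could be tried, assuming the difference comes down to how the serializer writes non-ASCII characters. Declaring an ASCII output encoding should force the serializer to escape characters such as U+2260 rather than write them directly. The encoding value is an assumption and is not in our current stylesheets, and I have not confirmed that it changes what Word does with the large files.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Assumption: with an ASCII output encoding, characters outside ASCII
       cannot be written directly, so the serializer has to escape them
       (as a character reference or an entity reference). -->
  <xsl:output name="OutHTML" method="html" encoding="US-ASCII"/>
  <xsl:template match="Test">
    <xsl:result-document href="test.html" format="OutHTML">
      <p><xsl:value-of select="."/></p>
    </xsl:result-document>
  </xsl:template>
</xsl:stylesheet>
```

The only change from the XSLT2.0 stylesheet above is the encoding attribute on xsl:output; the named format and the xsl:result-document call are the same.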


