[xsl] Problem: XSLT, attribute value, Unicode supplementary characters

Subject: [xsl] Problem: XSLT, attribute value, Unicode supplementary characters
From: Kenneth Reid Beesley <krbeesley@xxxxxxxxx>
Date: Sat, 17 Apr 2010 23:24:46 -0600

I've got a problem with the XSLT transformation of attribute values consisting
of Unicode supplementary characters.

Background:

1.  OS X  10.6.3
2.  saxonhe9-2-0-6j
3.  The task: transforming an XML document into XeTeX (specifying
<xsl:output method="text" encoding="UTF-8"/>)
4.  The XML document is well-formed and also validates against a RELAX NG
schema.
5.  The XML document is declared as <?xml version="1.0" encoding="UTF-8"?>
6.  The locale of the operating system uses UTF-8 encoding


Typical Data:

XML:      <case correctda="pp0p;">pp0p;</case>

The value of the attribute named correctda is here a short string of three
Deseret Alphabet letters, from the Unicode supplementary area.
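
A minimal sketch of a test input, assuming it would help to rule out
file-encoding problems: the Deseret letters are written as numeric character
references, so the file itself is pure ASCII.  The element nesting
pleft/text/case follows the match pattern of the template in this post; the
code points are arbitrary Deseret letters chosen for illustration, not the
actual data, and this fragment has not been checked against the schema.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical test input. The code points below are arbitrary Deseret
     letters (U+10400 to U+10402 and U+10428 to U+1042A), not the real data. -->
<pleft>
  <text>
    <case correctda="&#x10400;&#x10401;&#x10402;">&#x10428;&#x10429;&#x1042A;</case>
  </text>
</pleft>
```

If the transformation behaves correctly on this input but not on the original
file, that would point toward how the file is being read rather than toward
xsl:value-of itself.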

Matching XSLT template:

<xsl:template match="pleft/text/case">{\da <xsl:value-of
select="@correctda"/>\endnote{\rom Case correction: {\da <xsl:value-of
select="."/>} $\rightarrow$ {\da <xsl:value-of
select="@correctda"/>}}}</xsl:template>

Behavior:

1.  The key problem is the output of the attribute value, via <xsl:value-of
select="@correctda"/>.  Instead of the expected value pp0p;, the output
is a long string of unrelated Deseret Alphabet characters.  It's as if the
xsl:value-of instruction is being confused by the Unicode supplementary
characters.

2.  This XSLT script was working a couple of months ago.  Since then, I have
upgraded to OS X 10.6 (Snow Leopard) and, in trying to fix the current
problem, to saxonhe9-2-0-6j as well.  The problem persists.
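
One way to narrow down the behavior described above is to print what the
processor actually sees in the attribute.  The following is a diagnostic
sketch only, assuming an XSLT 2.0 processor (Saxon-HE 9.2 is one) so that
string-to-codepoints() and the separator attribute are available.  It selects
every case element that has a correctda attribute, wherever it occurs.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text" encoding="UTF-8"/>

  <!-- For each case element with a correctda attribute, print the
       attribute's length in characters and its code points in decimal. -->
  <xsl:template match="/">
    <xsl:for-each select="//case[@correctda]">
      <xsl:text>length=</xsl:text>
      <xsl:value-of select="string-length(@correctda)"/>
      <xsl:text> codepoints=</xsl:text>
      <xsl:value-of select="string-to-codepoints(@correctda)" separator=" "/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

With hypothetical file names, this could be run as
java -jar saxon9he.jar -s:input.xml -xsl:codepoints.xsl.  For an attribute of
three Deseret letters, a correctly parsed value should report length=3 and
code points between 66560 and 66639 (U+10400 to U+1044F).  A larger length or
values outside that range would suggest the damage happens when the document
is read, before xsl:value-of ever runs.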

Question:

Does anyone know what's happening and how I can fix it?  Has something changed
in the handling of Unicode supplementary characters?

Thanks,

Ken


******************************
Kenneth R. Beesley, D.Phil.
P.O. Box 540475
North Salt Lake, UT
84054  USA
