[xsl] replace(), translate() and Unicode supplementary characters

Subject: [xsl] replace(), translate() and Unicode supplementary characters
From: Kenneth Reid Beesley <krbeesley@xxxxxxxxx>
Date: Thu, 2 Jun 2011 22:00:29 -0600
I've got an XSLT 2.0 stylesheet that defines the following my:print_ipa()
function, which calls replace() and translate() on Unicode supplementary
characters:

<?xml version="1.0" encoding="UTF-8"?>

<xsl:stylesheet version="2.0"
	xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
	xmlns:my="myfunctions"
	exclude-result-prefixes="my">

	<xsl:output method="text"/>

	<xsl:function name="my:print_ipa">
		<xsl:param name="txt"/>
		<xsl:text>{\ipa </xsl:text>
		<xsl:value-of select="translate(
			replace(replace(replace(replace($txt,'𐐴','aʲ'),'𐐵','aʷ'),'𐑎','ɔʲ'),'𐑏','ʲu'),
			'𐐨𐐩𐐪𐐫',
			'ieɑɔ')"/>
		<xsl:text>}</xsl:text>
	</xsl:function>

...
</xsl:stylesheet>

I can't seem to get the translate() and replace() calls to work (I'm using
Saxon-HE 9.3, saxonhe9-3).

The intended purpose of the function is to translate UTF-8 strings in the
Deseret Alphabet
to equivalent strings in the International Phonetic Alphabet (IPA).

I realize that the characters may not display correctly in your mail
application.

The first argument to translate() is

replace(replace(replace(replace($txt,'𐐴','aʲ'),'𐐵','aʷ'),'𐑎','ɔʲ'),'𐑏','ʲu')

where the second argument of each replace() is a string containing one Unicode
supplementary character, and the replacement (the third argument) is a string
of Unicode IPA characters, all of which are in the BMP.  The supplementary
characters in the second arguments are

U+00010434
U+00010435
U+0001044E
U+0001044F
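
To check how the processor itself sees one of these characters, a minimal
test along the following lines might help.  It is only a sketch: the template
name "main" is my own (hypothetical) choice, and the character is written as a
numeric character reference so that the result does not depend on how the
stylesheet file was saved or mailed.  Since XPath 2.0 treats a string as a
sequence of Unicode code points, I would expect a length of 1 and the single
code point 66612 (hex 10434).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
	xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

	<xsl:output method="text"/>

	<!-- Hypothetical entry point; no source document is required -->
	<xsl:template name="main">
		<!-- U+10434 as a character reference, independent of file encoding -->
		<xsl:variable name="ay" select="'&#x10434;'"/>
		<!-- Length in characters (code points), not in UTF-16 code units -->
		<xsl:value-of select="string-length($ay)"/>
		<xsl:text> </xsl:text>
		<!-- Decimal code point value(s) of the string -->
		<xsl:value-of select="string-to-codepoints($ay)"/>
		<xsl:text>&#10;</xsl:text>
	</xsl:template>

</xsl:stylesheet>
```

With Saxon this can be started at the named template with the -it:main
command-line option.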

Similarly in the translate() call, the second argument consists of the string

'𐐨𐐩𐐪𐐫'

i.e.  U+00010428  U+00010429  U+0001042A  U+0001042B

and the third argument is

'ieɑɔ'

i.e.  U+0069  U+0065  U+0251  U+0254

The intent is for U+00010428 to get replaced by U+0069, etc.
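
The same mapping can be tried in isolation.  The following is a sketch under
my own assumptions, not the stylesheet above: the template name "main" and the
variable names are hypothetical, and every character is given as a numeric
character reference.  If translate() maps code point to code point, the output
should be the decimal values of U+0069, U+0065, U+0251 and U+0254.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
	xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

	<xsl:output method="text"/>

	<!-- Hypothetical entry point; no source document is required -->
	<xsl:template name="main">
		<!-- Deseret input: U+10428 U+10429 U+1042A U+1042B -->
		<xsl:variable name="deseret"
			select="'&#x10428;&#x10429;&#x1042A;&#x1042B;'"/>
		<!-- IPA targets, in the same order: U+0069 U+0065 U+0251 U+0254 -->
		<xsl:variable name="ipa"
			select="'&#x69;&#x65;&#x251;&#x254;'"/>
		<!-- Map each Deseret character to the IPA character at the same position -->
		<xsl:variable name="result"
			select="translate($deseret, $deseret, $ipa)"/>
		<!-- Print code points so the check survives any display problems -->
		<xsl:value-of select="string-to-codepoints($result)"/>
		<xsl:text>&#10;</xsl:text>
	</xsl:template>

</xsl:stylesheet>
```

Printing code points rather than the characters themselves keeps the check
independent of the terminal's or mail application's font support.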

Questions:  Are translate() and replace() supposed to work with Unicode
supplementary characters?
If so, what am I doing wrong?

Thanks,

Ken


******************************
Kenneth R. Beesley, D.Phil.
P.O. Box 540475
North Salt Lake, UT
84054  USA
