Re: [xsl] replacing diacritical marks with combining unicode characters

Subject: Re: [xsl] replacing diacritical marks with combining unicode characters
From: Terry Ofner <tofner@xxxxxxxxxxx>
Date: Tue, 4 Mar 2008 16:42:14 -0500
I have decided to try a character map. When I use the stylesheet below, however, it converts <br /> to <br>:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">



<xsl:output use-character-maps="cm1"/>


<xsl:character-map name="cm1">
<xsl:output-character character="&#728;" string="&amp;#774;"/><!--breve-->
<xsl:output-character character="&#175;" string="&amp;#772;"/> <!-- macron -->
</xsl:character-map>


<xsl:template match="br"/>

<xsl:template match="@*|node()">
    <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
</xsl:template>

</xsl:stylesheet>

I really don't need the <br /> elements. That is the reason for the template rule removing them. But is the switch from <br/> to <br> caused by the output line?

Terry Ofner
1541 Northbrook Drive
Indianapolis, IN 46260
Voice: 317-870-1992
Fax: 317-870-7101

tofner@xxxxxxxxxxx




On Mar 4, 2008, at 2:09 PM, Michael Kay wrote:



The function fn:normalize-unicode() will do what you want, with a second argument of "NFC".

I'm not sure it will, because the input is using non-combining diacritical
marks. I think the answer is translate():


translate($in, '&#728;...', '&#774;...')
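
As a rough sketch of how that translate() call might sit inside an identity transform: the stylesheet below is only an illustration, not a tested solution. It lists just the two pairs mentioned in this thread (breve and macron), and it assumes that each spacing mark in the input already follows the letter it belongs to, so that the combining character lands in the right place. The priority attribute is there only to avoid a conflict with the node() pattern of the identity template.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

  <!-- Identity template: copy everything through unchanged by default -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Text nodes: replace each spacing mark with its combining counterpart.
       &#728; (breve)  becomes &#774; (combining breve)
       &#175; (macron) becomes &#772; (combining macron)
       Assumption: only these two pairs are needed; extend both strings
       in step to cover more marks. -->
  <xsl:template match="text()" priority="1">
    <xsl:value-of select="translate(., '&#728;&#175;', '&#774;&#772;')"/>
  </xsl:template>

</xsl:stylesheet>
```

translate() maps characters by position, so the nth character of the second argument is replaced by the nth character of the third. Attribute values are not touched by this sketch; they would need a similar template if they can contain the marks.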

Michael Kay
http://www.saxonica.com/
