Re: [xsl] Does XSLT contain an easy means of determining if a string contains a diacritic?

Subject: Re: [xsl] Does XSLT contain an easy means of determining if a string contains a diacritic?
From: "G. Ken Holman" <gkholman@xxxxxxxxxxxxxxxxxxxx>
Date: Sun, 15 Nov 2009 23:42:31 -0600
At 2009-11-15 21:34 -0800, Mark Wilson wrote:
I need to render Czech language strings
containing diacritics into strings with the
diacritics removed. The Czech alphabet has 16
lower case diacritics and a somewhat smaller set
of upper case diacritics. The strings are expressed in UTF-8.

The encoding is irrelevant to XSLT. It matters only to the XML processor inside your XSLT processor, which needs it in order to know what the Unicode characters are; XSLT itself just sees them as Unicode characters without an encoding.
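
To illustrate the point, here is a minimal sketch (not part of the original reply) assuming an XSLT 2.0 processor and a stylesheet file saved as UTF-8 with the literal typed in precomposed form. It can be run against any XML input. A literal character and its numeric character reference reach XPath as the same Unicode character, whatever bytes were on disk:

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- The literal and the character reference are the same string to XPath: true -->
    <xsl:value-of select="'é' = '&#xE9;'"/>
    <xsl:text>&#10;</xsl:text>
    <!-- XPath sees a code point, not bytes: 233 (U+00E9) -->
    <xsl:value-of select="string-to-codepoints('é')"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```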

I do not need to retain case, but I must locate and replace all diacritics.

My only plan so far is to construct a gigantic
<xsl:choose> to find strings containing at least
one diacritic. Then I would need a gigantic
<xsl:if> to change each diacritic into its unaccented counterpart.

I wonder if there is a simpler method for
turning, for example, a word like "Šafařík" [Š,
ř, í] into Safarik? Any ideas or suggestions,

If you are using XSLT 2.0, have you tried:


normalize-unicode( $yourString, "NFC" )

... which will return fully composed characters in place of base
characters followed by combining diacritics?

For example, this converts U+0065 U+0301 into U+00E9.
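
Note that NFC composes characters rather than removing accents. For the stripping and detection the question asks about, a common approach goes the other way: decompose with "NFD", then test for or delete the combining marks with the regular expression category \p{M}. The sketch below is not from the original reply. It assumes an XSLT 2.0 processor that supports the NFD form (the specification only requires NFC; other forms are optional), and the my: namespace and both function names are made up for illustration. It can be run against any XML input.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="urn:example:diacritics"
    exclude-result-prefixes="xs my">

  <xsl:output method="text" encoding="UTF-8"/>

  <!-- Hypothetical helper: decompose to base + combining marks, then drop the marks. -->
  <xsl:function name="my:strip-diacritics" as="xs:string">
    <xsl:param name="s" as="xs:string"/>
    <xsl:sequence select="replace(normalize-unicode($s, 'NFD'), '\p{M}', '')"/>
  </xsl:function>

  <!-- Hypothetical helper: true if the decomposed string contains any combining mark. -->
  <xsl:function name="my:has-diacritic" as="xs:boolean">
    <xsl:param name="s" as="xs:string"/>
    <xsl:sequence select="matches(normalize-unicode($s, 'NFD'), '\p{M}')"/>
  </xsl:function>

  <xsl:template match="/">
    <!-- NFC composes U+0065 U+0301 into U+00E9: prints 233 -->
    <xsl:value-of select="string-to-codepoints(normalize-unicode('e&#x301;', 'NFC'))"/>
    <xsl:text>&#10;</xsl:text>
    <!-- Šafařík written with character references: prints true -->
    <xsl:value-of select="my:has-diacritic('&#x160;afa&#x159;&#xED;k')"/>
    <xsl:text>&#10;</xsl:text>
    <!-- prints Safarik -->
    <xsl:value-of select="my:strip-diacritics('&#x160;afa&#x159;&#xED;k')"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

This works for the Czech accented letters because each has a canonical decomposition into a base letter plus a combining mark. Letters without such a decomposition (for example a stroked letter like U+0142) would pass through unchanged and would still need translate() or a lookup.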

I hope this helps.

. . . . . . . . Ken


