RE: [xsl] Does XSLT contain an easy means of determining if a string contains a diacritic?

Subject: RE: [xsl] Does XSLT contain an easy means of determining if a string contains a diacritic?
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Mon, 16 Nov 2009 09:23:26 -0000
Yes, there is a better way. You can use normalize-unicode() to turn the
string into decomposed normal form, in which all the diacritics become
separate characters, and then you can use replace() to get rid of the
diacritics:

replace(normalize-unicode($in, 'NFD'), '\p{IsCombiningDiacriticalMarks}', '')
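
A minimal sketch of how this expression might be packaged in a stylesheet, assuming an XSLT 2.0 processor (normalize-unicode(), replace() and matches() are XPath 2.0 functions). The my: namespace and the function names my:strip-diacritics and my:has-diacritic are hypothetical, chosen only for illustration, and the identity template is just one possible way to apply the function to every text node:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="urn:example:diacritics"
    exclude-result-prefixes="xs my">

  <!-- Hypothetical helper: decompose to NFD so each diacritic becomes a
       separate combining character, then delete those characters. -->
  <xsl:function name="my:strip-diacritics" as="xs:string">
    <xsl:param name="in" as="xs:string?"/>
    <xsl:sequence select="replace(normalize-unicode($in, 'NFD'),
                                  '\p{IsCombiningDiacriticalMarks}', '')"/>
  </xsl:function>

  <!-- Hypothetical helper: true if the decomposed string contains at
       least one character from the Combining Diacritical Marks block. -->
  <xsl:function name="my:has-diacritic" as="xs:boolean">
    <xsl:param name="in" as="xs:string?"/>
    <xsl:sequence select="matches(normalize-unicode($in, 'NFD'),
                                  '\p{IsCombiningDiacriticalMarks}')"/>
  </xsl:function>

  <!-- Identity template: copy everything through unchanged... -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- ...except text nodes, which have their diacritics removed. -->
  <xsl:template match="text()" priority="1">
    <xsl:value-of select="my:strip-diacritics(.)"/>
  </xsl:template>

</xsl:stylesheet>
```

Note that the IsCombiningDiacriticalMarks block covers only U+0300 to U+036F. That range includes the caron, acute and ring used in Czech, but combining marks outside it are left in place; matching on the category \p{Mn} instead would be a broader alternative.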

Regards,

Michael Kay
http://www.saxonica.com/
http://twitter.com/michaelhkay

> -----Original Message-----
> From: Mark Wilson [mailto:mark@xxxxxxxxxxxx]
> Sent: 16 November 2009 05:35
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: [xsl] Does XSLT contain an easy means of determining
> if a string contains a diacritic?
>
> Hi,
> I need to render Czech language strings containing diacritics
> into strings with the diacritics removed. The Czech alphabet
> has 16 lower case diacritics and a somewhat smaller set of
> upper case diacritics. The strings are expressed in UTF-8. I
> do not need to retain case, but I must locate and replace all
> diacritics.
>
> My only plan so far is to construct a gigantic <xsl:choose> to
> find strings containing at least one diacritic. Then I would
> need a gigantic <xsl:if> to change each diacritic into its
> unaccented counterpart.
>
> I wonder if there is a simpler method for turning, for
> example, a word like "Šafařík" [Š, ř, í] into Safarik? Any
> ideas or suggestions? Thanks, Mark
