Subject: Re: [xsl] [XSLT 2 or 3 - Diacritics] Removal
From: "Graydon graydon@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Mon, 8 Aug 2022 13:43:13 -0000

On Mon, Aug 08, 2022 at 01:14:53PM -0000, Christophe Marchand cmarchand@xxxxxxxxxx scripsit:
> I want to translate all accented chars into their non-accented version :
>
> é -> e
> Ë -> E
> ç -> c
>
> I can not remember the diacritic unicode block name to match them and use it
> in replace(2).
>
> Does someone remember it ?
I think you mean the Unicode decomposition trick to get rid of all the
accents:
<xsl:sequence
select="normalize-unicode($accents, 'NFD') => replace('\p{Mn}', '') => normalize-unicode('NFC')"
/>
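(Mn is the Unicode general category "Mark, nonspacing"; M is all marks.
"Combining Diacritical Marks" is the name of the block U+0300-U+036F,
which a regex can match as \p{IsCombiningDiacriticalMarks}.)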
With the caveat that some things that look like accented characters are
atomic letters so far as Unicode is concerned: they have no canonical
decomposition, so this doesn't touch them. (E.g., o-slash, U+00D8 and
U+00F8, keeps its stroke. A-ring, U+00C5 and U+00E5, does decompose and
loses the ring.)
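
If the expression is needed in more than one place, it can be wrapped in
a stylesheet function. What follows is only a sketch: the function name
local:strip-marks and the urn:example:local namespace are made up for
illustration, and the arrow operator assumes a processor that accepts
XPath 3.1 inside XSLT 3.0 (recent Saxon releases do).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:local="urn:example:local"
    exclude-result-prefixes="xs local">

  <xsl:output method="text"/>

  <!-- Hypothetical helper, not a standard function:
       decompose (NFD), drop nonspacing marks, recompose (NFC). -->
  <xsl:function name="local:strip-marks" as="xs:string">
    <xsl:param name="text" as="xs:string?"/>
    <xsl:sequence select="
      normalize-unicode($text, 'NFD')
      => replace('\p{Mn}', '')
      => normalize-unicode('NFC')"/>
  </xsl:function>

  <!-- Demo entry point; the sample string is e-acute, E-acute
       and c-cedilla written as character references. -->
  <xsl:template name="xsl:initial-template">
    <xsl:value-of
      select="local:strip-marks('&#xE9;t&#xE9; &#xC9;cole gar&#xE7;on')"/>
  </xsl:template>

</xsl:stylesheet>
```

Run from the initial template, this should write "ete Ecole garcon". The
parameter is typed xs:string? so that an empty sequence comes back as a
zero-length string rather than raising a type error.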
--
Graydon Saunders | graydonish@xxxxxxxxx
Þæs oferéode, ðisses swa mæg.
-- Deor ("That passed, so may this.")