Re: [xsl] [XSLT 2 or 3 - Diacritics] Removal

Subject: Re: [xsl] [XSLT 2 or 3 - Diacritics] Removal
From: "Imsieke, Gerrit, le-tex gerrit.imsieke@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Mon, 8 Aug 2022 13:29:28 -0000
They are in the Mn category (https://www.fileformat.info/info/unicode/category/Mn/list.htm), so you might also use

'éÉç' => normalize-unicode('NFKD') => replace('\p{Mn}', '')

There are 1950 characters in that category, and the Combining Diacritical Marks block (U+0300 to U+036F) comprises just 112 of them, but I don't think it will do any harm if you discard all Mn characters.
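
As a rough sketch only, the expression above could be wrapped in a reusable stylesheet function. The `f` prefix, its namespace URI `urn:example:local-functions`, and the function name `f:strip-diacritics` are made up for illustration and are not part of any library. The calls are nested instead of chained with `=>`, so the function body should also work on an XSLT 2.0 processor; only the `xsl:initial-template` entry point assumes XSLT 3.0.

```xslt
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:local-functions"
    exclude-result-prefixes="xs f">

  <xsl:output method="text"/>

  <!-- Hypothetical helper: decompose the string (NFKD), then delete
       every nonspacing mark (Unicode general category Mn). -->
  <xsl:function name="f:strip-diacritics" as="xs:string">
    <xsl:param name="input" as="xs:string?"/>
    <xsl:sequence
        select="replace(normalize-unicode($input, 'NFKD'), '\p{Mn}', '')"/>
  </xsl:function>

  <!-- Demo entry point using the sample string from this thread. -->
  <xsl:template name="xsl:initial-template">
    <xsl:value-of select="f:strip-diacritics('éÉç')"/>
  </xsl:template>

</xsl:stylesheet>
```

Note that NFKD is a compatibility decomposition, so it also rewrites characters such as ligatures (for example, U+FB01 becomes "fi"). If only the accents should change, 'NFD' may be the safer normalization form.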

Gerrit

On 08.08.2022 15:24, Imsieke, Gerrit, le-tex gerrit.imsieke@xxxxxxxxx wrote:
'éÉç' => normalize-unicode('NFKD') => replace('[&#x300;-&#x36f;]', '')

yields 'eEc'
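
Since the subject mentions XSLT 2 as well: the `=>` arrow operator belongs to XPath 3.1, so an XSLT 2.0 processor would need the calls nested instead. A minimal sketch, assuming the stylesheet is simply run against any input document:

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Same pipeline as above, written inside-out: normalize first,
         then remove the code points U+0300 to U+036F. -->
    <xsl:value-of
        select="replace(normalize-unicode('éÉç', 'NFKD'), '[&#x300;-&#x36f;]', '')"/>
  </xsl:template>

</xsl:stylesheet>
```

The character references in the regex are resolved by the XML parser before the XPath expression is evaluated, so the regex engine sees an ordinary character range.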

On 08.08.2022 15:14, Christophe Marchand cmarchand@xxxxxxxxxx wrote:
Hi list!

I know it's in the archive, but I cannot find the thread!

I want to translate all accented characters into their unaccented versions:

é -> e
É -> E
ç -> c

I cannot remember the name of the Unicode block for diacritics, which I need in order to match them in replace().

Does anyone remember it?

Thanks in advance,
Christophe
