Re: [xsl] convert XML to NFC

Subject: Re: [xsl] convert XML to NFC
From: Simon Pepping <sampepping@xxxxxxxxx>
Date: Thu, 28 Apr 2011 14:38:09 +0200
Does this imply that élem (with LATIN SMALL LETTER E WITH ACUTE) and
élem (with LATIN SMALL LETTER E and COMBINING ACUTE ACCENT) are two
different element names? That would be a source of hidden errors,
because no editor will show the difference.
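
A quick way to check this, sketched in Python rather than XSLT: the snippet below builds a small document containing both spellings of the name (written with escapes so the difference is visible), parses it with the standard library's ElementTree, and compares the names the parser reports. The variable names are only for illustration, and this shows what one parser does, not what every tool in a chain will do.

```python
import unicodedata
import xml.etree.ElementTree as ET

# The two spellings of the same visible name, written with escapes
# so the difference survives any editor or mail client.
composed = "\u00e9lem"     # LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301lem"  # LATIN SMALL LETTER E + COMBINING ACUTE ACCENT

# One document that uses both spellings as element names.
doc = f"<root><{composed}/><{decomposed}/></root>"
root = ET.fromstring(doc)
tags = [child.tag for child in root]

# The parser compares names code point by code point, without normalising.
names_differ = tags[0] != tags[1]

# After NFC normalisation the two names compare equal.
names_match_after_nfc = (
    unicodedata.normalize("NFC", tags[0]) == unicodedata.normalize("NFC", tags[1])
)

print(names_differ, names_match_after_nfc)
```

If both values come out true, the two names are distinct to the parser even though they look identical on screen, and they only coincide once something normalises them.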

Simon

On Wed, Apr 27, 2011 at 21:39, Liam R E Quin <liam@xxxxxx> wrote:
> On Wed, 2011-04-27 at 17:14 +0100, David Carlisle wrote:
>> well it's probably quicker to use a tool treating the file as plain text
>> (since the unicode normalisation should not break the xml)
>
> A breakage to watch for - XML names don't have to be normalised, and
> normalization changes them, so e.g. id values, entity names, element
> names may be affected, and might no longer match corresponding names in
> a DTD or schema, and cross-document links may also be affected.
>
> If this is the case I'd argue the original input was broken, but if you
> need to normalise it, you already suspect that...
>
> For that reason, you might want to use XSLT with a slightly-modified
> identity transform as suggested, to leave the markup unaffected.
>
> Liam
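
The idea of normalising the character data while leaving the markup alone can be sketched as follows. This is not the XSLT identity transform suggested above; it is a stand-in using Python's ElementTree, and the function name normalize_character_data is made up for the example. It assumes that only text content should be normalised, so element names, attribute names and attribute values (ids included) are left exactly as they were. Note that ElementTree drops comments, processing instructions and the DOCTYPE by default, so this is an illustration of the principle and not a drop-in tool.

```python
import unicodedata
import xml.etree.ElementTree as ET


def normalize_character_data(root):
    """Apply NFC to text content only; names and attributes stay untouched."""
    for elem in root.iter():
        if elem.text:
            elem.text = unicodedata.normalize("NFC", elem.text)
        if elem.tail:
            elem.tail = unicodedata.normalize("NFC", elem.tail)
    return root


# Decomposed characters appear in the element name, in an attribute value,
# and in the text content.
source = (
    "<doc><e\u0301lem id='cafe\u0301'>cafe\u0301</e\u0301lem>"
    " re\u0301sume\u0301</doc>"
)
root = normalize_character_data(ET.fromstring(source))
child = root[0]

# Only the text and tail are now in NFC; the name and the id are unchanged.
print(ascii(child.tag), ascii(child.get("id")), ascii(child.text), ascii(child.tail))
```

Whether attribute values should also be normalised depends on whether they are used as identifiers or links, which is the breakage described above.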
