Subject: Re: [xsl] Diacritics in original document
From: "Imsieke, Gerrit, le-tex gerrit.imsieke@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Sun, 30 Aug 2015 07:10:16 -0000
Mark,
the ü is decomposed into a 'u' and into U+308 (combining diaeresis). You
can normalize it to ü using normalize-unicode() [1], as in
normalize-unicode('ü', 'NFKC')
You can check the result in oXygen's XPath input field:
string-to-codepoints(normalize-unicode('ü', 'NFKC')) → 252, which is
U+FC, or 'ü'.
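If you want to cross-check the normalization outside of XSLT, the following is a minimal sketch in Python, assuming only the standard-library unicodedata module. The variable names are purely illustrative and not part of any stylesheet; the sketch mirrors the XPath call above rather than replacing it.

```python
import unicodedata

# Decomposed form as described above: 'u' followed by U+0308
# (combining diaeresis). Written with an escape so the source stays ASCII.
decomposed = "u\u0308"

# Same normalization form as in the XPath call; NFC would compose this
# particular sequence to the same single character.
composed = unicodedata.normalize("NFKC", decomposed)

# Codepoints before and after, analogous to string-to-codepoints().
print([ord(c) for c in decomposed])  # [117, 776]
print([ord(c) for c in composed])    # [252]
```

For this input the two codepoints collapse into the single precomposed character U+FC, which matches what the XPath check reports.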
Gerrit
[1] http://www.w3.org/TR/xpath-functions/#func-normalize-unicode
On 30.08.2015 08:52, Mark Wilson pubs@xxxxxxxxxxxx wrote:
>
> I am working with an original xml file that says <?xml version="1.0"
> encoding="UTF-8"?>.
> However, elements whose values contain diacritics appear to be something
> else (see für in the two examples below):
> XML rendition in Oxygen:
> Mittheilungen der tauschvereinigung fuÌˆr postwerthzeichen zu Elberfeld.
>
> Which, here in my email, using Western (Windows-1252) are rendered
> correctly as:
> Mittheilungen der tauschvereinigung für postwerthzeichen zu Elberfeld.
>
> The text output from my transformations has the same problem.
>
> Do I need to change the encoding in my stylesheets? If so, how? Or is
> there a solution?
> Thanks,
> Mark
>
--
Gerrit Imsieke
Geschäftsführer / Managing Director
le-tex publishing services GmbH
Weissenfelser Str. 84, 04229 Leipzig, Germany
Phone +49 341 355356 110, Fax +49 341 355356 510
gerrit.imsieke@xxxxxxxxx, http://www.le-tex.de
Registergericht / Commercial Register: Amtsgericht Leipzig
Registernummer / Registration Number: HRB 24930
Geschäftsführer: Gerrit Imsieke, Svea Jelonek,
Thomas Schmidt, Dr. Reinhard Vöckler