Re: [xsl] Character encoding/representation from ISO-8859-1 to UTF-8

Subject: Re: [xsl] Character encoding/representation from ISO-8859-1 to UTF-8
From: "Soren Kuula s_kuula@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Fri, 14 Oct 2016 11:22:16 -0000
> On 11 Oct 2016, at 21:00, Bridger Dyson-Smith bdysonsmith@xxxxxxxxx <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
> <?xml version="1.0" encoding="iso-8859-1"?>
> <documents>
> 	<document>The reality of the effect of natural ventilation in a residential
> attic cavity has been the topic of many debates and scholarly reports since
> the 1930â€™s.</document>
> </documents>

It looks very much like
1) the XML header claims the document is ISO-8859-1 encoded, while
2) in reality it is not. I can see that one character, the ’, was decoded as
three (â€™), which is what happens when a three-byte UTF-8 sequence is read one
byte at a time. Had the document really been encoded in ISO-8859-1, any
decoding would have ended up with at most one character there, because
ISO-8859-1 is a single-byte encoding and has no multibyte characters.
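
The byte arithmetic behind this diagnosis can be checked with a few lines of
Python. This is only a sketch: it assumes the original character was a right
single quotation mark (U+2019), and the variable names are made up for
illustration.

```python
# Assumption: the character in the source text was U+2019 (right single quote).
apostrophe = "\u2019"

# UTF-8 stores U+2019 as three bytes: E2 80 99.
utf8_bytes = apostrophe.encode("utf-8")

# A single-byte decoder turns every byte into its own character,
# so the one apostrophe comes out as three characters.
as_latin1 = utf8_bytes.decode("iso-8859-1")
as_cp1252 = utf8_bytes.decode("cp1252")

print(utf8_bytes)                               # the raw bytes
print(len(apostrophe), len(as_latin1), len(as_cp1252))

# Under Windows-1252 the three characters are a-circumflex, euro sign, trade mark.
print(as_cp1252 == "\u00e2\u20ac\u2122")
```

Going the other way, a real ISO-8859-1 file cannot contain U+2019 at all, so a
three-character sequence in its place points to UTF-8 bytes being read with
the wrong label.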

Try replacing "iso-8859-1" in the XML header with "utf-8". Does that
work?
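
A quick way to see what that change does, sketched in Python with the standard
library parser. The shortened document text and the names payload, mislabeled
and relabeled are invented for this example, and it assumes the file's bytes
really are UTF-8.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened stand-in for the document; bytes are UTF-8.
body = "<documents><document>since the 1930\u2019s.</document></documents>"
payload = body.encode("utf-8")

# Same bytes, two different encoding declarations.
mislabeled = b'<?xml version="1.0" encoding="iso-8859-1"?>\n' + payload
relabeled = b'<?xml version="1.0" encoding="utf-8"?>\n' + payload

# The parser trusts the declaration when decoding the bytes.
wrong = ET.fromstring(mislabeled).find("document").text
right = ET.fromstring(relabeled).find("document").text

# With the wrong label the apostrophe becomes three characters,
# so the text is two characters longer than it should be.
print(len(wrong) - len(right))
print(right == "since the 1930\u2019s.")
```

If the text comes out right only with the utf-8 declaration, the header was
the problem and the stylesheet itself does not need to change.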

Regards, Soren
