Subject: Re: [xsl] Character encoding/representation from ISO-8859-1 to UTF-8
From: "Soren Kuula s_kuula@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Fri, 14 Oct 2016 11:22:16 -0000

> On 11 Oct 2016, at 21:00, Bridger Dyson-Smith bdysonsmith@xxxxxxxxx <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
> <?xml version="1.0" encoding="iso-8859-1"?>
> <documents>
> <document>The reality of the effect of natural ventilation in a residential attic cavity has been the topic of many debates and scholarly reports since the 1930C"b,b"s.</document>
> </documents>

It looks very much like 1) the XML header claims the document is ISO-8859-1 encoded, while 2) it really is not.

I can see that one character, the ’ in "1930’s", was decoded as three (C"b,b"). Had the document really been encoded in ISO-8859-1, decoding would have produced at most one character for it, because ISO-8859-1 is a single-byte encoding and has no multibyte characters.

Try replacing "iso-8859-1" in the XML header with "utf-8". Does that work?

Regards,
Soren
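
To illustrate the byte-count argument above, here is a minimal Python sketch. It assumes the original apostrophe was U+2019 (RIGHT SINGLE QUOTATION MARK), which the thread does not confirm; the variable names are made up for the illustration.

```python
# Assumption: the apostrophe in "1930's" was U+2019, stored as UTF-8.
apostrophe = "\u2019"

# In UTF-8 this single character occupies three bytes.
utf8_bytes = apostrophe.encode("utf-8")

# Decoding those bytes as ISO-8859-1 maps every byte to its own character,
# so one character turns into three.
misread = utf8_bytes.decode("iso-8859-1")

# Many tools treat "iso-8859-1" as Windows-1252, which gives three
# printable characters instead of control characters.
misread_cp1252 = utf8_bytes.decode("cp1252")

# The damage is reversible as long as no byte was dropped:
# re-encode with the wrong charset, then decode with the right one.
repaired = misread.encode("iso-8859-1").decode("utf-8")

print(len(utf8_bytes), len(misread), repr(misread_cp1252), repaired == apostrophe)
```

If the file really does hold UTF-8 bytes, declaring encoding="utf-8" in the XML header lets the parser do the correct decoding itself, so no repair step like the one above is needed.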