Re: [xsl] Global parameters with UTF-8 characters and ???s <Disregard Previous>

Subject: Re: [xsl] Global parameters with UTF-8 characters and ???s <Disregard Previous>
From: "andrew welch" <andrew.j.welch@xxxxxxxxx>
Date: Thu, 3 Aug 2006 10:24:16 +0100
On 8/2/06, Michael Kay <mike@xxxxxxxxxxxx> wrote:
> Is this the right solution?  Or does it just point out what
> the issue is?

It's a viable workaround. But it suggests that there is some kind of
configuration problem somewhere, perhaps with the web server.

It effectively takes encoding out of the equation: the ASCII
characters &#nnn; are written to disk instead of a single Unicode
character, and the browser reads that ASCII reference instead of the
encoded character itself.
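As a rough illustration of what that workaround amounts to, here is a
minimal Java sketch that replaces every non-ASCII character with a
decimal character reference. The class and method names (NcrEscape,
escape) are made up for this example; in XSLT you would normally get
the same effect from the serializer rather than hand-rolled code.

```java
class NcrEscape {
    // Hypothetical helper: keep ASCII as-is, write everything else as &#nnn;
    static String escape(String s) {
        StringBuilder out = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (cp < 128) {
                out.appendCodePoint(cp);
            } else {
                out.append("&#").append(cp).append(';');
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        // \u00e9 is e-acute; the output contains only ASCII bytes
        System.out.println(escape("caf\u00e9"));
    }
}
```

Because the result is pure ASCII, it survives being written in one
encoding and read in another.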

If you can see the correct characters in the browser now then it
suggests they are contained in the font that's being used, and the
problem lies with the file being written in one encoding and read in
another.  When the target encoding has no mapping for a given
character, a question mark ? is written in its place to mean "no
mapping".
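A small Java sketch of both effects, assuming the character in
question is e-acute (U+00E9); the class and method names are only
for illustration. Reading UTF-8 bytes as ISO-8859-1 gives two wrong
characters, while encoding to a charset with no mapping (US-ASCII
here) gives a real ? byte.

```java
import java.nio.charset.StandardCharsets;

class EncodingMismatch {
    // Written in one encoding (UTF-8), read in another (ISO-8859-1)
    static String writeUtf8ReadLatin1(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // US-ASCII has no mapping for non-ASCII characters, so getBytes
    // substitutes the charset's replacement byte, which is '?'
    static byte[] encodeAscii(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        String eAcute = "\u00e9";
        // One character became two after the mismatched read
        System.out.println(writeUtf8ReadLatin1(eAcute).length());
        // The single byte written for the unmappable character
        System.out.println(Integer.toHexString(encodeAscii(eAcute)[0]));
    }
}
```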

If you use a hex editor at every stage of the process to find out
where the byte for the character first becomes 0x3F (meaning the ?
really is a ? and it's not just your viewer) then you'll know that
the stage that produced those bytes was the culprit.
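If you don't have a hex editor to hand, something like the following
Java sketch can dump the bytes of an intermediate file. HexCheck and
hex are names invented for this example, and the file path is
whatever you pass on the command line.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

class HexCheck {
    // Format bytes as space-separated two-digit hex values
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.err.println("usage: java HexCheck <file>");
            return;
        }
        // Dump the raw bytes of the file, ignoring any encoding
        System.out.println(hex(Files.readAllBytes(Paths.get(args[0]))));
    }
}
```

An e-acute stored as UTF-8 shows up as C3 A9; if you see 3F where
the character should be, it was already lost before that file was
written.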

If you are using Java then it's often a case of setting the default
platform encoding to UTF-8:

System.setProperty("file.encoding", "UTF-8");

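This is meant to ensure that any operations that involve encodings
(where an optional encoding argument hasn't been specified, e.g.
getBytes()) will use UTF-8.  The JVM normally reads this property
once at startup, so passing -Dfile.encoding=UTF-8 on the java command
line is more reliable than setting it later.  If you don't specify
this then the platform default is used (Cp1252 rather than UTF-8 on
Western Windows installs, afaik).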
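An alternative that doesn't depend on the platform default at all is
to name the charset on every call.  A minimal sketch, with class and
method names chosen just for this example:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ExplicitCharset {
    // Always encode as UTF-8, whatever the platform default is
    static byte[] toUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Always decode as UTF-8, whatever the platform default is
    static String fromUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Shows what getBytes() with no argument would have used
        System.out.println("default charset: " + Charset.defaultCharset());

        String original = "caf\u00e9";
        byte[] bytes = toUtf8(original);
        // Round trip is lossless because both sides agree on UTF-8
        System.out.println(fromUtf8(bytes).equals(original));
    }
}
```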

cheers
andrew
