RE: [xsl] Global parameters with UTF-8 characters and ???s <Disregard Previous>

From: "Waters, Michael, Springer US" <Mike.Waters@xxxxxxxxxxxx>
Date: Wed, 02 Aug 2006 17:18:19 -0400
What kind of question mark are you seeing?

A white question mark inside a black diamond is the Unicode Replacement
Character (U+FFFD) and usually indicates an encoding problem, i.e., the
encoding that the browser detects doesn't match the encoding of the bytes
actually returned for the character.
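
To make that mismatch concrete, here is a minimal Java sketch using only the
standard library. The class and method names are illustrative, not from your
code. It writes text as ISO-8859-1 and then decodes the bytes as UTF-8, which
is roughly what a browser does when it guesses the wrong encoding:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ReplacementCharDemo {
    // Encode text with one charset, then decode the resulting bytes with another.
    static String transcode(String text, Charset written, Charset assumed) {
        return new String(text.getBytes(written), assumed);
    }

    public static void main(String[] args) {
        String original = "caf\u00E9 au lait";

        // The byte 0xE9 is not a valid UTF-8 sequence on its own, so the
        // decoder substitutes U+FFFD for it.
        String mismatched = transcode(original,
                StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);

        // Same charset on both sides: the text survives unchanged.
        String matched = transcode(original,
                StandardCharsets.UTF_8, StandardCharsets.UTF_8);

        System.out.println(mismatched.indexOf('\uFFFD') >= 0); // prints true
        System.out.println(matched.equals(original));          // prints true
    }
}
```

If you can capture the response bytes, decoding them yourself this way should
show whether the U+FFFD is produced by the browser or is already in the data.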

An ordinary looking question mark usually indicates a simple font problem,
i.e., the browser's default font is not capable of rendering the character.

Since you do see Japanese characters, I'm assuming there are no font problems
here.
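
That said, it may be worth ruling out one more source of ordinary question
marks: a literal '?' (U+003F) that was substituted when the text was encoded
into a charset that cannot represent it. No font will fix that, because the
original character is already gone from the bytes. A minimal sketch, assuming
nothing beyond the standard library (the class name is illustrative):

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class QuestionMarkDemo {
    // Encode and decode with the same charset, so any damage comes from
    // the encoding step alone.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        String japanese = "\u65E5\u672C\u8A9E";

        // ISO-8859-1 has no Japanese characters; getBytes() replaces each
        // unmappable character with the byte for '?'.
        String viaLatin1 = roundTrip(japanese, StandardCharsets.ISO_8859_1);

        // UTF-8 can represent every Unicode character.
        String viaUtf8 = roundTrip(japanese, StandardCharsets.UTF_8);

        System.out.println(viaLatin1.equals("???"));   // prints true
        System.out.println(viaUtf8.equals(japanese));  // prints true
    }
}
```

If a hex dump of the response shows 0x3F bytes where the Japanese text should
be, some step between the transformer and the browser is probably encoding
with a non-UTF-8 charset.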

If your browser is correctly detecting the UTF-8 sent by your JSP and you do
see the Unicode Replacement Character, then what your server returns is
possibly misencoded, even though all your code is correctly set up for UTF-8.
If your JSP page is GETting or POSTing form data, especially in an i18n
context, there are special encoding issues involved. Check out:

http://java.sun.com/developer/technicalArticles/Intl/HTTPCharset

and

http://homel.vsb.cz/~dvo25/reily/books/javaenterprise/servlet/ch12_06.htm
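
As a rough illustration of the form-data issue, here is a small sketch using
java.net.URLEncoder and URLDecoder from the standard library. The class and
helper names are illustrative, and it only imitates what a container does
when it decodes request parameters; it is not your actual servlet code:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class FormDecodingDemo {
    // Percent-encode a value the way a browser would for a UTF-8 page.
    static String encodeUtf8(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // Decode a percent-encoded value, assuming the given charset.
    static String decode(String encoded, String charsetName) {
        try {
            return URLDecoder.decode(encoded, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String submitted = encodeUtf8("\u65E5\u672C\u8A9E");
        System.out.println(submitted); // %E6%97%A5%E6%9C%AC%E8%AA%9E

        // Wrong assumption: each of the 9 bytes becomes its own character.
        String wrong = decode(submitted, "ISO-8859-1");
        // Right assumption: the 9 bytes form 3 Japanese characters.
        String right = decode(submitted, "UTF-8");

        System.out.println(wrong.length()); // 9
        System.out.println(right.length()); // 3
    }
}
```

In a servlet or JSP, the usual way to get the second behaviour for POSTed
data is to call request.setCharacterEncoding("UTF-8") before reading any
parameters. How GET query strings are decoded depends on the container's
configuration.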

Mike Waters





>-----Original Message-----
>From: David Nesbitt [mailto:dnesbitt@xxxxxxxxxxxxxxxxx]
>Sent: Wednesday, August 02, 2006 3:24 PM
>To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
>Subject: RE: [xsl] Global parameters with UTF-8 characters and ???s <Disregard Previous>
>
>
>> Does the same problem affect the same character if it originates from
>> a place other than the global parameter? For example, what happens
>> when you do <xsl:value-of select="'&#x....'"/>?
>
>I placed the following in my stylesheet:
>
>&#x65E5;&#x672C;&#x8A9E;<xsl:value-of
>select="'&#x65E5;&#x672C;&#x8A9E;'"/><xsl:value-of
>select="$global.parameter"/>
>
>It returns 9 question marks.  So that seems consistent.
>
>However, here is the weird thing.  If I change my Java code to do the
>escaping as follows:
>
>transformer.setParameter(key,
>StringEscapeUtils.escapeXml(resourceBundle.getString(key)));
>
>and then change my stylesheet to use disable-output-escaping for the
>global parameter as follows:
>
>&#x65E5;&#x672C;&#x8A9E;<xsl:value-of
>select="'&#x65E5;&#x672C;&#x8A9E;'"/><xsl:value-of
>select="$global.parameter" disable-output-escaping="yes"/>
>
>I get 6 question marks and then 3 Japanese characters.  So this really
>confuses me.
>
>Thanks again for your help.
>
>Regards,
>Dave
>
>-----Original Message-----
>From: Michael Kay [mailto:mike@xxxxxxxxxxxx]
>Sent: Wednesday, August 02, 2006 11:39 AM
>To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
>Subject: RE: [xsl] Global parameters with UTF-8 characters and ???s
>
>>
>> I am having problems with global parameters which have UTF-8
>> characters in them.  They show up as question marks when I use their
>> values in the output (e.g. <xsl:value-of
>> select="$global-parameter"/>).
>
>Does the same problem affect the same character if it originates from a
>place other than the global parameter? For example, what happens when
>you do <xsl:value-of select="'&#x....'"/>?
>
>If the problem occurs in this situation, then the two possible
>explanations are
>
>(a) your output device isn't configured to display the character (no
>glyph in the chosen font)
>
>(b) the software used to display the output (e.g. a text editor or a
>browser) doesn't know that the output is encoded in UTF-8.
>
>If the problem doesn't occur in this situation, then the problem is with
>the contents of the parameter, which probably means it's something to do
>with the encoding of the resourceBundle.
>
>Michael Kay
>http://www.saxonica.com/
>
>
>>
>> I am using JAXP (with Xalan 2.6 as the underlying XSLT
>> engine) from a JSP page to generate HTML.
>>
>> My JSP page has the following setting for UTF-8:
>>
>> <%@ page contentType="text/html;charset=UTF-8" language="java" %>
>>
>> My XSLT stylesheet has the following XML declaration:
>>
>> <?xml version="1.0" encoding="UTF-8"?>
>>
>> And the stylesheet also has the following output element:
>>
>> <xsl:output method="html" indent="yes" encoding="UTF-8"/>
>>
>> I am using the following Java code to set the global parameters:
>>
>> transformer.setParameter(key, resourceBundle.getString(key));
>>
>> So I think I am setting everything up properly for UTF-8.  Is there
>> anything I am doing wrong that is causing these characters to be shown
>> as question marks?
>>
>> Thanks in advance for any help.
>>
>> Regards,
>> Dave
