RE: [xsl] Problem with Chinese (Solution)

Subject: RE: [xsl] Problem with Chinese (Solution)
From: "Andrew Kimball" <akimball@xxxxxxxxxxxxx>
Date: Wed, 8 Aug 2001 13:00:14 -0700

We really need a KB article on encodings, because these same questions
come up repeatedly.

Here is one principle that can't be overstated: If you want encodings
preserved, don't use strings.  Strings are always encoded in UTF-16 on
the Win32 platform.  You can't ask MSXML to output to a GB2312 string.
It's a contradiction.  Either MSXML has to output GB2312 bytes, or it
has to output a UTF-16 encoded string.  It can't do both.  Use stream
methods (like load, transformNodeToObject) when you need bytes in a
particular encoding, and string methods (like loadXML, transformNode)
only when a UTF-16 string is what you actually want.
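
To make the distinction concrete, here is a minimal sketch in JavaScript
run under Node.js rather than ASP (an assumption made purely for
illustration; Buffer is Node's byte container and has nothing to do with
MSXML).  It shows that a string carries no encoding of its own: an
encoding only appears at the moment the string is serialized to bytes,
and the same string yields different bytes for different encodings.

```javascript
// A string is a sequence of UTF-16 code units; it has no "GB2312" form.
const text = "\u4F60\u597D"; // the two characters of "ni hao"

// An encoding only comes into play when the string is turned into bytes.
const utf16Bytes = Buffer.from(text, "utf16le");
const utf8Bytes = Buffer.from(text, "utf8");

console.log(text.length);       // number of UTF-16 code units
console.log(utf16Bytes.length); // byte count when serialized as UTF-16LE
console.log(utf8Bytes.length);  // byte count when serialized as UTF-8
console.log(utf16Bytes.toString("hex"), utf8Bytes.toString("hex"));
```

UTF-8 stands in for GB2312 here only because Node's Buffer supports it
directly; the point is the same for any byte encoding.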

In your example code, you were dealing almost totally in strings,
causing major encoding headaches:
1. loadXML() takes a UTF-16 string as an argument
2. responseText returns the response converted to a UTF-16 string
3. transformNode() returns a UTF-16 string
4. Response.Write() takes a UTF-16 string as an argument.  It then
magically converts the string to a byte stream encoded using the current
session codepage.
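
Each of those hops is a conversion between bytes and a string, and any
hop that guesses the wrong code page damages the data.  The following is
a rough sketch of that failure, again in Node.js rather than ASP, with
UTF-8 and Latin-1 standing in for GB2312 and a mismatched session code
page (both substitutions are assumptions for illustration, not what
MSXML or ASP do internally).

```javascript
// Bytes as they might arrive on the wire (UTF-8 here, standing in for GB2312).
const wireBytes = Buffer.from("\u4F60\u597D", "utf8");

// Hop 1: bytes -> string, decoded with the wrong code page (Latin-1).
const misdecoded = wireBytes.toString("latin1");

// Hop 2: string -> bytes again, re-encoded as UTF-8 for output.
const reencoded = Buffer.from(misdecoded, "utf8");

// The bytes that come out are not the bytes that went in.
console.log(wireBytes.equals(reencoded));

// Keeping the data as bytes end to end involves no conversion at all.
const passedThrough = Buffer.from(wireBytes);
console.log(wireBytes.equals(passedThrough));
```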

The solution is to rewrite your code to avoid caching intermediate
output in string form:

    // Get response object
    // responseXML returns a document created by parsing the response
    // stream, so no need to call load
    var xmlResponseDoc = xmlPostObject.responseXML;

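    // Create stylesheet object and load it from the file (a stream
    // method, so the stylesheet never passes through a string).
    // Resolving the path with Server.MapPath assumes the stylesheet
    // sits next to this ASP page.
    var xslFile = "nihao.xsl";
    var xslDoc = new ActiveXObject("MSXML2.DOMDocument");
    xslDoc.async = false;
    xslDoc.load(Server.MapPath(xslFile));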

    // Apply stylesheet to xml
    // Do not allow intermediate result to be cached in a string.
    // Instead, output directly to the ASP response stream in order to
    // preserve the requested encoding.
    xmlResponseDoc.transformNodeToObject(xslDoc, Response);

I also recommend setting Response.Charset = "GB2312" as well as
Response.ContentType = "text/html" so that the browser is told the
content type and encoding of the incoming page up front (this avoids
auto-detection logic).

This solution is faster and cleaner than the string solution.  Use
strings when you want to display output, not when you're just shuttling
it to the next piece of a processing pipeline.

~Andy Kimball
