Re: [xsl] encoding and XSL Transformation

Subject: Re: [xsl] encoding and XSL Transformation
From: Chuck White <chuckwh@xxxxxxxxxxx>
Date: Tue, 10 Sep 2002 12:03:37 -0700
----- Original Message -----
From: "Earl Bingham" <earl@xxxxxxxxxxx>

>I'm using emacs and Internet Explorer 6.0 to view the output. Anyway, it is
>the decimal representation that I want as the output. I am working with XML
>that is being generated using C++ MFC and the Xerces C++ parser.

>What would you suggest I do to convert these characters to their decimal
>reference?

Based on your question, I have to assume your XML is being generated by a
Windows app using code page 1252 (Windows ANSI) as its encoding. So you
need to find a utility for converting the character references in the
128-159 range. A numeric character reference always names a Unicode code
point, and in Unicode those numbers belong to control characters, not to
the printable characters that code page 1252 puts there. This means that
if you are using, for example, &#146; , unless you do your transform using
an output encoding of Windows-1252, the UTF-8 mapping of &#146; will not
be what you bargained for.
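
To make the mismatch concrete, here is a small Python sketch (Python is my
choice for illustration only, and the variable names are made up) that
reads the number 146 first as a Unicode code point, the way an XML parser
does, and then as a Windows-1252 byte:

```python
import unicodedata

ref_value = 146  # the number inside &#146;

# An XML parser reads the reference as Unicode code point U+0092.
as_unicode = chr(ref_value)
category = unicodedata.category(as_unicode)  # "Cc" means control character
utf8_bytes = as_unicode.encode("utf-8")      # what a UTF-8 output would contain

# Windows-1252 uses the byte value 146 for a different character entirely.
as_cp1252 = bytes([ref_value]).decode("cp1252")
intended_ref = "&#%d;" % ord(as_cp1252)      # the reference the author meant

print(category, utf8_bytes, intended_ref)
```

The last value is the decimal reference that a cleaned-up source document
would need in place of &#146; .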

&#146; maps out to the right single quote mark in Win 1252, i acute in
MacRoman, and, for our purposes here, nothing printable in Unicode
(actually, U+0092 is a "private use" control character). The decimal
reference you actually want for that quote mark is &#8217; .

I have to think MS has a conversion utility for fixing the source doc. Try
starting here and drilling down. I bet you'll find something:

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/intl/unicode_0a5v.asp

Failing that, you can look at this conversion table and try your own thing:

http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP1252.TXT
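
If you do end up rolling your own, the job is small. Below is a minimal
sketch in Python, not a tested tool. The function name fix_cp1252_refs is
hypothetical, and it leans on Python's built-in "cp1252" codec, which I
assume agrees with the table above, rather than parsing CP1252.TXT itself.
It only touches decimal references from 128 to 159 and leaves everything
else alone:

```python
import re

# Matches decimal character references such as &#146;
_DECIMAL_REF = re.compile(r"&#([0-9]+);")


def fix_cp1252_refs(text):
    """Rewrite &#128;..&#159; as the decimal references of the Unicode
    characters that Windows-1252 assigns to those byte values."""

    def repl(match):
        value = int(match.group(1))
        if not 128 <= value <= 159:
            # Outside the problem range: already a sensible reference.
            return match.group(0)
        try:
            char = bytes([value]).decode("cp1252")
        except UnicodeDecodeError:
            # Windows-1252 leaves a few byte values (e.g. 129) unassigned,
            # so there is nothing to map them to; keep the original text.
            return match.group(0)
        return "&#%d;" % ord(char)

    return _DECIMAL_REF.sub(repl, text)


if __name__ == "__main__":
    print(fix_cp1252_refs("It&#146;s &#147;quoted&#148; text"))
```

Run it over the source document before the transform, so the stylesheet
only ever sees references that mean the same thing in every output
encoding. Hex references (&#x92;) are not handled here and would need a
second pattern.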


Cheers,

Charles White
The Tumeric Partnership
http://www.tumeric.net
chuck@xxxxxxxxxxx
http://www.javertising.com
________________________________________
Author, Mastering XSLT, Sybex Books
Co-Author, Mastering XML, Premium Edition, Sybex Books


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

