RE: [xsl] encoding shift_jis into an attribute

Subject: RE: [xsl] encoding shift_jis into an attribute
From: "Matthew Simoneau" <Matthew.Simoneau@xxxxxxxxxxxxx>
Date: Fri, 4 Jun 2004 14:04:25 -0400

Wendell, thanks for the information about disable-output-escaping.

Josh, my first reaction was to do as you suggest and decode these URI
scheme strings in my MATLAB or Java code and then recode them in the XML
scheme.  I couldn't get the decoding to work properly in my Java code.
If the XML document contains the Unicode character sequence "25968 23398",
it is represented in the URI scheme as "%E6%95%B0%E5%AD%A6".  To try to
reconstitute these original characters with Java, I tried something like
java.net.URLDecoder.decode("%E6%95%B0%E5%AD%A6").  This returned the
nonsense sequence of Unicode characters "35624 65392 34756 65382", so I
started looking for other options.  Your e-mail made me go back and try
this again.  I realized it was taking every two bytes and making them
one character.  That is, decode was defaulting to Shift-JIS, my
platform's default.  I also saw that the decode method could take an
encoding as an optional second argument.  If I specify
java.net.URLDecoder.decode("%E6%95%B0%E5%AD%A6","UTF-8"), I get back my
original sequence "25968 23398".  Slapping a "&#" and ";" around each
gets it into the form I want.  If I can't find a standard Java class to
do that, I can easily write my own.
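
For the archive, here is a minimal sketch of what such a helper might look
like. It is only an illustration under a few assumptions: the class name
NcrEncoder and the method name toNumericCharRefs are made up for this
example, the input is assumed to be percent-encoded UTF-8, and ASCII
characters are passed through unchanged rather than wrapped. The only
library call it relies on is the two-argument java.net.URLDecoder.decode
mentioned above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical helper, not a standard Java class.
public class NcrEncoder {

    // Decodes a percent-encoded UTF-8 string and rewrites every
    // non-ASCII character as a decimal numeric character reference.
    static String toNumericCharRefs(String percentEncoded) {
        String decoded;
        try {
            // Name the encoding explicitly so the platform default
            // (Shift-JIS in my case) is never used.
            decoded = URLDecoder.decode(percentEncoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException(e);
        }

        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < decoded.length()) {
            // Work on code points so characters outside the BMP
            // produce one reference instead of two surrogates.
            int cp = decoded.codePointAt(i);
            if (cp < 128) {
                out.append((char) cp);   // assumption: leave ASCII as-is
            } else {
                out.append("&#").append(cp).append(';');
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints &#25968;&#23398;
        System.out.println(toNumericCharRefs("%E6%95%B0%E5%AD%A6"));
    }
}
```

Note that this sketch does not escape markup characters such as "&" or "<"
in the decoded text, and URLDecoder also turns "+" into a space, so it
would need adjusting if the input can contain either.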

Thank you everyone for your help.

Sincerely,
Matthew Simoneau
The MathWorks, Inc.
