Subject: RE: [xsl] I18N / UTF-8 versus US-ASCII
From: "Sangal, Amit (STSD)" <amit.sangal@xxxxxx>
Date: Tue, 4 Apr 2006 16:07:17 +0530
I have little knowledge of this area. In my case too, XML (containing Korean characters) needs to travel from one machine (running in the Korean/ko_KR locale) to another machine (running in the English/en_US locale) inside a SOAP envelope. Is there any known disadvantage or limitation of using US-ASCII output encoding?

Regards,
Amit

-----Original Message-----
From: andrew welch [mailto:andrew.j.welch@xxxxxxxxx]
Sent: Tuesday, April 04, 2006 3:16 PM
To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
Subject: Re: [xsl] I18N / UTF-8 versus US-ASCII

On 4/4/06, Sangal, Amit (STSD) <amit.sangal@xxxxxx> wrote:
> Hi,
>
> I'm facing some trouble using Xalan-J 2.6.0 for transforming XML which contains Korean characters.
> When I use UTF-8 encoding, it turns these characters into a garbled mess, like G;EM
> <xsl:output method="xml" encoding="UTF-8" indent="yes"/>
> e.g.
> <Dependencies>
> <Source>europeG;EM</Source>
> <Target>email_node3</Target>
> </Dependencies>
>
> But when the output encoding is changed to US-ASCII, the outcome is all right and I do not see any garbling of Korean characters.
> <xsl:output method="xml" encoding="US-ASCII" indent="yes"/>
> e.g.
> <Dependencies>
> <Source>europe퓨터</Source>
> <Target>email_node3</Target>
> </Dependencies>
> Is it ok to use US-ASCII encoding?

Yes, it makes life much easier when encoding is effectively taken out of the equation. I always use us-ascii output encoding and leave the browser to render the characters. It all gets far too painful when your encoding gets lost as the XML travels through servlets etc. and your glorious multibyte characters become single-byte rubbish.

I know I'm wrong, and we should all use UTF-8 output encoding and ensure everything else is UTF-8 aware, but it's just easier not to.
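
For anyone who wants to see the difference outside a stylesheet, here is a minimal sketch in Java. It assumes a stock JDK and whatever JAXP transformer it bundles, not Xalan-J 2.6.0 specifically, and it uses an identity transform in place of a real stylesheet. The class name EncodingDemo and its helper methods are made up for illustration. It serializes the same element with both output encodings, checks that the US-ASCII bytes contain nothing outside 7-bit ASCII and still parse back to the original text, and then decodes the UTF-8 bytes with the wrong charset to imitate a hop that loses the encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class EncodingDemo {

    // Hypothetical helper: identity transform, serialized with the given output
    // encoding (the programmatic equivalent of <xsl:output encoding="..."/>).
    public static byte[] serialize(String xml, String encoding) throws Exception {
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.METHOD, "xml");
        t.setOutputProperty(OutputKeys.ENCODING, encoding);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toByteArray();
    }

    // True when every byte is in the 7-bit ASCII range.
    public static boolean isPureAscii(byte[] bytes) {
        for (byte b : bytes) {
            if (b < 0) {
                return false;
            }
        }
        return true;
    }

    // Parse serialized bytes and return the text of the root element.
    public static String rootText(byte[] bytes) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(bytes))
                .getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // Same two Hangul characters as in the example above, written as escapes.
        String text = "europe\uD4E8\uD130";
        String xml = "<Source>" + text + "</Source>";

        byte[] ascii = serialize(xml, "US-ASCII");
        byte[] utf8 = serialize(xml, "UTF-8");

        System.out.println(new String(ascii, StandardCharsets.US_ASCII));
        System.out.println("ASCII only: " + isPureAscii(ascii));
        System.out.println("Round trip intact: " + text.equals(rootText(ascii)));

        // Imitate a hop that wrongly assumes a single-byte charset.
        System.out.println(new String(utf8, StandardCharsets.ISO_8859_1));
    }
}
```

With US-ASCII output the serializer has to write each character it cannot represent as a numeric character reference, so the document stays intact through any hop that preserves plain ASCII. The price is size: each Korean character takes three bytes in UTF-8 but several more as a character reference.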