Re: [xsl] Character encoding problem

Subject: Re: [xsl] Character encoding problem
From: Uche Ogbuji <uche.ogbuji@xxxxxxxxxxxxxxx>
Date: Thu, 24 May 2001 16:56:04 -0600
Could you zip up and send these files?  Cut and paste isn't bringing over the 
UTF-16 properly.

Thanks.

--Uche

> Hi, folks--
> 
> I'm developing a simple XSLT transformation for selecting languages
> (English or Japanese) on a bilingual website. It takes a source XHTML
> document with paired headings in English and Japanese, e.g.:
> 
> 	 <p xml:lang="en">
>            [ some stuff in English ]
>          </p>
>          <p xml:lang="ja">
>            [ same content in Japanese ]
>          </p>
> 
> ... and outputs everything in the selected language plus any content
> that has no language specified. At least that's the theory. I've tried
> processing it w/ (full) Saxon and 4XSLT's command line interfaces, but
> keep getting errors:
> 
> Saxon:
> 	$ saxon main.html i18n.xsl currentLanguage=en
> 	Transform failed: =US-ASCII
> 
> 	The above 'saxon' is a simple shell script I wrote just to
> 	save typing. It just invokes 'java com.icl.saxon.Whatever
> 	[<args>]'.
> 
> 4XSLT:
> 	$ 4xslt -DcurrentLanguage=en main.html i18n.xsl
> 	[ long stack trace ]
> 	TypeError: argument(2) to filter() must be a sequence type
> 
> The 4XSLT error looks like a possible bug, but the Saxon output is
> just plain puzzling. Where is 'US-ASCII' coming from? I edit the
> source in EUC-JP, then convert it to UTF-8 or UTF-16 (same results
> either way) using iconv.
> 
> So, can anybody give me a clue? Any leads would be much appreciated.
> 
> Matt Gushee
> 
> 
> ---- i18n.xsl ---------------------------------------------
> 
> <?xml version="1.0"?>
> <!-- None of the commentings-out made any difference -->
> <xsl:stylesheet version="1.0"
>   xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
> 
>   <xsl:param name="currentLanguage" select="'en'"/>
> 
>   <xsl:variable name="charEncoding">
>     <xsl:choose>
>       <xsl:when test="$currentLanguage='en'">iso-8859-1</xsl:when>
>       <xsl:when test="$currentLanguage='ja'">euc-jp</xsl:when>
>       <xsl:otherwise>utf-8</xsl:otherwise>
>     </xsl:choose>
>   </xsl:variable>
> 
>   <xsl:output method="html" encoding="$charEncoding"/>
> 
>   <xsl:template match="/">
>     <xsl:apply-templates/>
>   </xsl:template>
> 
>   <!-- <xsl:template match="*[lang($currentLanguage) or not(@xml:lang)]"> -->
>   <xsl:template match="*[lang($currentLanguage)]">
>     <xsl:copy>
>       <!-- <xsl:for-each select="@*[name() != 'id']"> -->
>       <xsl:for-each select="@*">
> 	<xsl:copy/>
>       </xsl:for-each>
>       <xsl:apply-templates/>
>     </xsl:copy>
>   </xsl:template>
>     
> </xsl:stylesheet>
> 
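Two things in the stylesheet above may be worth checking independently of the input encoding. In XSLT 1.0 the encoding attribute of xsl:output is not an attribute value template, so encoding="$charEncoding" names an encoding literally called "$charEncoding". The 1.0 spec (section 5.3) also makes a variable reference inside a match pattern an error, and some processors enforce that. Below is a minimal sketch that avoids both. It assumes a single fixed output encoding (utf-8) is acceptable, which is my assumption rather than anything stated in the post, and it moves the language test out of the match pattern into a select expression:

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:param name="currentLanguage" select="'en'"/>

  <!-- Assumption: one fixed output encoding; the value is read literally. -->
  <xsl:output method="html" encoding="utf-8"/>

  <!-- Copy every node, but descend only into children that either carry
       no xml:lang attribute of their own or are in the selected language. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <xsl:apply-templates
        select="node()[not(self::*[@xml:lang]) or lang($currentLanguage)]"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

With currentLanguage=ja this sketch drops every child element marked xml:lang="en". Since main.html puts xml:lang="en" on body, the whole body would go too unless that attribute is removed from the source.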
> 
> --- main.html [pre-conversion: euc-jp encoding] --------------
> 
> <?xml version="1.0" encoding="UTF-16"?>
> <!--
> <!DOCTYPE html PUBLIC
>   "-//W3C//DTD XHTML 1.1//EN"
>   "/usr/local/share/xml/xhtml/xhtml11.dtd"
> >
> -->
> <html xmlns="http://www.w3.org/1999/xhtml"
>   version="-//W3C//DTD XHTML 1.1//EN"
>   xml:lang="en">
>   <head>
>     <title>Welcome</title>
>   </head>
> 
>   <body xml:lang="en">
>     <h1 xml:lang="en">Welcome</h1>
>     <h1 xml:lang="ja">ようこそ</h1>
>     <hr xmlns="http://www.w3.org/1999/xhtml"/>
>     <p xml:lang="en">
> The Kaiwa Club is an informal group for people who want to practice
> Japanese conversation. We welcome members at all levels of
> proficiency.
> </p>
>     <p xml:lang="ja">
> 会話倶楽部は日本語の会話を練習したい人のためのインフォーマルなグループで
> ございます。レベルはかかわらず、新しい会員を大歓迎しております。
> </p>
>   </body>
> </html>
> 
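One thing to check on the input side: the XML declaration has to agree with the bytes that iconv actually writes. The listing above declares UTF-16, so a copy converted to UTF-8 would need a different first line. A cut-down sketch of what the UTF-8 variant might look like (the content is trimmed for illustration and is not the full file):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Declaration matches the output of: iconv -f EUC-JP -t UTF-8 -->
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
  <body>
    <h1 xml:lang="en">Welcome</h1>
    <h1 xml:lang="ja">ようこそ</h1>
  </body>
</html>
```

If the declared and actual encodings disagree, a conforming parser should refuse the document before the stylesheet is ever applied, so this is worth ruling out first.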
>  XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
> 




