Re: [xsl] Apache Xalan 2.2 for Java problems with Unicode

Subject: Re: [xsl] Apache Xalan 2.2 for Java problems with Unicode
From: "Rob Lugt" <roblugt@xxxxxxxxx>
Date: Thu, 9 Aug 2001 12:41:45 +0100
Jamie King wrote:

> I'm trying to transform an XML file (encoded in UTF-8) using Apache's Xalan
> 2.2 package for Java.  It gives me the following exception:
>
> javax.xml.transform.TransformerException: An invalid XML character (Unicode:
> 0xfc) was found in the element content of the document.
>
> Has anyone experienced this?  Unicode 0xFC is a lowercase 'u' with an umlaut
> (ü).  It works fine when I remove those characters.  Is there a way to set
> the encoding for the Transformer object in Java or something like that?

Jamie, I have no experience with Apache's Xalan, so I'm unable to test my
hypothesis...

You are correct that U+00FC is a valid XML character.  I'd be surprised if
the XML parser you are using (Xerces?) complained about it.  Is it possible
that the error message is misleading and the encoding of your input file is
wrong?  You say the file is encoded in UTF-8.  The UTF-8 representation of
U+00FC is the two-byte sequence 0xC3 0xBC.  Is this what you have in your
file?  A single byte 0xFC would suggest the file was actually saved as
ISO-8859-1 (Latin-1), and a lone 0xFC is not valid UTF-8.
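
If it helps to check the bytes, here is a minimal sketch in plain Java (nothing
Xalan-specific) that prints how U+00FC is encoded in UTF-8 and in ISO-8859-1.
The class name UmlautBytes and the helper hex() are made up for illustration,
and StandardCharsets assumes a reasonably recent JDK.

```java
import java.nio.charset.StandardCharsets;

public class UmlautBytes {
    // Hypothetical helper: format bytes as space-separated upper-case hex.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            // Mask to 0..255 so negative bytes print as two hex digits.
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String u = "\u00FC"; // lowercase u with umlaut

        // Two bytes in UTF-8, a single byte in ISO-8859-1.
        System.out.println("UTF-8:      " + hex(u.getBytes(StandardCharsets.UTF_8)));
        System.out.println("ISO-8859-1: " + hex(u.getBytes(StandardCharsets.ISO_8859_1)));
    }
}
```

This should print "C3 BC" for UTF-8 and "FC" for ISO-8859-1.  Comparing that
with a hex dump of the input file should show which encoding the file really
uses.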

Regards
~Rob

--
Rob Lugt
ElCel Technology
http://www.elcel.com/



 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list