Re: [xsl] strange encoding problem

Subject: Re: [xsl] strange encoding problem
From: Gregory Murphy <Gregory.Murphy@xxxxxxxxxxx>
Date: Fri, 1 Nov 2002 12:10:33 -0800 (PST)

On Fri, 1 Nov 2002, Andreas Schildbach wrote:

> i've got an utf-8 encoded xml file (test.xml) with an umlaut character, like
> this:
> 
> <?xml version="1.0" encoding="UTF-8"?>
> <a>ue</a> <!-- this is an &uuml; not the two chars u and e -->
> 
> [...] 
> 
> when i use tomcat, jsp and the jstl (java standard tag library) to apply the
> transformation
> 
> <%@ taglib prefix="x" uri="http://java.sun.com/jstl/xml" %>
> <c:import url="test.xml" var="xml"/>
> <c:import url="test.xsl" var="xsl"/>
> <x:transform xml="${xml}" xslt="${xsl}"/>
> 
> the result is &Atilde;&frac14;
> which is NOT correct in my opinion.


The following hack might help you work around the problem. Redefine the
named character entity so that it expands to a numeric character reference.
In other words, make the prolog of your XML look something like

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html [
  <!ENTITY uuml "&#x00FC;">
]>
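
As a rough check of the declaration above, here is a minimal Java sketch that
parses a small document using it and prints the code point that &uuml; expands
to. The class name, the test document, and the use of the JDK's default JAXP
parser are assumptions made for illustration; the DOCTYPE names "a" only
because that is the root element in the original post.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class EntityRedefinitionDemo {
    // Hypothetical test document: pure ASCII, with uuml declared in the
    // internal DTD subset as a numeric character reference.
    static final String XML =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
      + "<!DOCTYPE a [\n"
      + "  <!ENTITY uuml \"&#x00FC;\">\n"
      + "]>\n"
      + "<a>&uuml;</a>";

    // Parse the document and return the text content of the root element.
    static String parseText() throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        byte[] bytes = XML.getBytes(StandardCharsets.US_ASCII);
        Document doc = builder.parse(new ByteArrayInputStream(bytes));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String text = parseText();
        // Print the code point rather than the character itself, so the
        // console encoding cannot distort the result.
        System.out.printf("U+%04X%n", (int) text.charAt(0));
    }
}
```

Because the file itself contains only ASCII bytes, the parser reaches U+00FC
without depending on how those bytes were decoded along the way.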

I have found that, in general, numeric character references survive
repeated processing better than the HTML named entity references do.
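
One plausible reading of the symptom (an assumption on my part, not something
confirmed in this thread) is that the UTF-8 bytes of the umlaut get decoded as
ISO-8859-1 somewhere between the import and the transform. The sketch below,
with a hypothetical class and method name, shows that such a mix-up turns
U+00FC into exactly the two characters reported (&Atilde;&frac14;), while the
ASCII-only text of a numeric reference passes through unchanged.

```java
import java.nio.charset.StandardCharsets;

public class MisdecodeDemo {
    // Encode a string as UTF-8, then decode those bytes as ISO-8859-1,
    // simulating a component that guesses the wrong charset.
    static String utf8ReadAsLatin1(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Render each character as a code point so the console encoding
    // does not affect what is shown.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            sb.append(String.format("U+%04X ", (int) c));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // Raw u-umlaut: two UTF-8 bytes (C3 BC) become two Latin-1 characters.
        System.out.println(codePoints(utf8ReadAsLatin1("\u00FC")));
        // Numeric reference: all ASCII, so the wrong decoder changes nothing.
        System.out.println(utf8ReadAsLatin1("&#x00FC;"));
    }
}
```

The first line printed is U+00C3 U+00BC, that is A-tilde followed by the
one-quarter sign; the second is the reference text, unchanged.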

// Gregory Murphy <Gregory.Murphy@xxxxxxx>
// Software Engineer
// Customer Network Platform, Sun Microsystems


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list