Subject: Re: [xsl] using xsl:message with UTF-8 characters
From: "Andrew Welch" <andrew.j.welch@xxxxxxxxx>
Date: Mon, 23 Apr 2007 13:19:35 +0100
> However, this won't solve your problem with xsl:message
> (sorry), because Saxon seems to emit the messages of
> xsl:message and fn:trace using Latin-1 encoding or similar
> (I believe it would be nicer if Saxon output UTF-8, but
> maybe this is Sun Java's problem, not Saxon's).
I've done some further investigation, and it seems that Saxon 8.9 isn't actually working as designed here (it happens to be fixed in my current development build, which confused matters). Java decides which encoding to use for the output, and in my tests it chooses CP1252 when running in my IDE (IntelliJ). Saxon should then find out from Java which encoding is in use and replace all characters outside that encoding with XML character references. But in 8.9.0.3 the escaping of characters outside the character set supported by the output writer isn't happening: I will fix this. My example was poorly chosen because the three characters ªº€ can all be represented in CP1252.
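The escaping step described above can be sketched roughly as follows. This is only an illustration of the general technique using the standard `java.nio.charset` API, not Saxon's actual code; the class name `MessageEscaper` and the method `escapeUnencodable` are made up for the example.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

public class MessageEscaper {

    /**
     * Hypothetical helper (not part of Saxon): returns the text with every
     * code point that the given charset cannot encode replaced by a
     * hexadecimal XML character reference such as &#x3A9;.
     */
    static String escapeUnencodable(String text, Charset charset) {
        CharsetEncoder encoder = charset.newEncoder();
        StringBuilder out = new StringBuilder(text.length());
        int i = 0;
        while (i < text.length()) {
            int codePoint = text.codePointAt(i);
            int width = Character.charCount(codePoint); // 2 for surrogate pairs
            String piece = text.substring(i, i + width);
            if (encoder.canEncode(piece)) {
                out.append(piece);
            } else {
                out.append("&#x")
                   .append(Integer.toHexString(codePoint).toUpperCase())
                   .append(';');
            }
            i += width;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Assumes the JDK provides windows-1252 (CP1252), as common JDKs do.
        Charset cp1252 = Charset.forName("windows-1252");
        // U+00AA, U+00BA and U+20AC exist in CP1252; U+03A9 (Greek Omega) does not.
        String sample = "\u00AA\u00BA\u20AC \u03A9";
        System.out.println(escapeUnencodable(sample, cp1252));
    }
}
```

With CP1252 as the target, the first three characters pass through unchanged and only the Omega is turned into a character reference, which is why those three characters make a poor test case.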
I don't know how good Java is at getting the encoding right, for example whether it will use a different encoding if you use configuration options such as "cmd /u" identified by Abel. I'll do some experiments.
As far as I know, Java will use the "platform default encoding" unless told otherwise, which on a Windows machine with a Western European locale is CP1252.
If you want to set the default encoding to something else from the command line, then you can use the "file.encoding" system property, eg:
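A small sketch for checking what the JVM has actually picked, and for writing a message in a fixed encoding regardless of the default. The class name `EncodingCheck` is invented for illustration, and the flag form `-Dfile.encoding=UTF-8` mentioned below is an assumption about the kind of invocation meant here.

```java
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;

public class EncodingCheck {

    /** Reports the file.encoding property and the charset the JVM settled on. */
    static String describeDefaults() {
        return "file.encoding=" + System.getProperty("file.encoding")
                + ", defaultCharset=" + Charset.defaultCharset().name();
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        System.out.println(describeDefaults());

        // Wrap stdout so the bytes written are UTF-8 whatever the default is.
        // Whether the console renders them correctly is a separate matter.
        PrintStream utf8Out = new PrintStream(System.out, true, "UTF-8");
        utf8Out.println("\u00AA\u00BA\u20AC"); // U+00AA, U+00BA, U+20AC
    }
}
```

Running it once as `java EncodingCheck` and once as `java -Dfile.encoding=UTF-8 EncodingCheck` should show whether the override takes effect on a given JVM. Newer JDKs may already default to UTF-8, in which case both runs report the same charset.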
cheers
andrew