Subject: Re: [xsl] Character 150 withs Windows-1252 output
From: "andrew welch" <andrew.j.welch@xxxxxxxxx>
Date: Fri, 21 Apr 2006 13:56:13 +0100

On 4/21/06, Michael Kay <mike@xxxxxxxxxxxx> wrote:
> > Why is it that #150 gets escaped when using Windows-1252
> > output encoding when it should contain that character?
>
> Because there is no character in the Windows-1252 character set that
> corresponds to the Unicode character with codepoint 150.

Yes, thanks. That makes sense now.

The thing I'm struggling with now is this:

This source XML:

<?xml version="1.0" encoding="Windows-1252" ?>
<foo>&#150;–</foo>

With this stylesheet:

<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output encoding="US-ASCII"/>
  <xsl:template match="/">
    <xsl:copy-of select="."/>
  </xsl:template>
</xsl:stylesheet>

Gives this result:

<foo>&#150;&#8211;</foo>

I've checked the input file with a hex editor to make sure the
un-escaped dash really is the byte 0x96.

Somehow the two characters are treated differently, which I didn't
expect. I think that 0x96 in input XML read as Windows-1252 should
become #8211 when output in any encoding other than Windows-1252. That
is what happens for the actual byte 0x96, but the character reference
#150 gets serialised back as #150...

Any thoughts?
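
For what it's worth, the same split between the two characters can be reproduced outside XSLT. Below is a minimal Python sketch, assuming Python's cp1252 codec uses the same mapping as the Windows-1252 decoding in the XML parser. The helper name to_ascii_xml is made up for illustration, and this is not what any particular XSLT processor does internally. It only shows that the raw byte 0x96 and the character reference #150 end up as different Unicode code points before serialisation starts.

```python
# Raw byte 0x96, read as Windows-1252, decodes to U+2013 (en dash).
raw = b"\x96".decode("cp1252")

# A numeric character reference names a Unicode code point directly,
# independent of the document's encoding declaration: &#150; is U+0096.
ref = chr(150)


def to_ascii_xml(text):
    """Hypothetical helper: serialise as US-ASCII, escaping everything
    outside ASCII as a decimal character reference."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


print(hex(ord(raw)), to_ascii_xml(raw))  # 0x2013 &#8211;
print(hex(ord(ref)), to_ascii_xml(ref))  # 0x96 &#150;

# U+0096 has no slot in Windows-1252, so it cannot be written there
# unescaped either; a serialiser has to fall back to a reference.
try:
    ref.encode("cp1252")
    encodable = True
except UnicodeEncodeError:
    encodable = False
print(encodable)  # False
```

If that assumption holds, the output above would be consistent with the parser treating #150 as code point 150 rather than as byte 0x96 in the declared encoding.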