Subject: Re: [xsl] output to iso-8859-1 of non-iso characters, what is required action
From: "bryan rasmussen" <rasmussen.bryan@xxxxxxxxx>
Date: Wed, 7 May 2008 17:08:15 +0200
> don't mix 'characters' with 'bytes'. iso-8859-1 is a codepage that assigns
> a number of characters to certain bytes in the range of 0-255.
>
> In XML a character may be displayed in different ways, all perfectly legal:
> A, &#65; &#x41;

Yes, was there anything in the question that implied otherwise?

> I seem to remember that it is totally up to the processor to select a
> method. If you use Saxon there are special options to control that behaviour
> (if you prefer native bytes, decimal or hex entities).

OK. But from reading the spec it seems to me that if you don't specify a method, the processor has to choose one for you automatically when outputting text nodes in an XML document (personally I think it should do the same in comment nodes; I'm not sure why it was decided not to), but it must always fail on text output.

> Dropping characters is never an option.

Why not? If I want to go from UTF-8 to ISO 8859-1 for some reason, the low-level way would be to write something that goes through every byte, checks whether it is in range, and removes it if not. For text output from XML it would be nice if declaring the output encoding in my stylesheet gave me this behavior. But it doesn't, so text output in XSL 1 isn't useful here, because translate() is a poor solution for something that a declarative solution should handle well.

I declare that I have something in encoding x and I want something in encoding y. If I also declare that XML output is required, the processor finds a solution for me. If I declare text output, it seems to think there is no possible solution, whereas the common solution is to remove or replace what isn't allowed. In that context, failing doesn't seem very good.

> If you want that you could easily
> filter using translate() to remove all unwanted characters from text nodes.

Given that a translate() (in XSL 1) of all non-iso-8859-1 characters to an empty string is easy, do you think you could send me one? :)

Cheers,
Bryan Rasmussen
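For comparison, here is roughly what the "remove or replace what isn't allowed" behaviour looks like outside XSLT. This is only a sketch in Python, not anything an XSLT processor provides: `to_latin1` and its `on_unmappable` parameter are names made up for the illustration, and the four modes are just the standard error handlers of `str.encode`. Note that the check is per character (code point), not per byte of the UTF-8 input.

```python
def to_latin1(text: str, on_unmappable: str = "drop") -> bytes:
    """Encode text as ISO 8859-1, choosing what happens to characters
    above U+00FF, which the encoding cannot represent.

    Hypothetical helper for illustration only.
    """
    handlers = {
        "drop": "ignore",                # silently remove the character
        "replace": "replace",            # substitute '?'
        "charref": "xmlcharrefreplace",  # emit a decimal character reference
        "fail": "strict",                # raise UnicodeEncodeError
    }
    return text.encode("iso-8859-1", errors=handlers[on_unmappable])


# U+00E9 (e acute) exists in ISO 8859-1; U+20AC (euro sign) does not.
sample = "caf\u00e9 \u20ac5"

dropped = to_latin1(sample, "drop")
replaced = to_latin1(sample, "replace")
escaped = to_latin1(sample, "charref")
```

The "charref" mode corresponds to what happens for text nodes with XML output, "fail" to what happens with text output, and "drop" and "replace" are the two options I would like to be able to ask for declaratively.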