Re: [xsl] &gt; replaced by ">", &lt; is not replaced...

Subject: Re: [xsl] &gt; replaced by ">", &lt; is not replaced...
From: Abel Braaksma <abel.online@xxxxxxxxx>
Date: Fri, 13 Jul 2007 12:50:47 +0200
Jethro Borsje wrote:

Actually I want all the HTML things to be preserved. Other things that are removed by the transformation are things like &#160;, all of which I want to keep.

What do you mean by "removed"? Depending on the output method and encoding you choose, the No-Break Space (U+00A0) is serialized either as a numeric character reference (&#160; or &#xA0;) or as an encoded character (in UTF-16 that is the byte pair 00 A0 or A0 00, depending on the byte order). If the output encoding cannot represent the character (US-ASCII, or its ISO 646 equivalent, cannot represent the NBSP), it is written as a numeric character reference. The only exception is the text output method (which you don't seem to be using): there, a character that cannot be represented in the chosen encoding makes serialization fail with an error.
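To make the serialization choices concrete, here is a small sketch in Python rather than XSLT (my choice of language, only because it is easy to run). It assumes nothing beyond the standard library; the element name p and the helper encode_strict_ascii are made up for illustration. It is an analogy to what an XSLT serializer does, not a description of any particular processor.

```python
import xml.etree.ElementTree as ET

NBSP = "\u00a0"  # U+00A0 NO-BREAK SPACE

# Serialized as an encoded character: the two UTF-16 byte orders.
utf16_be = NBSP.encode("utf-16-be")  # bytes 00 A0
utf16_le = NBSP.encode("utf-16-le")  # bytes A0 00

# An encoding that cannot represent U+00A0: fall back to a
# numeric character reference instead of the raw character.
ascii_ref = NBSP.encode("ascii", "xmlcharrefreplace")


def encode_strict_ascii(text):
    """Hypothetical helper: mimic a 'text' style output with no escape
    mechanism. Returns None where a serializer would raise an error."""
    try:
        return text.encode("ascii")
    except UnicodeEncodeError:
        return None


# The same effect through an XML serializer: one tree, two encodings.
p = ET.Element("p")
p.text = "a" + NBSP + "b"
xml_ascii = ET.tostring(p, encoding="us-ascii")  # uses a character reference
xml_utf8 = ET.tostring(p, encoding="utf-8")      # uses the encoded character

print(utf16_be, utf16_le, ascii_ref)
print(encode_strict_ascii(NBSP))
print(xml_ascii)
print(xml_utf8)
```

In both XML outputs the character is still there, it is just written differently. Nothing has been "removed" from the tree.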


Note that for any XML or HTML client, it does not matter at all how a particular character is serialized, as long as it is consistent with the declared encoding. In the case of HTML, Michael Kay already pointed out that, according to the spec, > "should be" encoded as &gt;. Nevertheless, to the best of my knowledge, no HTML client in existence has trouble interpreting a bare > character correctly.
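A quick way to convince yourself of this, again sketched in Python with the standard library XML parser standing in for "any client" (that stand-in is my assumption, as are the sample strings): parse the different spellings and compare what the parser hands back.

```python
import xml.etree.ElementTree as ET

# Three spellings of the same No-Break Space in the source document.
decimal_ref = ET.fromstring("<p>a&#160;b</p>").text
hex_ref = ET.fromstring("<p>a&#xA0;b</p>").text
literal = ET.fromstring("<p>a\u00a0b</p>").text

# Escaped and bare greater-than sign in element content.
escaped_gt = ET.fromstring("<p>1 &gt; 0</p>").text
raw_gt = ET.fromstring("<p>1 > 0</p>").text

print(decimal_ref == hex_ref == literal)
print(escaped_gt == raw_gt)
```

The parser reports identical text in every case, so the choice between the spellings only affects how the file looks in an editor.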

Cheers,
-- Abel Braaksma
