Re: [xsl] > replaced by ">", < is not replaced...

Subject: Re: [xsl] > replaced by ">", < is not replaced...
From: Abel Braaksma <abel.online@xxxxxxxxx>
Date: Fri, 13 Jul 2007 14:38:47 +0200
Jethro Borsje wrote:
> With 'removed' I mean: "replaced by the actual character". So "&#160;" becomes a " " (SPACE)

No, it becomes a NBSP (encoded as x00A0 in UTF-16, or xA0 in ISO-8859-1). It looks like a space, but it is not one. If your processor or serializer turns the character into a plain space (i.e., x0020 in UTF-16, or x20 in ISO-8859-1, UTF-8 and many other US-ASCII compatible encodings), then that is a bug in your serialization software and you should report it to its makers.
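To see the difference for yourself, here is a small sketch in Python (not part of the original question; any XML parser should behave the same way). It only uses the standard library, and the sample element `<p>` is made up for the illustration.

```python
import xml.etree.ElementTree as ET

nbsp = "\u00a0"   # NO-BREAK SPACE, the character that &#160; refers to
space = "\u0020"  # ordinary SPACE

# The parser resolves &#160; to the NBSP character, not to a space.
text = ET.fromstring("<p>a&#160;b</p>").text
print(text == "a" + nbsp + "b")    # True
print(text == "a" + space + "b")   # False

# The same character in three encodings.
print(nbsp.encode("utf-16-be").hex())   # 00a0
print(nbsp.encode("iso-8859-1").hex())  # a0
print(nbsp.encode("utf-8").hex())       # c2a0
```

If your output shows x20 where the input had &#160;, something in the chain replaced the character rather than just changing how it is written.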


> in the output, "&gt;" becomes a ">". This is not what I want. Not because of visualization in an HTML client, but because I have to be able to map the output of the XSL back to some original input which contains the "&gt;", "&#160;", etc. I basically just want to keep the character encodings from the input in the output.


Aha, now we're getting at the root of the problem. You need to compare the two. If you compare XML documents, or HTML documents, the comparison software should be able to deal with numeric character references (and named entity references) and treat ">" as equal to "&gt;", "&#x3E;" and "&#062;".
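As a rough illustration of what "treat them as equal" means, the following Python sketch parses four spellings of the same content and compares the resulting text. The element name `t` and the sample strings are invented for the example.

```python
import xml.etree.ElementTree as ET

# Four ways to write the same character data in XML.
forms = [
    "<t>a > b</t>",       # literal character
    "<t>a &gt; b</t>",    # predefined entity
    "<t>a &#x3E; b</t>",  # hexadecimal character reference
    "<t>a &#062; b</t>",  # decimal character reference
]

# After parsing, only the character itself is left.
texts = [ET.fromstring(f).text for f in forms]
all_equal = len(set(texts)) == 1
print(all_equal)
```

Once the documents have been parsed, the original spelling is gone, which is also why an XSLT processor cannot simply "keep" it.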


If your software cannot make that kind of comparison, you can try several things, of which the following come to mind:

1. Post-process the output to get your original character entities back
2. Use XSLT 2 instead and use character maps to force the character entities
3. Use different comparison software that is capable of reading XML the proper way (you mention HTML, but from your code it seems that you create XML)
4. Load and save the original XML through your XSLT processor twice, once with your transformation and once with a plain copy-of; both results are then serialized the same way, which makes a byte-by-byte comparison easier (but still not solid)
5. Use XSLT to do the comparison (XSLT 2 is preferred: just use the deep-equal() function if equality is all you are after)
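To give an idea of what an XML-aware comparison (in the spirit of options 3 and 4) could look like outside XSLT, here is a minimal Python sketch. It is not deep-equal(), just a stand-in that brings both documents into canonical form before comparing them. It assumes Python 3.8 or later for `canonicalize`; the helper name `same_xml` and the sample documents are hypothetical.

```python
from xml.etree.ElementTree import canonicalize

# Hypothetical input and transformation output: same content,
# different spellings of ">" and of the NBSP character.
original = "<doc><p>1 &gt; 0</p><p>a&#160;b</p></doc>"
output = "<doc><p>1 > 0</p><p>a\u00a0b</p></doc>"

# A document where the NBSP was wrongly turned into a plain space.
broken = "<doc><p>1 > 0</p><p>a b</p></doc>"

def same_xml(a, b):
    """Compare two XML strings after canonical serialization (C14N 2.0)."""
    return canonicalize(xml_data=a) == canonicalize(xml_data=b)

print(same_xml(original, output))
print(same_xml(original, broken))
```

The first comparison succeeds because both documents contain the same characters; the second fails because a space is a different character from NBSP.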


I cannot recommend anything other than options 3 and 5; all the others are workarounds that will, over time, make your software weak or at least vulnerable to bugs.

HTH,

Cheers,
-- Abel Braaksma
