Re: [xsl] Preserving numeric character entity reference

From: Abel Braaksma <abel.online@xxxxxxxxx>
Date: Wed, 14 Feb 2007 12:10:26 +0100
Kjetil Kjernsmo wrote:
> I can't use the us-ascii output encoding, as is usually suggested in cases like these, because the numeric character entity references are actually in the us-ascii range anyway (and we want UTF-8 for the rest).
>
> Is this doable?
>
> Are there any other obfuscation techniques that can be done easily in XSLT that you can suggest?


Hi Kjetil,

You didn't specify the version of XSLT you are using, but here are a few approaches that work with XSLT 2.0:

1. Use xsl:character-map to create literal strings like &#32; for space and &#x41; for A, without the XML serializer interfering with them.
2. Use strings like &amp;#32; if obfuscation is all you want, and make sure to translate them back. Resolving character entities never cascades: &amp;#32; stands for the string "&#32;", not for the character reference &#32;, so you need an extra translation step to get from &amp;#32; back to a space character.
3. Use CDATA sections. This also produces the literal string rather than the character, the same way as with (2).
4. Use string-to-codepoints() to get the codepoints of the string (which map directly to the numeric character references).
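
To make (1) and (4) concrete, here is a minimal sketch that combines them, assuming an XSLT 2.0 processor. The element name `email`, the map name `raw-amp`, and the choice of the private-use character U+E000 as a stand-in for a bare ampersand are illustrative assumptions, not something taken from your stylesheet.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" encoding="UTF-8" use-character-maps="raw-amp"/>

  <!-- Assumption: U+E000 never occurs in real content. The serializer
       writes it as a bare "&" instead of escaping it to "&amp;". -->
  <xsl:character-map name="raw-amp">
    <xsl:output-character character="&#xE000;" string="&amp;"/>
  </xsl:character-map>

  <!-- Hypothetical element: rewrite its text as decimal character references -->
  <xsl:template match="email">
    <email>
      <!-- Each codepoint N becomes U+E000 followed by "#N;" -->
      <xsl:value-of
          select="for $c in string-to-codepoints(.)
                  return concat('&#xE000;#', $c, ';')"
          separator=""/>
    </email>
  </xsl:template>

</xsl:stylesheet>
```

With an input of `<email>a@b</email>`, the element should be serialized as `<email>&#97;&#64;&#98;</email>`, while everything else in the output stays in plain UTF-8.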


I think (2) and (3) should also work with XSLT 1.0, but both require extra work in your obfuscating tool. If you "just" want your text translated to character entities, without the serializer (correctly) turning them back into readable US-ASCII characters, option (1) is the easiest and most extensible approach.
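
For the simplest form of option (1), a sketch along these lines may be enough, again assuming XSLT 2.0. The map name `obfuscate` and the two characters chosen are only examples; add one xsl:output-character per character you want written as a reference.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" encoding="UTF-8" use-character-maps="obfuscate"/>

  <!-- The string attribute is written to the output as-is, so
       "&amp;#64;" here ends up as the six characters &#64; in the result. -->
  <xsl:character-map name="obfuscate">
    <xsl:output-character character="@" string="&amp;#64;"/>
    <xsl:output-character character="." string="&amp;#46;"/>
  </xsl:character-map>

  <!-- Identity transform: copy the input through unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Note that a character map applies to every occurrence of the mapped character in text nodes and attribute values of the result, so this variant only fits if you want "@" and "." replaced everywhere.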

Cheers,
-- Abel Braaksma
