Re: [xsl] Multiple search and replace

Subject: Re: [xsl] Multiple search and replace
From: Abel Braaksma <abel.online@xxxxxxxxx>
Date: Wed, 02 Apr 2008 08:32:38 +0200

Hi Pankaj,

See my comments below.


Pankaj Chaturvedi wrote:
Hi all,

I am trying to define multiple search and replace operations in a stylesheet.

First thought: consider using XSLT 2.0, which has search and replace built in. Its replace() function handles regular-expression search and replace in a single call.


Basically trying to convert [#x02010] (and other Unicode values) to their
corresponding values &#x02010; .

Second thought: consider using XSLT 2.0. Getting the numeric value of a character can be done using string-to-codepoints(), which is not available in XSLT 1.0. Second thought (b): sorry, I see that you mean the literal string '[#...]'....


Below is what I am trying to do:

<snip />

I have two questions in this regard:

1. I am forced to write & as &amp; because XMLSpy gives the error "character is
grammatically unexpected". Is there another way of overcoming this issue and
getting & in the output?

Not doing something because your tool limits you is very dangerous... However, in this case, XMLSpy is correct. XSLT *is* XML, and XML does not allow a literal, unescaped & in content: it has to be written as &amp;. However, if you output as text (method="text"), the serializer will output a literal '&' wherever you put &amp;.
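
A minimal sketch of the text-output route, assuming the rest of your stylesheet can live with method="text" (which drops all markup from the output, so it only fits if you need plain text). The fragment goes inside your xsl:stylesheet element, and the template body is only an illustration:

<xsl:output method="text" />

<xsl:template match="/">
   <!-- &amp; is the XML spelling of a single '&'; with text output it is written unescaped -->
   <xsl:text>&amp;#x02010;</xsl:text>
</xsl:template>

With text output this writes the nine characters &#x02010; literally. With method="xml" the same template would write &amp;#x02010; instead.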


Third thought: use XSLT 2.0. It has the ability to add character maps. In a character map you can say that some character, say '$' (but using something from the Unicode Private Use Area is recommended), is mapped to some string when the result is serialized. Using character maps you can get a literal '&' in the output.

2. I also need to replace "]" with ";", for which I was trying to call
another template within <xsl:template match="text()"> as below, but it doesn't
seem to be working.

<snip />
Can we do multiple search and replaces in one named template, or do I need to
define them all separately? (I need to call all of them in one template,
<xsl:template match="text()">.)

XSLT is a functional language, and XSLT 1.0 has no built-in replace function. You will have to write a named template that calls itself recursively. I believe there's an example on the exslt.org site which shows how you can do this for a multiple search and replace in a generic way.
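
To give an idea, a recursive replace in XSLT 1.0 could look roughly like the sketch below. The template name "replace-all" and its parameter names are made up for this illustration (this is not the EXSLT code), and the fragment goes inside your xsl:stylesheet element. Note that it blindly turns every ']' into ';', and that the '&' it produces is still escaped as &amp; by the serializer unless you output as text:

<xsl:template name="replace-all">
   <xsl:param name="text" />
   <xsl:param name="search" />
   <xsl:param name="replacement" />
   <xsl:choose>
      <!-- guard against an empty search string, which would recurse forever -->
      <xsl:when test="$search != '' and contains($text, $search)">
         <xsl:value-of select="substring-before($text, $search)" />
         <xsl:value-of select="$replacement" />
         <!-- recurse on whatever follows the first match -->
         <xsl:call-template name="replace-all">
            <xsl:with-param name="text" select="substring-after($text, $search)" />
            <xsl:with-param name="search" select="$search" />
            <xsl:with-param name="replacement" select="$replacement" />
         </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
         <xsl:value-of select="$text" />
      </xsl:otherwise>
   </xsl:choose>
</xsl:template>

<xsl:template match="text()">
   <!-- first pass: the opening bracket and hash become ampersand and hash -->
   <xsl:variable name="pass1">
      <xsl:call-template name="replace-all">
         <xsl:with-param name="text" select="." />
         <xsl:with-param name="search" select="'[#'" />
         <xsl:with-param name="replacement" select="'&amp;#'" />
      </xsl:call-template>
   </xsl:variable>
   <!-- second pass, on the result of the first: ']' becomes ';' -->
   <xsl:call-template name="replace-all">
      <xsl:with-param name="text" select="$pass1" />
      <xsl:with-param name="search" select="']'" />
      <xsl:with-param name="replacement" select="';'" />
   </xsl:call-template>
</xsl:template>

Each additional search and replace means another pass chained onto the previous result, which is why the generic approach gets verbose quickly in 1.0.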


Fourth thought: use XSLT 2.0. All you'll end up with then is a nested replace(replace(....)) call.
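
For the two replacements in your question, such a nested call could look something like this. It is only a sketch: it uses U+E000 as a stand-in for '&' and so assumes the character map from the full solution further down, and it blindly turns every ']' into ';', not only the ones that close a '[#...' sequence:

<xsl:template match="text()">
   <!-- inner call: '[#' becomes placeholder plus '#'; outer call: ']' becomes ';' -->
   <xsl:sequence select="replace(replace(., '\[#', '&#xE000;#'), '\]', ';')" />
</xsl:template>

The square brackets are escaped with a backslash because replace() treats its second argument as a regular expression.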

Fifth thought: use XSLT 2.0 for the whole shebang. Your whole solution will look like this:

<xsl:output use-character-maps="searchreplace" />

<xsl:character-map name="searchreplace">
   <xsl:output-character character="&#xE000;" string="&amp;" />
</xsl:character-map>

<xsl:template match="text()">
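   <!-- match the literal '[#x....]' with hex digits; groups are referenced as $1 in XSLT 2.0 -->
   <xsl:sequence select="replace(., '\[#x([0-9A-Fa-f]+)\]', '&#xE000;#x$1;')" />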
</xsl:template>


Sixth thought: use XSLT 2.0. You seem to be using XMLSpy, which can handle XSLT 2.0. However, its engine is a bit flaky. If you run into problems, consider using either Gestalt XSLT 2.0 or Saxon XSLT 2.0 processors.


Note: you may think that putting &amp; inside the string attribute of xsl:output-character creates &amp; in the output, but this is not true. Since XSLT is XML, you must write &amp; there to mean a single '&', and that single '&' is what ends up in the output. Getting the mapping to serialize to a literal &amp; instead means double escaping: "&amp;amp;" (but that is not what you are after here). Understanding the implications of using character references in XML is vital for headache-free working with XML and XSLT (plus all other XML-related technologies, in fact), but it can be hard at times to get it right in your head.
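
To make the difference concrete, here is a small sketch with two mappings side by side. The map name "escapes" and the second placeholder U+E001 are only assumptions for this illustration:

<xsl:character-map name="escapes">
   <!-- the attribute value is a single '&', so the output contains: & -->
   <xsl:output-character character="&#xE000;" string="&amp;" />
   <!-- the attribute value is the five characters '&amp;', so the output contains: &amp; -->
   <xsl:output-character character="&#xE001;" string="&amp;amp;" />
</xsl:character-map>

The XML parser removes one level of escaping when it reads the stylesheet, and the serializer does not escape the replacement strings of a character map again.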

Hope this helps,

Cheers,
-- Abel Braaksma
