RE: [xsl] Using XSLT to add markup to a document

Subject: RE: [xsl] Using XSLT to add markup to a document
From: "McNally, David" <David.McNally@xxxxxxxxxx>
Date: Mon, 7 Jul 2003 11:25:54 -0400
I'm coming late to this thread, and I think pretty much everything has been
said, but I have a question.

I notice that most XSLT 1.0 solutions use explicitly recursive named
templates, and I wondered whether there is any benefit in a solution that
instead re-applies templates to the text on either side of each match.  For
moderately big files it seems to perform quite well, though I haven't tested
it much, so there may be hidden problems.  I guess this approach is
essentially the same as recursion, but you don't have to figure out which
matched string comes first (though you do need a node-set() extension
function; I use the MSXML one below).


Mark_up_text.xml:

<node>
	<para>
This is a sample document that deals with markup of <emph>text</emph>.
</para>
	<para> When one applies <emph>markup</emph> to a large document, one
is faced with 
a <def>time-consuming</def> effort.
</para>
	<para att="document markup">lkj markup kjlkj document lkj document
;lkj markup lkj;slakfj markup document</para>
</node>


Mark_up_text.xsl:


<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:msxml="urn:schemas-microsoft-com:xslt">

<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

<xsl:template match="text()" priority="2"><!-- need priority to overcome the
node match below -->
	<xsl:call-template name="markup">
		<xsl:with-param name="text" select="."/>
	</xsl:call-template>
</xsl:template>
	
<xsl:template match="node()|@*">
	<xsl:copy>
		<xsl:apply-templates select="@*"/>
		<xsl:apply-templates/>
	</xsl:copy>
</xsl:template>	

<xsl:template name="markup">
	<xsl:param name="text"/>
	<xsl:choose>
		<xsl:when test="contains($text, 'document')">
			<xsl:apply-templates
select="msxml:node-set(substring-before($text,'document'))"/>
			<xsl:element name="special">document</xsl:element>
			<xsl:apply-templates
select="msxml:node-set(substring-after($text,'document'))"/>
		</xsl:when>
		<xsl:when test="contains($text, 'markup')">
			<xsl:apply-templates
select="msxml:node-set(substring-before($text,'markup'))"/>
			<xsl:element name="special">markup</xsl:element>
			<xsl:apply-templates
select="msxml:node-set(substring-after($text,'markup'))"/>
		</xsl:when>
		<xsl:otherwise><xsl:value-of select="$text"/></xsl:otherwise>
	</xsl:choose>
</xsl:template>

</xsl:stylesheet>



And my output is:


<node>
<para>
This is a sample <special>document</special> that deals with
<special>markup</special> of <emph>text</emph>. </para>
<para> When one applies <emph><special>markup</special></emph> to a large
<special>document</special>, one is faced with 
a <def>time-consuming</def> effort.
</para>
<para att="document markup">lkj <special>markup</special> kjlkj
<special>document</special> lkj <special>document</special> ;lkj
<special>markup</special> lkj;slakfj <special>markup</special>
<special>document</special></para>
</node>
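
For comparison, here is a minimal sketch of the same split written with
xsl:call-template, so that no node-set() extension is needed.  It assumes the
same two hard-coded words and the same <special> wrapper as the stylesheet
above; only the recursion mechanism differs, and I have not timed it against
the apply-templates version.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" encoding="UTF-8"/>

  <!-- Identity copy for everything that is not a text node. -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Text nodes are handed to the named template as plain strings.
       The priority is needed to win over the node() match above. -->
  <xsl:template match="text()" priority="2">
    <xsl:call-template name="markup">
      <xsl:with-param name="text" select="string(.)"/>
    </xsl:call-template>
  </xsl:template>

  <!-- Split on the first occurrence of a word, wrap it, and recurse on
       the strings before and after it. No node-set conversion is needed
       because the recursion passes strings, not nodes. -->
  <xsl:template name="markup">
    <xsl:param name="text"/>
    <xsl:choose>
      <xsl:when test="contains($text, 'document')">
        <xsl:call-template name="markup">
          <xsl:with-param name="text"
              select="substring-before($text, 'document')"/>
        </xsl:call-template>
        <special>document</special>
        <xsl:call-template name="markup">
          <xsl:with-param name="text"
              select="substring-after($text, 'document')"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:when test="contains($text, 'markup')">
        <xsl:call-template name="markup">
          <xsl:with-param name="text"
              select="substring-before($text, 'markup')"/>
        </xsl:call-template>
        <special>markup</special>
        <xsl:call-template name="markup">
          <xsl:with-param name="text"
              select="substring-after($text, 'markup')"/>
        </xsl:call-template>
      </xsl:when>
      <!-- Neither word is left in this piece: emit it unchanged. -->
      <xsl:otherwise>
        <xsl:value-of select="$text"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

Since this uses only standard XSLT 1.0 instructions, it should run unchanged
in Saxon as well as MSXML.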


Thanks,
David.
--
David McNally            Moody's Investors Service
Software Engineer        99 Church St, NY NY 10007 
David.McNally@xxxxxxxxxx            (212) 553-7475 



> -----Original Message-----
> From: Jim Melton [mailto:jim.melton@xxxxxxx] 
> Sent: Thursday, July 03, 2003 4:28 PM
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Cc: jim.melton@xxxxxxx
> Subject: [xsl] Using XSLT to add markup to a document
> 
> 
> Gentlepeople,
> 
> I'm struggling with a problem that I fear isn't easily solved with XSLT,
> but there are many experts on this list who might be able to help.  The
> brief summary of my problem is that I want to find certain words that
> appear in paragraphs throughout a very large (XML) document and mark up
> those words without making any other changes to my document.
> 
> For example, consider a document with the following fragment:
> 
> <para>
> This is a sample document that deals with markup of 
> <emph>text</emph>. </para> <para> When one applies 
> <emph>markup</emph> to a large document, one is faced with 
> a <def>time-consuming</def> effort.
> </para>
> 
> If one of the words to which I wish to apply markup is "markup" and another
> is "document", then I would want the result to be something like this:
> 
> <para>
> This is a sample <special>document</special> that deals with 
> <special>markup</special> of <emph>text</emph>.
> </para>
> <para>
> When one applies <emph><special>markup</special></emph> to a large 
> <special>document</special>, one is faced with a 
> <def>time-consuming</def> 
> effort.
> </para>
> 
> As you see from this example, I want to *add* markup to the words I have
> found where they appear in my result tree, but copy everything else in my
> document to the output tree unchanged.
> 
> I tend to use Saxon (currently using 6.5.2) as my primary XSLT engine, but
> I also have Microsoft's MSXML 4.0 (and could undoubtedly find others if
> required to do so).
> 
> Any guidance or advice?
> 
> Many thanks,
>     Jim 
> ========================================================================
> Jim Melton --- Editor of ISO/IEC 9075-* (SQL)     Phone: +1.801.942.0144
> Oracle Corporation            Oracle Email: mailto:jim.melton@xxxxxxxxxx
> 1930 Viscounti Drive          Standards email: mailto:jim.melton@xxxxxxx
> Sandy, UT 84093-1063              Personal email: mailto:jim@xxxxxxxxxxx
> USA                                                Fax : +1.801.942.3345
> ========================================================================
> =  Facts are facts.  However, any opinions expressed are the opinions  =
> =  only of myself and may or may not reflect the opinions of anybody   =
> =  else with whom I may or may not have discussed the issues at hand.  =
> ========================================================================


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list



