Subject: [xsl] Washington method
From: Dave Pawson <davep@xxxxxxxxxxxxx>
Date: Sun, 10 Apr 2011 09:45:44 +0100

Was: Processing two documents. which order?

Finally got my tiny mind round this one, and I believe it is worth spending some time explaining it.

Problem: Some text, preferably in XML, parts of which need to be marked up as XML in the output.

The approach: An external file contains the word list, as XML. The main input file contains the text needing marking up.

The 'word list' looks something like

    <x>
      <word>target:word</word>
      ...
    </x>

The XSLT contains the following

    <xsl:key name="words" match="word" use="."/>

Options: I wanted simply to do the markup, no more processing, hence the stylesheet had

    <xsl:template match="node()">
      <xsl:copy>
        <xsl:copy-of select="@*"/>
        <xsl:apply-templates/>
      </xsl:copy>
    </xsl:template>

If you want other processing then add templates as needed.

The work is done in this template

    <xsl:template match="text()[not(parent::a or parent::b or parent::c)]"
                  priority="2">
      <xsl:analyze-string select="." regex="[a-z][a-z\-:.]+">
        <xsl:matching-substring>
          <xsl:choose>
            <xsl:when test="key('words', ., doc('../props.xml'))">
              <tag>
                <xsl:value-of select="."/>
              </tag>
            </xsl:when>
            <xsl:otherwise>
              <xsl:value-of select="."/>
            </xsl:otherwise>
          </xsl:choose>
        </xsl:matching-substring>
        <xsl:non-matching-substring>
          <xsl:value-of select="."/>
        </xsl:non-matching-substring>
      </xsl:analyze-string>
    </xsl:template>

1. The regex should match any character group that *may* contain one of the wanted words. I had to include - : and . since the text contained those characters.

2. The 'tag' element is used to mark up matches. A candidate match occurs when the regex makes a hit, in the matching-substring element. A further selection is then made by looking the candidate up in the key (built from the external document). Only then does markup happen.

3. I needed to leave the text in some elements unmarked, hence the filter not(parent::a or parent::b or parent::c), which excludes the text of all these elements.

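For anyone wanting to try it, the pieces above can be assembled into a single stylesheet. This is only a sketch under assumptions: the key name 'words' is used throughout, the path ../props.xml is the one from the template above, and the global variable $word-list is my own addition (hypothetical name) so the external document is referenced in one place. The element names a, b, c and tag are the placeholders used in this post.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: the word list lives at ../props.xml, resolved
       relative to this stylesheet. $word-list is a hypothetical name. -->
  <xsl:variable name="word-list" select="doc('../props.xml')"/>

  <!-- Index every word element by its string value. -->
  <xsl:key name="words" match="word" use="."/>

  <!-- Copy everything that no other template handles. -->
  <xsl:template match="node()">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

  <!-- Text outside a, b and c: tokenise with the regex, then keep only
       the candidates that are found in the key. -->
  <xsl:template match="text()[not(parent::a or parent::b or parent::c)]"
                priority="2">
    <xsl:analyze-string select="." regex="[a-z][a-z\-:.]+">
      <xsl:matching-substring>
        <xsl:choose>
          <!-- The context item here is a string, not a node, so the
               third argument tells key() which document to search. -->
          <xsl:when test="key('words', ., $word-list)">
            <tag>
              <xsl:value-of select="."/>
            </tag>
          </xsl:when>
          <xsl:otherwise>
            <xsl:value-of select="."/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>

</xsl:stylesheet>
```

With a made-up word list holding target:word and line-height, an input of <p>Set line-height or target:word here.</p> should come out as <p>Set <tag>line-height</tag> or <tag>target:word</tag> here.</p>, while the same text inside an a, b or c element is copied unchanged. One thing to watch: because '.' is in the character class, a listed word immediately followed by a full stop is picked up as a candidate together with the stop and so misses the key lookup.
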
In hindsight, the method does not use character class subtraction; it was just the escaping needed (since I needed to match on word-nextword) that confused me.

Repetition against a parameter, for the case I had, took 15 minutes. Using this method, 4 seconds. In retrospect, it is a valuable addition to any toolkit IMHO.

Washington method? From David Carlisle of course :-) Thanks David.

--
regards

--
Dave Pawson
XSLT XSL-FO FAQ.
http://www.dpawson.co.uk