Re: [xsl] match string

Subject: Re: [xsl] match string
From: Anton Triest <anton@xxxxxxxx>
Date: Thu, 21 Oct 2004 23:59:31 +0200
Michael Kay wrote:

(The first one is strange: is text() really a function? And even then, why is "para//text()[1]" a valid pattern and "para(//text())[1]" isn't?)


Because para() isn't a function.


Of course... that was so obvious, that I didn't see it :-)

In 2.0 you could do match="text()[. is ancestor::para/descendant::text()[1]]".

In 1.0 you could do the same using generate-id() or count(.|x) for the identity test.

Yes! Thanks, that's exactly what I wanted.

So here's an updated stylesheet (Zsolt: I added some explanation below):

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="utf-8" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:param name="split" select="3"/>


   <!-- identity template: copy all elements -->
   <xsl:template match="*">
       <xsl:copy>
           <xsl:copy-of select="@*"/>
           <xsl:apply-templates/>
       </xsl:copy>
   </xsl:template>

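   <!-- override on the first text descendant of a para
        (the ancestor::para guard keeps text outside any para from matching,
        since count(.|x)=1 is also true when x is empty) -->
   <xsl:template match="text()[ancestor::para][count(.|ancestor::para/descendant::text()[1])=1]">
       <xsl:call-template name="split-words"/>
   </xsl:template>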


   <!-- wrap a <first> node around the first 3 words of a string -->
   <xsl:template name="split-words">
       <xsl:param name="count" select="0"/>
       <xsl:param name="str1" select="''"/>
       <xsl:param name="str2" select="normalize-space(.)"/>
       <xsl:choose>
           <xsl:when test="$count = $split">
               <first><xsl:value-of select="$str1"/></first>
               <xsl:value-of select="$str2"/>
           </xsl:when>
           <xsl:otherwise>
               <xsl:choose>
                   <xsl:when test="contains($str2,' ')">
                       <xsl:call-template name="split-words">
                           <xsl:with-param name="count" select="$count + 1"/>
                           <xsl:with-param name="str1" select="concat($str1,substring-before($str2,' '),' ')"/>
                           <xsl:with-param name="str2" select="substring-after($str2,' ')"/>
                       </xsl:call-template>
                   </xsl:when>
                   <xsl:otherwise>
                       <xsl:call-template name="split-words">
                           <xsl:with-param name="count" select="$split"/>
                           <xsl:with-param name="str1" select="concat($str1,$str2)"/>
                           <xsl:with-param name="str2" select="''"/>
                       </xsl:call-template>
                   </xsl:otherwise>
               </xsl:choose>
           </xsl:otherwise>
       </xsl:choose>
   </xsl:template>
</xsl:stylesheet>


The first template is the identity template[9]: if this were the only template in the stylesheet, the output would be a copy of the input tree (elements, attributes and text, but without comment nodes or processing instructions, and without whitespace-only text nodes because of xsl:strip-space).
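
For comparison, here is a sketch of the more common form of the identity template, which also copies comments and processing instructions. It is not what the stylesheet above uses; it is shown only to illustrate the difference:

   <xsl:template match="@*|node()">
       <xsl:copy>
           <xsl:apply-templates select="@*|node()"/>
       </xsl:copy>
   </xsl:template>

Its default priority (-0.5) is lower than that of a pattern with a predicate (0.5), so a text()[...] override like the one above would still win.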

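The second template overrides the identity transform for the particular nodes you want to change: the first text node among the descendants of a para element. Seen from a text node, that is "ancestor::para/descendant::text()[1]"; call it $thatnode for short (only as shorthand: XSLT 1.0 does not allow variable references in a match pattern). You can't use the ancestor axis directly in a match pattern, but you can inside the [ ] predicate. In XSLT 2.0 you can say match="text()[. is $thatnode]" (is this node the same node as $thatnode?), but not in 1.0, where you have to use one of the two tricks that are also used in the Muenchian grouping technique[2]. I used "text()[count(.|$thatnode)=1]"; the other way is "text()[generate-id(.)=generate-id($thatnode)]". Note that count(.|$thatnode)=1 is also true when $thatnode is empty, i.e. for a text node with no para ancestor, so the count() form needs an extra [ancestor::para] predicate if such text nodes exist; the generate-id() form is simply false in that case.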
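
Spelled out, the generate-id() variant of the second template would look roughly like this (a sketch; it should behave like the count() version for text nodes inside a single, non-nested para):

   <xsl:template match="text()[generate-id(.) = generate-id(ancestor::para/descendant::text()[1])]">
       <xsl:call-template name="split-words"/>
   </xsl:template>

generate-id() of an empty node-set is the empty string, so this pattern never matches a text node that has no para ancestor.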

The third is a recursive template. It is called without passing any parameters, so each one defaults to the value of its select attribute. In the starting state, $count is 0, $str1 is empty and $str2 holds the complete (whitespace-normalized) text. On each call the template moves the first word of $str2 to the end of $str1 and calls itself again, until the number of words to isolate is reached: test="$count = $split". If the text runs out of words first, the last call sets $count to $split directly so that the recursion stops.
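
To make the recursion concrete, here is a hand trace for a made-up text node "The quick brown fox jumps" with the default $split of 3 (the sample text is only an illustration):

   call 1:  $count=0  $str1=''                  $str2='The quick brown fox jumps'
   call 2:  $count=1  $str1='The '              $str2='quick brown fox jumps'
   call 3:  $count=2  $str1='The quick '        $str2='brown fox jumps'
   call 4:  $count=3  $str1='The quick brown '  $str2='fox jumps'

The fourth call satisfies $count = $split and writes:

   <first>The quick brown </first>fox jumps

The space after the third word ends up inside <first>, because each step appends the word plus a space to $str1.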

Best,
Anton


[9] http://www.dpawson.co.uk/xsl/sect2/identity.html
[2] http://www.jenitennison.com/xslt/grouping/muenchian.html
