Michael Kay wrote:

>> (The first one is strange: is text() really a function? And
>> even then, why is "para//text()[1]" a valid pattern and
>> "para(//text())[1]" isn't?)
>
> Because para() isn't a function.

Of course... that was so obvious that I didn't see it :-)

> In 2.0 you could do match="text()[. is ancestor::para/descendant::text()[1]]".
> In 1.0 you could do the same using generate-id() or count(.|x) for the identity test.

Yes! Thanks, that's exactly what I wanted.
So here's an updated stylesheet (Zsolt: I added some explanation below):
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" version="1.0" encoding="utf-8" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- number of leading words to wrap in <first> -->
  <xsl:param name="split" select="3"/>

  <!-- identity template: copy all elements and their attributes -->
  <xsl:template match="*">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

  <!-- override on the first text descendant of a para; the
       [ancestor::para] guard is there because count(.|x)=1 is also
       true when x is empty, i.e. for text outside any para -->
  <xsl:template
    match="text()[ancestor::para][count(.|ancestor::para/descendant::text()[1])=1]">
    <xsl:call-template name="split-words"/>
  </xsl:template>

  <!-- wrap a <first> element around the first $split words of a string -->
  <xsl:template name="split-words">
    <xsl:param name="count" select="0"/>
    <xsl:param name="str1" select="''"/>
    <xsl:param name="str2" select="normalize-space(.)"/>
    <xsl:choose>
      <!-- enough words moved: emit the result -->
      <xsl:when test="$count = $split">
        <first><xsl:value-of select="$str1"/></first>
        <xsl:value-of select="$str2"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:choose>
          <!-- move the next word from $str2 to $str1 and recurse -->
          <xsl:when test="contains($str2,' ')">
            <xsl:call-template name="split-words">
              <xsl:with-param name="count" select="$count + 1"/>
              <xsl:with-param name="str1"
                select="concat($str1,substring-before($str2,' '),' ')"/>
              <xsl:with-param name="str2"
                select="substring-after($str2,' ')"/>
            </xsl:call-template>
          </xsl:when>
          <!-- no space left: move the remainder and stop -->
          <xsl:otherwise>
            <xsl:call-template name="split-words">
              <xsl:with-param name="count" select="$split"/>
              <xsl:with-param name="str1"
                select="concat($str1,$str2)"/>
              <xsl:with-param name="str2" select="''"/>
            </xsl:call-template>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>

The first template is the identity template[9]. If it were the only
template in the stylesheet, the output would be a copy of the input
tree (elements and attributes, plus text nodes via the built-in rule,
but without comment nodes or processing instructions).

The second template overrides the identity transform for the one kind
of node you want to change: the first text node among the descendants
of a para element. Seen from the text node, that is
"ancestor::para/descendant::text()[1]". You can't use the ancestor
axis in the steps of a match pattern, but you can inside a [ ]
predicate. In XSLT 2.0 you can say match="text()[. is $thatnode]" (is
this node the same node as $thatnode?). In 1.0 there is no "is"
operator, so you have to use one of the two tricks that are also used
in the Muenchian grouping technique[2]. Here $thatnode is only
shorthand for the path above, since 1.0 does not allow variable
references in a match pattern. I used
"text()[count(.|$thatnode)=1]"; the other way is
"text()[generate-id(.)=generate-id($thatnode)]".

The third is a recursive named template. It is first called without
any parameters, so each one defaults to the value of its select
attribute: $count is 0, $str1 is empty and $str2 holds the complete,
whitespace-normalized text. Each call moves the first word of $str2 to
the end of $str1 and then calls the template again, until the number of
words to isolate is reached (test="$count = $split"). If $str2 runs out
of spaces first, whatever is left is moved to $str1 and $count is set
to $split to end the recursion.
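
To make the recursion concrete, here is a made-up input document with
the parameter values of each call traced in a comment, assuming the
default $split of 3.

```xml
<!-- hypothetical input document -->
<doc>
  <para>The quick brown fox jumps</para>
</doc>

<!-- calls of split-words for the text node inside para:

     call  $count  $str1               $str2
     1     0       ''                  'The quick brown fox jumps'
     2     1       'The '              'quick brown fox jumps'
     3     2       'The quick '        'brown fox jumps'
     4     3       'The quick brown '  'fox jumps'

     In call 4 the test $count = $split is true, so the template
     writes $str1 inside a first element, followed by $str2. -->
```

Inside para the result is <first>The quick brown </first>fox jumps
(indentation aside). Note the trailing space that ends up inside
<first>, because each word is appended to $str1 together with a space.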
Best,
Anton
[9] http://www.dpawson.co.uk/xsl/sect2/identity.html
[2] http://www.jenitennison.com/xslt/grouping/muenchian.html