Re: [xsl] mixed content grouping by whitespace

Subject: Re: [xsl] mixed content grouping by whitespace
From: "Imsieke, Gerrit, le-tex" <gerrit.imsieke@xxxxxxxxx>
Date: Tue, 13 Apr 2010 00:52:44 +0200
On 12.04.2010 11:37, James Cummings wrote:

      <xsl:for-each-group select="$sep/node()"
        group-adjacent="boolean(self::tei:seg[@type='sep'])">

This groups the nodes in the variable you've created by the boolean (so the truth or falsehood of whether the pattern matches? I didn't know you could do that in a group-* pattern) of the existence of the segs you've created on tei:seg/text() which mark the whitespace.

There are two flavours of grouping conditions: patterns and expressions. group-starting-with and group-ending-with require patterns, while group-by and group-adjacent accept any XPath expression. The expressions are evaluated for each item of the so-called population to calculate its grouping key; the patterns match specific nodes in the population that will start or end a group.
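To make the difference concrete, here is a minimal sketch that puts the two flavours side by side. It is not from the stylesheet under discussion: the tei:p context, the tei:lb and tei:hi tests, and the <line> and <run> output elements are assumptions chosen purely for illustration.

    <xsl:stylesheet version="2.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:tei="http://www.tei-c.org/ns/1.0">

      <xsl:template match="tei:p">
        <!-- Pattern flavour: every node in the population that matches
             tei:lb starts a new group (hypothetical example). -->
        <xsl:for-each-group select="node()" group-starting-with="tei:lb">
          <line>
            <xsl:copy-of select="current-group()[not(self::tei:lb)]"/>
          </line>
        </xsl:for-each-group>

        <!-- Expression flavour: the expression is evaluated once per item;
             adjacent items with the same key end up in the same group. -->
        <xsl:for-each-group select="node()"
            group-adjacent="boolean(self::tei:hi)">
          <run hi="{current-grouping-key()}">
            <xsl:copy-of select="current-group()"/>
          </run>
        </xsl:for-each-group>
      </xsl:template>

    </xsl:stylesheet>

Note that current-grouping-key() is only meaningful for the expression flavours (group-by and group-adjacent); with group-starting-with and group-ending-with there is no key to ask for.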


For all nodes except those marked up as whitespace in our example, evaluating self::tei:seg[@type='sep'] yields the empty sequence. Since the empty sequence cannot be used as a grouping key for group-adjacent [1], its boolean value is calculated, which is false for the empty sequence [2]. I could have used empty() instead of boolean(), which would just flip each node's true()/false() key. In that case, I would have to either swap the actions of the "when" and "otherwise" branches accordingly, or leave them in place and use test="not(current-grouping-key())".
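For illustration, the empty() variant might look roughly like this. It is only a sketch of the alternative just described, a fragment rather than a complete stylesheet: it assumes the $sep variable and the tei prefix binding from earlier in the thread, and it completes the choose with the closing tags that the quoted fragments leave out.

      <xsl:for-each-group select="$sep/node()"
        group-adjacent="empty(self::tei:seg[@type='sep'])">
        <xsl:choose>
          <!-- The key is now true() for everything that is NOT a
               whitespace seg, so the two branches change places. -->
          <xsl:when test="current-grouping-key()">
            <w xmlns="http://www.tei-c.org/ns/1.0">
              <xsl:apply-templates select="current-group()"/>
            </w>
          </xsl:when>
          <xsl:otherwise>
            <xsl:value-of select="current-group()"/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each-group>

The grouping itself is identical either way; only the meaning of a true() key is inverted.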

In the word wrap example, it's a matter of taste whether to use group-starting-with or group-adjacent. But try to tackle the group-adjacent example given in the spec [3] using group-starting-with (or group-ending-with), and you'll find yourself writing all kinds of complicated lookaheads and lookbehinds that for-each-group promised to liberate you from. The same holds for trying to solve group-starting-with problems using group-adjacent. There's a reason THey created all 4 forms of for-each-group. And THey saw it was good.

Gerrit

[1] http://www.w3.org/TR/xslt20/#err-XTTE1100
[2] http://www.w3.org/TR/xpath-functions/#func-boolean
[3] http://www.w3.org/TR/xslt20/#d5e21264


        <xsl:choose>
          <xsl:when test="current-grouping-key()">
            <xsl:value-of select="current-group()" />
          </xsl:when>

When it is one of those whitespace segs, then just put out the value of the whitespace, temporary element vanishes.

          <xsl:otherwise>
            <w xmlns="http://www.tei-c.org/ns/1.0">
              <xsl:apply-templates select="current-group()"/>
            </w>
          </xsl:otherwise>

Otherwise, wrap it in a word element.
