Re: [xsl] mixed content grouping by whitespace

Subject: Re: [xsl] mixed content grouping by whitespace
From: "G. Ken Holman" <gkholman@xxxxxxxxxxxxxxxxxxxx>
Date: Sun, 11 Apr 2010 20:19:54 -0400
At 2010-04-11 21:17 +0200, Imsieke, Gerrit, le-tex wrote:
On 11.04.2010 19:43, David Carlisle wrote:
> On 11/04/2010 17:37, James Cummings wrote:
> group-starting-with="text()">
>
>
> probably that wants to be text()[not(normalize-space(.))]
>
> David

Couldn't get the desired result like that.
I applied a two-step process:
1. Mark up whitespace using intermediate <seg @type="sep"> </seg>;
2. group adjacent WS (and non-WS) nodes, put the non-WS groups in a newly created w element.

When I completed my solution I found it almost identical to yours, except that I used group-starting-with=. I also could not think of a way to do it in one pass.


. . . . . . . . . . Ken

T:\ftemp>type cummings.xml
<ab xmlns="http://www.tei-c.org/ns/1.0">
   <seg>foo blort wibble</seg><lb/>
   <seg>foo-<m>blort</m> wibble</seg><lb/>
   <seg><w>foo</w>-<m>blort</m> wibble</seg><lb/>
</ab>

T:\ftemp>xslt2 cummings.xml cummings.xsl
<?xml version="1.0" encoding="UTF-8"?><ab xmlns="http://www.tei-c.org/ns/1.0">
   <seg><w>foo</w> <w>blort</w> <w>wibble</w></seg><lb/>
   <seg><w>foo-<m>blort</m></w> <w>wibble</w></seg><lb/>
   <seg><w><w>foo</w>-<m>blort</m></w> <w>wibble</w></seg><lb/>
</ab>
T:\ftemp>type cummings.xsl
<?xml version="1.0" encoding="US-ASCII"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:t="http://www.tei-c.org/ns/1.0"
                xmlns:my="urn:X-temp"
                xmlns="http://www.tei-c.org/ns/1.0"
                exclude-result-prefixes="t my"
                version="2.0">

<xsl:template match="t:seg">
  <xsl:copy>
    <xsl:copy-of select="@*"/>
    <!--break up the text nodes in order to be grouped-->
    <xsl:variable name="content" as="node()*">
      <xsl:apply-templates mode="translate"/>
    </xsl:variable>
    <!--now group the content-->
    <xsl:for-each-group select="$content"
                        group-starting-with="my:text">
      <xsl:copy-of select="self::my:text/node()"/>
      <w>
        <xsl:copy-of select="current-group()[not(self::my:text)]"/>
      </w>
    </xsl:for-each-group>
  </xsl:copy>
</xsl:template>

<xsl:template match="text()" mode="translate" priority="1">
  <xsl:analyze-string select="." regex="\s+">
    <xsl:matching-substring>
      <!--encapsulate the white-space so it can be detected and recreated-->
      <my:text><xsl:value-of select="."/></my:text>
    </xsl:matching-substring>
    <xsl:non-matching-substring>
      <xsl:value-of select="."/>
    </xsl:non-matching-substring>
  </xsl:analyze-string>
</xsl:template>

<!--identity template for both modes; children are processed in the default
    mode, so only text directly inside seg is split on white-space-->
<xsl:template match="@*|node()" mode="#default translate">
  <xsl:copy>
    <xsl:apply-templates select="@*|node()"/>
  </xsl:copy>
</xsl:template>

</xsl:stylesheet>

T:\ftemp>
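
For comparison, here is a sketch of how the grouping step might look with group-adjacent= in place of group-starting-with=. It is an illustration only and is not taken from any stylesheet posted in this thread. It reuses the my:text marker and the translate mode from cummings.xsl above; the boolean grouping key is my own choice. For cummings.xml it should give the same result as the stylesheet above.

```xml
<?xml version="1.0" encoding="US-ASCII"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:t="http://www.tei-c.org/ns/1.0"
                xmlns:my="urn:X-temp"
                xmlns="http://www.tei-c.org/ns/1.0"
                exclude-result-prefixes="t my"
                version="2.0">

<xsl:template match="t:seg">
  <xsl:copy>
    <xsl:copy-of select="@*"/>
    <!--first pass: wrap each run of white-space in a my:text marker-->
    <xsl:variable name="content" as="node()*">
      <xsl:apply-templates mode="translate"/>
    </xsl:variable>
    <!--second pass: the key is true for markers, false for everything else-->
    <xsl:for-each-group select="$content"
                        group-adjacent="boolean(self::my:text)">
      <xsl:choose>
        <xsl:when test="current-grouping-key()">
          <!--put back the original white-space between words-->
          <xsl:copy-of select="current-group()/node()"/>
        </xsl:when>
        <xsl:otherwise>
          <!--a run of non-white-space nodes becomes one word-->
          <w>
            <xsl:copy-of select="current-group()"/>
          </w>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each-group>
  </xsl:copy>
</xsl:template>

<xsl:template match="text()" mode="translate" priority="1">
  <xsl:analyze-string select="." regex="\s+">
    <xsl:matching-substring>
      <my:text><xsl:value-of select="."/></my:text>
    </xsl:matching-substring>
    <xsl:non-matching-substring>
      <xsl:value-of select="."/>
    </xsl:non-matching-substring>
  </xsl:analyze-string>
</xsl:template>

<!--identity template for everything else, in both modes-->
<xsl:template match="@*|node()" mode="#default translate">
  <xsl:copy>
    <xsl:apply-templates select="@*|node()"/>
  </xsl:copy>
</xsl:template>

</xsl:stylesheet>
```

One difference in behaviour: when a seg ends with white-space, the group-starting-with= version creates a final group holding only the marker and so writes an empty w element after it, whereas this version creates a w only for groups that contain non-white-space nodes.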

--
Crane Softwrights Ltd.          http://www.CraneSoftwrights.com/s/
G. Ken Holman                 mailto:gkholman@xxxxxxxxxxxxxxxxxxxx
