Re: [xsl] Better Way to Group Siblings By Start/End Markers?

From: "G. Ken Holman" <gkholman@xxxxxxxxxxxxxxxxxxxx>
Date: Mon, 23 Jun 2008 20:23:23 -0400

At 2008-06-23 17:04 -0500, Eliot Kimber wrote:
A Word field is organized as a sequence of w:r elements within a larger
sequence of w:r elements. A field start is indicated by a w:r with a field
start indicator and the field end is indicated by another w:r with a field
end indicator. The w:r elements between these two marker elements comprise
the field data, which can be any number of things, including w:r elements
that would easily occur outside the scope of the field (e.g., w:r containing
literal document content).

Here is a typical sample:

      <w:r>
        <w:t xml:space="preserve">-  </w:t>
      </w:r>
      <w:r
        w:rsidR="00BA1D13">
        <w:fldChar
          w:fldCharType="begin"/>
      </w:r>
      <w:r
        w:rsidR="00BA1D13">
        <w:instrText>HYPERLINK "http://www.example.com/";</w:instrText>
      </w:r>
      <w:r
        w:rsidR="00BA1D13">
        <w:fldChar
          w:fldCharType="separate"/>
      </w:r>
      <w:r
        w:rsidRPr="00B233E5">
        <w:t>HTTP</w:t>
      </w:r>
      <w:r
        w:rsidR="00BA1D13">
        <w:fldChar
          w:fldCharType="end"/>
      </w:r>

From what I can tell, your sample above contains only one group. I couldn't test your code (or mine) against a single group, so I mocked up a second one. My mockup is probably not right, though, because I couldn't get your code to work with two groups among the siblings.


Or I haven't understood something.

I have this for-each-group that seems to group correctly, but I'm wondering
if there's a simpler expression that does what I want:

<xsl:for-each-group select="w:r"
group-adjacent="
string(self::*[w:fldChar[@w:fldCharType = 'begin' or @w:fldCharType =
'end']] or
(self::*[preceding-sibling::*/w:fldChar[@w:fldCharType = 'begin']] and
self::*[following-sibling::*/w:fldChar[@w:fldCharType = 'end']] and
count((self::*[preceding-sibling::*/w:fldChar[@w:fldCharType =
'begin']])[1]/(*[following-sibling::*/w:fldChar[@w:fldCharType = 'end']])[1]
|
(self::*[following-sibling::*/w:fldChar[@w:fldCharType = 'end']])[1]) = 1
))
   "
>

I've copied your code into my example below to use as a control.


In prose (at least this is what I intend the above expression to mean): if
w:r has child w:fldChar where @w:fldCharType = 'begin' or 'end' or w:r has
both a preceding sibling w:r with a w:fldChar of type 'begin' and a
following sibling w:r with a w:fldChar of type 'end' AND the nearest
preceding sibling field start has the same nearest following sibling field
end as the current node, then return the grouping "true" else return the
grouping key "false".

Whew.

That is prose for what is implemented, but I'm not sure what it is you want.


I can't think of a simpler way to say this. Is there one?

Do you mean to say "I want all adjacent <w:r> siblings between, and including, those marking the start and the end of a field"?


If the fields do not overlap, and if I've interpreted your intent, then I think my code is a simpler way of implementing what is desired.
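
For comparison, and under the same assumption that fields do not overlap, the XSLT 2 grouping attributes group-starting-with and group-ending-with can probably express the same intent. The following is only a sketch and is not the code used in the transcript in this message: it reuses the mocked-up test wrapper and the urn:x-w namespace from the sample data, and it has not been checked against real Word output.

```xml
<?xml version="1.0" encoding="US-ASCII"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:w="urn:x-w"
                version="2.0">

<xsl:output indent="yes"/>

<xsl:template match="test">
  <!-- Outer pass: a new group begins at every field-start marker.
       Anything before the first marker forms a leading group. -->
  <xsl:for-each-group select="w:r"
       group-starting-with="w:r[w:fldChar/@w:fldCharType='begin']">
    <!-- Inner pass: cut each outer group just after the field-end marker,
         separating the field from any runs that trail it. -->
    <xsl:for-each-group select="current-group()"
         group-ending-with="w:r[w:fldChar/@w:fldCharType='end']">
      <!-- Only an inner group that opens with a 'begin' marker is a field. -->
      <xsl:if test="current-group()[1]/w:fldChar/@w:fldCharType='begin'">
        <xsl:comment>Here starts a group:</xsl:comment><xsl:text>
</xsl:text>
        <xsl:copy-of select="current-group()"/>
        <xsl:comment>Here ends a group</xsl:comment><xsl:text>
</xsl:text>
      </xsl:if>
    </xsl:for-each-group>
  </xsl:for-each-group>
</xsl:template>

</xsl:stylesheet>
```

Like the key-based approach, this reports only the runs that belong to a field. Runs outside any field land in inner groups that do not open with a 'begin' marker, so an xsl:otherwise branch could process them if they are needed.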

I realize I could factor some of the complexity of the expression out into a
function or two, which I will probably do.

My approach could be expanded slightly to work with XSLT 1, but I've taken a shortcut and used an XSLT 2 technique for the comparison (testing the attribute against the sequence ('begin','end')).
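
One way the XSLT 1 expansion might look is sketched here. It is an assumption about what "expanded slightly" would involve, not code from the original run: XPath 1.0 has no sequence literals, so the test against ('begin','end') in the wrs key is spelled out with "or", and the stylesheet version is changed to 1.0. The key name, the test wrapper and the urn:x-w namespace are the ones used in the transcript in this message.

```xml
<?xml version="1.0" encoding="US-ASCII"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:w="urn:x-w"
                version="1.0">

<xsl:output indent="yes"/>

<xsl:template match="test">
  <!-- One iteration per field-start marker; the key returns its members. -->
  <xsl:for-each select="w:r[w:fldChar/@w:fldCharType='begin']">
    <xsl:comment>Here starts a group:</xsl:comment><xsl:text>
</xsl:text>
    <xsl:copy-of select="key('wrs',generate-id(.))"/>
    <xsl:comment>Here ends a group</xsl:comment><xsl:text>
</xsl:text>
  </xsl:for-each>
</xsl:template>

<!--
     Lookup value: the generated identifier of the field-start w:r that
     governs this w:r, or the empty string when the w:r is outside a field.
     A w:r is governed by a start marker when it is that marker itself, or
     when its nearest preceding marker (start or end) is a start marker.
-->
<xsl:key name="wrs" match="w:r"
 use="generate-id(
        (self::w:r[w:fldChar/@w:fldCharType='begin'] |
         preceding-sibling::w:r[w:fldChar/@w:fldCharType='begin' or
                                w:fldChar/@w:fldCharType='end'][1]
                               [w:fldChar/@w:fldCharType='begin'])
        [last()])"/>

</xsl:stylesheet>
```

As with the XSLT 2 key, this relies on fields not overlapping or nesting: the nearest preceding marker is taken to decide whether a run is inside a field.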


I'm guessing my two-group mockup is wrong, because your code does not split it into two groups. If you take a moment to post a data set that has more than one group, I would like to try my code on it.

I hope this helps.

. . . . . . . . . . . . Ken

t:\ftemp>type eliot.xml
<?xml version="1.0" encoding="US-ASCII"?>
<test xmlns:w="urn:x-w">
      <w:r>
        <w:t xml:space="preserve">-  </w:t>
      </w:r>
      <w:r
        w:rsidR="00BA1D13">
        <w:fldChar
          w:fldCharType="begin"/>
      </w:r>
      <w:r
        w:rsidR="00BA1D13">
        <w:instrText>HYPERLINK "http://www.example.com/";</w:instrText>
      </w:r>
      <w:r
        w:rsidR="00BA1D13">
        <w:fldChar
          w:fldCharType="separate"/>
      </w:r>
      <w:r
        w:rsidRPr="00B233E5">
        <w:t>HTTP</w:t>
      </w:r>
      <w:r
        w:rsidR="00BA1D13">
        <w:fldChar
          w:fldCharType="end"/>
      </w:r>

      <w:r>
        <w:t xml:space="preserve">-  </w:t>
      </w:r>
      <w:r
        w:rsidR="00BA1D14">
        <w:fldChar
          w:fldCharType="begin"/>
      </w:r>
      <w:r
        w:rsidR="00BA1D14">
        <w:instrText>HYPERLINK "ftp://www.example.com/";</w:instrText>
      </w:r>
      <w:r
        w:rsidR="00BA1D14">
        <w:fldChar
          w:fldCharType="separate"/>
      </w:r>
      <w:r
        w:rsidRPr="00B233E5">
        <w:t>FTP</w:t>
      </w:r>
      <w:r
        w:rsidR="00BA1D14">
        <w:fldChar
          w:fldCharType="end"/>
      </w:r>
    </test>

t:\ftemp>type eliot.xsl
<?xml version="1.0" encoding="US-ASCII"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:w="urn:x-w"
                version="2.0">

<xsl:output indent="yes"/>

<xsl:template match="test">
  <xsl:comment>Eliot's code:</xsl:comment><xsl:text>
</xsl:text>

<xsl:for-each-group select="w:r"
group-adjacent="
string(self::*[w:fldChar[@w:fldCharType = 'begin' or @w:fldCharType =
'end']] or
(self::*[preceding-sibling::*/w:fldChar[@w:fldCharType = 'begin']] and
self::*[following-sibling::*/w:fldChar[@w:fldCharType = 'end']] and
count((self::*[preceding-sibling::*/w:fldChar[@w:fldCharType =
'begin']])[1]/(*[following-sibling::*/w:fldChar[@w:fldCharType = 'end']])[1]
|
(self::*[following-sibling::*/w:fldChar[@w:fldCharType = 'end']])[1]) = 1
))
   "
>
  <xsl:comment>Here starts a group:</xsl:comment><xsl:text>
</xsl:text>
  <xsl:copy-of select="current-group()"/>
  <xsl:comment>Here ends a group</xsl:comment><xsl:text>
</xsl:text>
</xsl:for-each-group>

  <xsl:comment>Ken's code:</xsl:comment><xsl:text>
</xsl:text>

<xsl:for-each select="w:r[w:fldChar/@w:fldCharType='begin']">
  <xsl:comment>Here starts a group:</xsl:comment><xsl:text>
</xsl:text>
  <xsl:copy-of select="key('wrs',generate-id(.))"/>
  <xsl:comment>Here ends a group</xsl:comment><xsl:text>
</xsl:text>
</xsl:for-each>

</xsl:template>

<!--
     This key table has all w:r elements, but only those that are in a
     group have a non-empty-string lookup value, the lookup value being
     the generated identifier of the w:r element that starts the group.
-->
<xsl:key name="wrs" match="w:r"
 use="generate-id( (self::*[w:fldChar/@w:fldCharType=('begin','end')][1]
                           [w:fldChar/@w:fldCharType='begin'] |
      preceding-sibling::w:r[w:fldChar/@w:fldCharType=('begin','end')][1]
                           [w:fldChar/@w:fldCharType='begin'])
                   [last()])"/>

</xsl:stylesheet>

t:\ftemp>call xslt2 eliot.xml eliot.xsl eliot.out

t:\ftemp>type eliot.out
<?xml version="1.0" encoding="UTF-8"?>
<!--Eliot's code:-->
<!--Here starts a group:-->
<w:r xmlns:w="urn:x-w">
        <w:t xml:space="preserve">-  </w:t>
      </w:r>
<!--Here ends a group-->
<!--Here starts a group:-->
<w:r xmlns:w="urn:x-w" w:rsidR="00BA1D13">
        <w:fldChar w:fldCharType="begin"/>
      </w:r>
<w:r xmlns:w="urn:x-w" w:rsidR="00BA1D13">
        <w:instrText>HYPERLINK "http://www.example.com/";</w:instrText>
      </w:r>
<w:r xmlns:w="urn:x-w" w:rsidR="00BA1D13">
        <w:fldChar w:fldCharType="separate"/>
      </w:r>
<w:r xmlns:w="urn:x-w" w:rsidRPr="00B233E5">
        <w:t>HTTP</w:t>
      </w:r>
<w:r xmlns:w="urn:x-w" w:rsidR="00BA1D13">
        <w:fldChar w:fldCharType="end"/>
      </w:r>
<w:r xmlns:w="urn:x-w">
        <w:t xml:space="preserve">-  </w:t>
      </w:r>
<w:r xmlns:w="urn:x-w" w:rsidR="00BA1D14">
        <w:fldChar w:fldCharType="begin"/>
      </w:r>
<w:r xmlns:w="urn:x-w" w:rsidR="00BA1D14">
        <w:instrText>HYPERLINK "ftp://www.example.com/";</w:instrText>
      </w:r>
<w:r xmlns:w="urn:x-w" w:rsidR="00BA1D14">
        <w:fldChar w:fldCharType="separate"/>
      </w:r>
<w:r xmlns:w="urn:x-w" w:rsidRPr="00B233E5">
        <w:t>FTP</w:t>
      </w:r>
<w:r xmlns:w="urn:x-w" w:rsidR="00BA1D14">
        <w:fldChar w:fldCharType="end"/>
      </w:r>
<!--Here ends a group-->
<!--Ken's code:-->
<!--Here starts a group:-->
<w:r xmlns:w="urn:x-w" w:rsidR="00BA1D13">
        <w:fldChar w:fldCharType="begin"/>
      </w:r>
<w:r xmlns:w="urn:x-w" w:rsidR="00BA1D13">
        <w:instrText>HYPERLINK "http://www.example.com/";</w:instrText>
      </w:r>
<w:r xmlns:w="urn:x-w" w:rsidR="00BA1D13">
        <w:fldChar w:fldCharType="separate"/>
      </w:r>
<w:r xmlns:w="urn:x-w" w:rsidRPr="00B233E5">
        <w:t>HTTP</w:t>
      </w:r>
<w:r xmlns:w="urn:x-w" w:rsidR="00BA1D13">
        <w:fldChar w:fldCharType="end"/>
      </w:r>
<!--Here ends a group-->
<!--Here starts a group:-->
<w:r xmlns:w="urn:x-w" w:rsidR="00BA1D14">
        <w:fldChar w:fldCharType="begin"/>
      </w:r>
<w:r xmlns:w="urn:x-w" w:rsidR="00BA1D14">
        <w:instrText>HYPERLINK "ftp://www.example.com/";</w:instrText>
      </w:r>
<w:r xmlns:w="urn:x-w" w:rsidR="00BA1D14">
        <w:fldChar w:fldCharType="separate"/>
      </w:r>
<w:r xmlns:w="urn:x-w" w:rsidRPr="00B233E5">
        <w:t>FTP</w:t>
      </w:r>
<w:r xmlns:w="urn:x-w" w:rsidR="00BA1D14">
        <w:fldChar w:fldCharType="end"/>
      </w:r>
<!--Here ends a group-->

t:\ftemp>rem Done!



--
Upcoming XSLT/XSL-FO hands-on courses:      Wellington, NZ 2009-01
World-wide corporate, govt. & user group XML, XSL and UBL training
RSS feeds:     publicly-available developer resources and training
G. Ken Holman                 mailto:gkholman@xxxxxxxxxxxxxxxxxxxx
Crane Softwrights Ltd.          http://www.CraneSoftwrights.com/s/
Box 266, Kars, Ontario CANADA K0A-2E0    +1(613)489-0999 (F:-0995)
Male Cancer Awareness Nov'07  http://www.CraneSoftwrights.com/s/bc
Legal business disclaimers:  http://www.CraneSoftwrights.com/legal