RE: [xsl] Better Way to Group Siblings By Start/End Markers?

Subject: RE: [xsl] Better Way to Group Siblings By Start/End Markers?
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Tue, 24 Jun 2008 00:04:03 +0100
Another possibility is to use xsl:for-each-group with group-starting-with.
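
A minimal sketch of that idea, not tested against real Word output. It
assumes the w:r elements are children of w:p, that fields are not nested,
and that every 'begin' has a matching 'end'. The output names para, field,
and run are placeholders, not INCX. A new group starts at each 'begin'
marker and at the first run after each 'end' marker:

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
    exclude-result-prefixes="w">

  <xsl:template match="w:p">
    <para>
      <!-- Start a group at every field 'begin' run, and at the run that
           immediately follows a field 'end' run. -->
      <xsl:for-each-group select="w:r"
          group-starting-with="
            w:r[w:fldChar/@w:fldCharType = 'begin']
            | w:r[preceding-sibling::w:r[1]/w:fldChar/@w:fldCharType = 'end']">
        <xsl:choose>
          <!-- Group opened by a 'begin' marker: this is one whole field. -->
          <xsl:when test="current-group()[1]/w:fldChar/@w:fldCharType = 'begin'">
            <field instr="{current-group()/w:instrText}">
              <!-- Only the runs carrying display text (w:t). -->
              <xsl:apply-templates select="current-group()[w:t]"/>
            </field>
          </xsl:when>
          <!-- Any other group: ordinary runs outside a field. -->
          <xsl:otherwise>
            <xsl:apply-templates select="current-group()"/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each-group>
    </para>
  </xsl:template>

  <xsl:template match="w:r">
    <run><xsl:value-of select="w:t"/></run>
  </xsl:template>

</xsl:stylesheet>
```

With nested fields the first 'end' would close the group too early, so this
only covers the flat case shown in the sample.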

I seem to remember that when I last did this, however, it turned out to be
easier using sibling recursion - that is, have each w:r element
apply-templates to its immediately following sibling.
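
Roughly like this, as a sketch only and under the same assumptions (w:r
children of w:p, no nested fields, placeholder output names; the mode names
walk and in-field are made up for the example). Each run processes itself
and then hands over to its next sibling; a 'begin' run switches to a second
mode until the 'end' run stops the recursion:

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
    exclude-result-prefixes="w">

  <!-- Kick off the walk at the first run only. -->
  <xsl:template match="w:p">
    <para>
      <xsl:apply-templates select="w:r[1]" mode="walk"/>
    </para>
  </xsl:template>

  <!-- Ordinary run: emit it, then move to the next sibling run. -->
  <xsl:template match="w:r" mode="walk">
    <run><xsl:value-of select="w:t"/></run>
    <xsl:apply-templates select="following-sibling::w:r[1]" mode="walk"/>
  </xsl:template>

  <!-- Field start: collect the field content, then resume the normal
       walk at the run after the nearest following 'end' marker. -->
  <xsl:template match="w:r[w:fldChar/@w:fldCharType = 'begin']" mode="walk">
    <field>
      <xsl:apply-templates select="following-sibling::w:r[1]" mode="in-field"/>
    </field>
    <xsl:apply-templates mode="walk"
        select="following-sibling::w:r[w:fldChar/@w:fldCharType = 'end'][1]
                /following-sibling::w:r[1]"/>
  </xsl:template>

  <!-- Run inside a field: emit instruction or result text, then continue. -->
  <xsl:template match="w:r" mode="in-field">
    <xsl:if test="w:instrText">
      <instr><xsl:value-of select="w:instrText"/></instr>
    </xsl:if>
    <xsl:if test="w:t">
      <result><xsl:value-of select="w:t"/></result>
    </xsl:if>
    <xsl:apply-templates select="following-sibling::w:r[1]" mode="in-field"/>
  </xsl:template>

  <!-- Field end: emit nothing and do not recurse, which closes the field. -->
  <xsl:template match="w:r[w:fldChar/@w:fldCharType = 'end']" mode="in-field"/>

</xsl:stylesheet>
```

The templates with a predicate win over the plain match="w:r" templates by
default priority, so no explicit priority attributes are needed here.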

Either way, processing Word XML using XSLT is not for the faint-hearted.

Michael Kay
http://www.saxonica.com/ 

> -----Original Message-----
> From: Eliot Kimber [mailto:ekimber@xxxxxxxxxxxx] 
> Sent: 23 June 2008 23:04
> To: xsl-list
> Subject: [xsl] Better Way to Group Siblings By Start/End Markers?
> 
> I am experimenting with using XSLT to convert Office Open XML 
> into InCopy INCX (the CS3 Word import fails to capture some 
> things I need captured from the Word data).
> 
> One challenge is handling Word fields, which need to be 
> converted to any number of different, and 
> differently-structured, INCX constructs (whose details are 
> not important here).
> 
> A Word field is organized as a sequence of w:r elements 
> within a larger sequence of w:r elements. A field start is 
> indicated by a w:r with a field start indicator and the field 
> end is indicated by another w:r with a field end indicator. 
> The w:r elements between these two marker elements comprise 
> the field data, which can be any number of things, including 
> w:r elements that could just as easily occur outside the scope of the 
> field (e.g., w:r containing literal document content).
> 
> Here is a typical sample:
> 
>       <w:r>
>         <w:t xml:space="preserve">-  </w:t>
>       </w:r>
>       <w:r
>         w:rsidR="00BA1D13">
>         <w:fldChar
>           w:fldCharType="begin"/>
>       </w:r>
>       <w:r
>         w:rsidR="00BA1D13">
>         <w:instrText>HYPERLINK "http://www.example.com/"</w:instrText>
>       </w:r>
>       <w:r
>         w:rsidR="00BA1D13">
>         <w:fldChar
>           w:fldCharType="separate"/>
>       </w:r>
>       <w:r
>         w:rsidRPr="00B233E5">
>         <w:t>HTTP</w:t>
>       </w:r>
>       <w:r
>         w:rsidR="00BA1D13">
>         <w:fldChar
>           w:fldCharType="end"/>
>       </w:r>
> 
> I have this for-each-group that seems to group correctly, but 
> I'm wondering if there's a simpler expression that does what I want:
> 
> <xsl:for-each-group select="w:r"
>   group-adjacent="
>     string(
>       self::*[w:fldChar[@w:fldCharType = 'begin' or @w:fldCharType = 'end']]
>       or
>       (self::*[preceding-sibling::*/w:fldChar[@w:fldCharType = 'begin']]
>        and
>        self::*[following-sibling::*/w:fldChar[@w:fldCharType = 'end']]
>        and
>        count(
>          (self::*[preceding-sibling::*/w:fldChar[@w:fldCharType = 'begin']])[1]
>            /(*[following-sibling::*/w:fldChar[@w:fldCharType = 'end']])[1]
>          |
>          (self::*[following-sibling::*/w:fldChar[@w:fldCharType = 'end']])[1]
>        ) = 1
>       )
>     )
>   ">
> 
> In prose (at least this is what I intend the above expression 
> to mean): if w:r has a child w:fldChar where @w:fldCharType = 
> 'begin' or 'end', or w:r has both a preceding sibling w:r with 
> a w:fldChar of type 'begin' and a following sibling w:r with 
> a w:fldChar of type 'end' AND the nearest preceding sibling 
> field start has the same nearest following sibling field end 
> as the current node, then return the grouping key "true"; 
> otherwise return the grouping key "false".
> 
> Whew.
> 
> I can't think of a simpler way to say this. Is there one?
> 
> I realize I could factor some of the complexity of the 
> expression out into a function or two, which I will probably do.
> 
> Thanks,
> 
> Eliot
> 
> ----
> Eliot Kimber | Senior Solutions Architect | Really Strategies, Inc.
> email:  ekimber@xxxxxxxxxxxx
> office: 610.631.6770 | cell: 512.554.9368
> 2570 Boulevard of the Generals | Suite 213 | Audubon, PA 19403
> www.reallysi.com | http://blog.reallysi.com | www.rsuitecms.com
