Re: [xsl] XSLT splitting (grouping?) hierarchical structure

Subject: Re: [xsl] XSLT splitting (grouping?) hierarchical structure
From: "Michael Kay mike@xxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Thu, 10 Feb 2022 09:02:46 -0000
Since you're looking for design patterns: in Jackson Structured Programming
(revisited using modern terminology at
http://mcs.open.ac.uk/mj665/JSPDDevt.pdf) this is known as a "boundary clash"
problem. The usual solution is to flatten the hierarchy into a sequence of
leaf nodes, each containing details of its own ancestry, and then reconstruct
the new hierarchy by a grouping operation on this sequence of leaf nodes. The
original JSP book from 1975 is quite tough going nowadays; it rather assumes
you're well versed in sort-merge processing of hierarchical data files on
magnetic tape. But the overall philosophy of transforming hierarchies using a
pipeline of successive tree-walking transformations is isomorphic to the world
we live in.
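
To make the flatten-and-regroup idea concrete, here is a minimal sketch in
Python rather than XSLT. It is an illustration under stated assumptions, not
code from the JSP material: the function names flatten, rebuild and split_tree
are hypothetical, the marker element is assumed to be an empty <split/>, and
repeating the <title> in every piece is left out.

```python
import xml.etree.ElementTree as ET


def flatten(elem, path=()):
    """Yield (path, kind, value) leaves in document order.

    path is the chain of ancestor elements of the leaf, so every leaf
    carries details of its own ancestry.
    """
    path = path + (elem,)
    if elem.text:
        yield path, "text", elem.text
    elif len(elem) == 0:
        # Empty element: emit an empty leaf so it is still recreated.
        yield path, "text", ""
    for child in elem:
        if child.tag == "split":
            yield path, "split", dict(child.attrib)
        else:
            yield from flatten(child, path)
        if child.tail:
            yield path, "text", child.tail


def rebuild(leaves):
    """Reconstruct one tree from a run of text leaves (None if empty)."""
    root = None
    stack = []  # pairs of (source element, copy in the new tree)
    for path, _kind, text in leaves:
        # Keep the copies of ancestors shared with the previous leaf.
        keep = 0
        while (keep < min(len(stack), len(path))
               and stack[keep][0] is path[keep]):
            keep += 1
        del stack[keep:]
        # Create copies for the ancestors that are not open yet.
        for src in path[keep:]:
            copy = ET.Element(src.tag, dict(src.attrib))
            if stack:
                stack[-1][1].append(copy)
            else:
                root = copy
            stack.append((src, copy))
        top = stack[-1][1]
        if len(top):
            top[-1].tail = (top[-1].tail or "") + text
        else:
            top.text = (top.text or "") + text
    return root


def split_tree(elem):
    """Group the leaf sequence on split markers; rebuild each group."""
    pieces, current = [], []
    for leaf in flatten(elem):
        if leaf[1] == "split":
            pieces.append(rebuild(current))
            pieces.append(ET.Element("split", leaf[2]))
            current = []
        else:
            current.append(leaf)
    pieces.append(rebuild(current))
    return [p for p in pieces if p is not None]
```

Calling split_tree on a parsed <section> returns a list that alternates
rebuilt <section> copies and the <split/> markers. Ancestors are compared by
identity, so two adjacent <p> elements are never merged, while a <p> cut by a
split reappears in both pieces.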

Although it's instinctive to reach for an XSLT solution, I think I once solved
a problem like this at the SAX level: keep a stack of open elements, and when
you hit a <split/>, emit endElement events to close open elements up to a
certain level, then output the <split/>, then re-open the elements that you
closed, in reverse order; you've then got a structure that's relatively easy
to break into sections using conventional grouping.
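
A rough reconstruction of that SAX-level idea, using Python's xml.sax, might
look like the sketch below. It is not the code I originally wrote: the class
name SplitHoister, the helper hoist_splits and the keep_depth parameter (how
many outer elements stay open) are hypothetical, and <split/> is assumed to be
empty.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLGenerator


class SplitHoister(xml.sax.ContentHandler):
    """Forward SAX events, lifting each <split/> up to a fixed depth."""

    def __init__(self, downstream, keep_depth=1):
        super().__init__()
        self.out = downstream
        self.keep_depth = keep_depth  # outermost elements left open
        self.stack = []               # (name, attrs) of open elements
        self.reopen = []              # elements closed around a split

    def startDocument(self):
        self.out.startDocument()

    def endDocument(self):
        self.out.endDocument()

    def startElement(self, name, attrs):
        if name == "split":
            # Close open elements, innermost first, down to keep_depth.
            self.reopen = self.stack[self.keep_depth:]
            for open_name, _ in reversed(self.reopen):
                self.out.endElement(open_name)
            self.out.startElement(name, attrs)
        else:
            # The parser may reuse attrs, so store a copy.
            self.stack.append((name, attrs.copy()))
            self.out.startElement(name, attrs)

    def endElement(self, name):
        if name == "split":
            self.out.endElement(name)
            # Re-open what was closed, outermost first.
            for open_name, open_attrs in self.reopen:
                self.out.startElement(open_name, open_attrs)
            self.reopen = []
        else:
            self.stack.pop()
            self.out.endElement(name)

    def characters(self, content):
        self.out.characters(content)


def hoist_splits(xml_text, keep_depth=1):
    """Return the serialized document with splits lifted to keep_depth."""
    buf = io.StringIO()
    handler = SplitHoister(XMLGenerator(buf, encoding="utf-8"), keep_depth)
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return buf.getvalue()
```

With keep_depth=1 only the document element stays open, so each <split/> ends
up as a sibling of the <section> copies. Re-opened elements repeat their
attributes (including any id), and the <title> is not repeated; both would
need handling in the later grouping step.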

Michael Kay
Saxonica

> On 10 Feb 2022, at 08:20, Matthieu Ricaud-Dussarget ricaudm@xxxxxxxxx
<xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
> Dear XSL List,
>
> It's not the first time I'm facing a splitting problem working with
> publishing documents.
> In the past I found rather tricky/verbose solutions, but I'm wondering if
> I'm missing something obvious, especially with the new XSLT 3.0 features?
>
> My XML looks like this:
> <root>
>   <section>
>     <title>Title</title>
>     <content>
>       <p>paragraph #1</p>
>       <p>paragraph #2 to split <split id="split-1"/> here</p>
>       <p>paragraph #3</p>
>       <p>paragraph #4 <strong> to split <split id="split-2"/> here</strong> if possible</p>
>       <p>paragraph #5</p>
>       <ul>
>         <li>Item #1</li>
>         <li>Item #2 to split <split id="split-3"/> here</li>
>         <li>Item #3</li>
>         <li>
>           <ul>
>             <li>Item #4</li>
>             <li>Item #5 to <em>split <split id="split-4"/></em> here if possible</li>
>             <li>Item #6</li>
>           </ul>
>         </li>
>       </ul>
>       <p>paragraph #6</p>
>     </content>
>   </section>
> </root>
>
> The goal is to split the section on every <split> element (just like a page
> break would interrupt the flowing text anywhere in the structure).
>
> Expected result:
> <root>
>     <section>
>       <title>Title</title>
>       <content>
>         <p>paragraph #1</p>
>         <p>paragraph #2 to split</p>
>       </content>
>     </section>
>     <split id="split-1"/>
>     <section>
>       <title>Title</title>
>       <content>
>         <p> here</p>
>         <p>paragraph #3</p>
>         <p>paragraph #4 <strong> to split</strong></p>
>       </content>
>     </section>
>     <split id="split-2"/>
>     <section>
>       <title>Title</title>
>       <content>
>         <p><strong> here</strong> if possible</p>
>         <p>paragraph #5</p>
>         <ul>
>           <li>Item #1</li>
>           <li>Item #2 to split </li>
>         </ul>
>       </content>
>     </section>
>     <split id="split-3"/>
>     <section>
>       <title>Title</title>
>       <content>
>         <ul>
>           <li> here</li>
>           <li>Item #3</li>
>           <li>
>             <ul>
>               <li>Item #4</li>
>               <li>Item #5 to <em>split</em></li>
>             </ul>
>           </li>
>         </ul>
>       </content>
>     </section>
>     <split id="split-4"/>
>     <section>
>       <title>Title</title>
>       <content>
>         <ul>
>           <li>
>             <ul>
>               <li> here if possible</li>
>               <li>Item #6</li>
>             </ul>
>           </li>
>         </ul>
>         <p>paragraph #6</p>
>       </content>
>     </section>
>   </root>
>
> My idea was to iterate from 1 to the number of split elements + 1, working
> on the section with tunnel params so that I can test, for each node, whether
> it is before, after, or in between the current split elements, and then
> decide whether to keep the node according to this position.
>
> I already used this kind of solution on a similar problem a long time ago, so
> I'll give it a try, though I'm not totally confident in it (because split
> elements can appear as inline content here).
>
> Please let me know if you have ideas, and whether my solution is the right or
> wrong way to go.
> Are there special design patterns for this kind of problem?
> And last, have you ever faced this kind of splitting issue? Any feedback is
> welcome :)
>
> Cheers,
> Matthieu Ricaud-Dussarget
>
> --
> Matthieu Ricaud-Dussarget
> +33 6.63.25.95.58
