Re: [xsl] Problems grouping nested items within a completely flat structure

Subject: Re: [xsl] Problems grouping nested items within a completely flat structure
From: "Heiko Niemann kontakt@xxxxxxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Thu, 7 Aug 2014 23:10:34 -0000
Hi Frank,

well, yes, I had to make some assumptions to write the code, and one was
that your sources would not change. :) I also assumed that only 'Body text'
could trigger the start of a group, but as far as I understand now,
'Bulleted text' can as well. I guess the tricky part is that there is no
explicit para element that tells you where a list starts, and that the
first 'Bulleted text' of a group is both the initial item of the group and
part of the list, which means you have to use modes. So now I determine the
beginning of a list as follows: the immediately preceding sibling of the
first 'Bulleted text' must not be 'Bulleted text', 'Bullet sub' or 'Note',
which for your sample is the same as saying it has to be 'Chapter' or
'Body text'. You mentioned more types of para elements; maybe you could
provide some samples. Could you also tell me whether 'Note' can be placed
anywhere, or is it always part of a list? Anyhow, here is the modified
version, which still only takes the sources of your first post into account.

Regards,
Heiko


<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="2.0">
  <xsl:output method="xml" encoding="UTF-8"/>

  <xsl:template match="/textflow">
    <book>
      <xsl:for-each-group select="para"
                          group-starting-with="para[@pgftag eq 'Chapter']">
        <xsl:apply-templates select="."/>
      </xsl:for-each-group>
    </book>
  </xsl:template>

  <xsl:template match="para[@pgftag eq 'Chapter']">
    <chapter>
      <title>
        <xsl:apply-templates select="*"/>
      </title>
      <xsl:for-each-group select="current-group() except ."
                          group-starting-with="para[@pgftag eq 'Body text']
                            | para[@pgftag eq 'Bulleted text']
                                  [preceding-sibling::para[1][not(@pgftag = ('Bulleted text', 'Bullet sub', 'Note'))]]">
        <xsl:apply-templates select="." mode="trigger"/>
      </xsl:for-each-group>
    </chapter>
  </xsl:template>

  <xsl:template match="para[@pgftag eq 'Body text']" mode="trigger">
    <p>
      <xsl:apply-templates select="*"/>
    </p>
  </xsl:template>

  <xsl:template match="para[@pgftag eq 'Bulleted text']" mode="trigger">
    <ul>
      <xsl:for-each-group select="current-group()"
                          group-starting-with="para[@pgftag = ('Bulleted text', 'Note')]">
        <xsl:apply-templates select="." mode="listitem"/>
      </xsl:for-each-group>
    </ul>
  </xsl:template>

  <xsl:template match="para[@pgftag eq 'Bulleted text']" mode="listitem">
    <li>
      <xsl:apply-templates select="*"/>
    </li>
    <xsl:if test="count(current-group()) &gt; 1">
      <ul>
        <xsl:for-each-group select="current-group() except ."
                            group-starting-with="para[@pgftag eq 'Bullet sub']">
          <xsl:apply-templates select="."/>
        </xsl:for-each-group>
      </ul>
    </xsl:if>
  </xsl:template>

  <xsl:template match="para[@pgftag eq 'Bullet sub']">
    <li>
      <xsl:apply-templates select="*"/>
    </li>
  </xsl:template>

  <xsl:template match="para[@pgftag eq 'Note']" mode="listitem">
    <note>
      <xsl:apply-templates select="*"/>
    </note>
  </xsl:template>

  <xsl:template match="paraline|xref">
    <xsl:apply-templates/>
  </xsl:template>

  <xsl:template match="render">
    <xsl:choose>
      <xsl:when test="@charformat eq 'Emphasis'">
        <em>
          <xsl:value-of select="."/>
        </em>
      </xsl:when>
      <xsl:when test="@charformat eq 'Bold'">
        <b>
          <xsl:value-of select="."/>
        </b>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="."/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!--next template for testing purposes only-->
  <xsl:template match="*">
    <bah/>
  </xsl:template>

</xsl:stylesheet>
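
A side note on the "generic" question, as a minimal sketch only: instead of
looking for the para that starts a list, one could group on whether a para
belongs to a list at all, using group-adjacent with a boolean key. The sketch
assumes that 'Bulleted text', 'Bullet sub' and 'Note' are the only
list-related tags, that the first para of the textflow is a 'Chapter', and
that notes sit at the outer list level. The variable name $list-tags is made
up for the illustration. It has been reasoned through against the sample from
the first post only, not run against real data.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">
  <xsl:output method="xml" encoding="UTF-8"/>
  <xsl:strip-space elements="textflow para"/>

  <!-- Assumption: these are the only tags that can occur inside a list. -->
  <xsl:variable name="list-tags"
                select="('Bulleted text', 'Bullet sub', 'Note')"/>

  <xsl:template match="/textflow">
    <book>
      <xsl:for-each-group select="para"
                          group-starting-with="para[@pgftag = 'Chapter']">
        <chapter>
          <!-- The context item is the first para of the group (the Chapter). -->
          <title><xsl:apply-templates/></title>
          <!-- Key is true for a run of list paras, false for anything else,
               so the order of paragraphs and lists no longer matters. -->
          <xsl:for-each-group select="current-group() except ."
                              group-adjacent="@pgftag = $list-tags">
            <xsl:choose>
              <xsl:when test="current-grouping-key()">
                <ul>
                  <!-- Second pass: consecutive 'Bullet sub' paras form a nested list. -->
                  <xsl:for-each-group select="current-group()"
                                      group-adjacent="@pgftag = 'Bullet sub'">
                    <xsl:choose>
                      <xsl:when test="current-grouping-key()">
                        <ul><xsl:apply-templates select="current-group()"/></ul>
                      </xsl:when>
                      <xsl:otherwise>
                        <xsl:apply-templates select="current-group()"/>
                      </xsl:otherwise>
                    </xsl:choose>
                  </xsl:for-each-group>
                </ul>
              </xsl:when>
              <xsl:otherwise>
                <xsl:apply-templates select="current-group()"/>
              </xsl:otherwise>
            </xsl:choose>
          </xsl:for-each-group>
        </chapter>
      </xsl:for-each-group>
    </book>
  </xsl:template>

  <!-- Context-free templates: each para only needs to know its own tag. -->
  <xsl:template match="para[@pgftag = 'Body text']">
    <p><xsl:apply-templates/></p>
  </xsl:template>

  <xsl:template match="para[@pgftag = ('Bulleted text', 'Bullet sub')]">
    <li><xsl:apply-templates/></li>
  </xsl:template>

  <xsl:template match="para[@pgftag = 'Note']">
    <note><xsl:apply-templates/></note>
  </xsl:template>

  <xsl:template match="render[@charformat = 'Emphasis']">
    <em><xsl:apply-templates/></em>
  </xsl:template>

  <xsl:template match="render[@charformat = 'Bold']">
    <b><xsl:apply-templates/></b>
  </xsl:template>
</xsl:stylesheet>
```

For the sample from the first post this should give the p / ul / nested ul /
note structure that was asked for, and moving a 'Body text' para elsewhere
only moves the corresponding <p>. Paras with tags not listed here fall
through to the built-in templates and are output as plain text, so each
further tag still needs its own template.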




> Heiko,
>
> thanks for your approach. While this helped me to grasp the grouping
> concept in XSLT 2.0 a little bit better (but only a little bit), I noticed
> that your solution is specifically tailored to the exact sequence of nodes
> that I used in the example file. But, unfortunately, life doesn't work
> that way:-) If, e.g. I put the first para[@pgftag='Body text'] down to
> right after the note, the stylesheet yields:
>
> <book>
>    <chapter>
>       <title>Introduction</title>
>       <p>Display the online help as follows:</p>
>       <ul>
>          <li>Display the online help as follows:</li>
>          <ul>
>             <li>To view the help for a panel, press the help (PF1)
> key.</li>
>          </ul>
>       </ul>
>       <p>This chapter explains...</p>
>       <ul>
>          <li>To view the help for an input field or select a parameter
> from a pop-up window, press PF1.</li>
>          <li>Check relevant sections of <em>XXXX</em>.</li>
>          <li>Visit our web site to get...</li>
>       </ul>
> [...]
>
> Clearly not what is intended. So what is the right strategy to solve this
> in a generic way? Sure, you'd want to group all sibling elements to
> chapter as long as they are not indicating a new chapter or a subdivision.
> But then? You cannot go for a group like
>
>       <xsl:for-each-group group-starting-with="para[@pgftag eq 'Body text']"
>                           select="current-group() except .">
>         <p>
>           <xsl:apply-templates select="*"/>
>         </p>
>         <ul>
>           <xsl:apply-templates select="."/>
>         </ul>
>       </xsl:for-each-group>
>
> when you don't know what first child element you have. Instead, I'd
> imagine templates that can match single paras and don't need to know about
> their context (like para[@pgftag='Body text'] or para[@pgftag='Note']). But
> as soon as you have list structures I currently don't see a solution that
> goes beyond grouping consecutive list items and if something like a note
> drops in, you have to regard the following list item as the first item of
> a new list.
> And here I'm back to square one. I cannot e.g. use
>
>       <xsl:for-each-group group-starting-with="para[@pgftag eq 'Bulleted text']"
>                           select="current-group() except .">
>
> within the chapter template, since I don't know if and when such a list
> can be expected, so to me it looks more like:
>
>   <xsl:template match="para[@pgftag='Chapter']">
>         <chapter><title><xsl:apply-templates/></title>
>           <xsl:apply-templates select="all siblings that are in the same
> chapter until a subdivision comes up"/>
>         </chapter>
>    </xsl:template>
>
>    <xsl:template match="para[@pgftag='Bulleted text'][1]">
>       <ul>
>          <xsl:for-each-group select="following-sibling::*"
> group-adjacent="name()">
>         <xsl:apply-templates select="." mode="listitem"/>
>           </xsl:for-each-group>
>       </ul>
>    </xsl:template>
>
> But Saxon doesn't like that for-each-group and complains: "the only axes
> allowed in a pattern are the child and attribute axes".
> To complicate matters, subdivisions like '1st Section', ..., '4th Section'
> may be nested, so a chapter may consist either of simple content, as shown
> here, or some overview, followed by one or more subsections...
> From what I've seen so far, I cannot deduce a generic solution to my
> problem. Probably I'm still missing something elementary here...
>
> Thanks so far,
> Frank
>
>
> -----Original Message-----
> From: Heiko Niemann kontakt@xxxxxxxxxxxxxxxx
> [mailto:xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx]
> Sent: Wednesday, 6 August 2014 18:09
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: Re: [xsl] Problems grouping nested items within a completely flat
> structure
>
> Hi,
>
> this should get you close to the desired result. Just add more templates
> as necessary.
>
> Regards,
> Heiko
>
>
>
> <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
> version="2.0">
>   <xsl:output method="xml" encoding="UTF-8"/>
>
>   <xsl:template match="/textflow">
>     <book>
>       <xsl:for-each-group select="para"
>                           group-starting-with="para[@pgftag eq 'Chapter']">
>         <xsl:apply-templates select="."/>
>       </xsl:for-each-group>
>     </book>
>   </xsl:template>
>
>   <xsl:template match="para[@pgftag eq 'Chapter']">
>     <chapter>
>       <title>
>         <xsl:apply-templates select="*"/>
>       </title>
>       <xsl:for-each-group group-starting-with="para[@pgftag eq 'Body text']"
>                           select="current-group() except .">
>         <p>
>           <xsl:apply-templates select="*"/>
>         </p>
>         <ul>
>           <xsl:apply-templates select="."/>
>         </ul>
>       </xsl:for-each-group>
>     </chapter>
>   </xsl:template>
>
>   <xsl:template match="para[@pgftag eq 'Body text']">
>     <xsl:for-each-group group-starting-with="para[@pgftag = ('Bulleted text', 'Note')]"
>                         select="current-group() except .">
>       <xsl:apply-templates select="."/>
>     </xsl:for-each-group>
>   </xsl:template>
>
>   <xsl:template match="para[@pgftag eq 'Bulleted text']">
>     <li>
>       <xsl:apply-templates select="*"/>
>     </li>
>     <xsl:if test="count(current-group()) &gt; 1">
>       <ul>
>         <xsl:for-each-group select="current-group() except ."
> group-starting-with="para[@pgftag eq 'Bullet sub']">
>           <xsl:apply-templates select="."/>
>         </xsl:for-each-group>
>       </ul>
>     </xsl:if>
>   </xsl:template>
>
>   <xsl:template match="para[@pgftag eq 'Bullet sub']">
>     <li>
>       <xsl:apply-templates select="*"/>
>     </li>
>   </xsl:template>
>
>   <xsl:template match="para[@pgftag eq 'Note']">
>     <note>
>       <xsl:apply-templates select="*"/>
>     </note>
>   </xsl:template>
>
>   <xsl:template match="paraline|xref">
>     <xsl:apply-templates/>
>   </xsl:template>
>
>   <xsl:template match="render">
>     <xsl:choose>
>       <xsl:when test="@charformat eq 'Emphasis'">
>         <em>
>           <xsl:value-of select="."/>
>         </em>
>       </xsl:when>
>       <xsl:when test="@charformat eq 'Bold'">
>         <b>
>           <xsl:value-of select="."/>
>         </b>
>       </xsl:when>
>       <xsl:otherwise>
>         <xsl:value-of select="."/>
>       </xsl:otherwise>
>     </xsl:choose>
>   </xsl:template>
>
>   <!--next template for testing purposes only-->
>   <xsl:template match="*">
>     <bah/>
>   </xsl:template>
>
> </xsl:stylesheet>
>
>
>
>
>> Hi.
>>
>> While there is a lot of information about grouping available, I still
>> have problems applying it to my particular case of a document-centric
>> XML file.
>> Obviously I haven't yet understood it fully.
>>
>> This is my (redacted) source. Please excuse its length, but this
>> better illustrates my problem. I use Saxon 9HE, but I am open to both
>> XSL 1.0 and 2.0 solutions.
>>
>>       <textflow tftag="A">
>>          <para pgftag="Chapter">
>>             <paraline>Introduction</paraline>
>>          </para>
>>          <para pgftag="Body text">
>>             <paraline>This chapter explains...</paraline>
>>          </para>
>>          <para pgftag="Bulleted text">
>>             <paraline>Display the online help as follows:</paraline>
>>          </para>
>>          <para pgftag="Bullet sub">
>>             <paraline>To view the help for a panel, press the help
>> (PF1) key.</paraline>
>>          </para>
>>          <para pgftag="Bullet sub">
>>             <paraline>To view the help for an input field or select a
>> parameter from a pop-up window, </paraline>
>>             <paraline>press PF1.</paraline>
>>          </para>
>>          <para pgftag="Note">
>>             <paraline>If you do not specify a required parameter, or
>> enter an incorrect one, XXXX </paraline>
>>             <paraline>will prompt you for the correct
>> information.</paraline>
>>          </para>
>>          <para pgftag="Bulleted text">
>>             <paraline>Check relevant sections of <render
>> charformat="Emphasis">XXXX</render>.</paraline>
>>          </para>
>>          <para pgftag="Bulleted text">
>>             <paraline>Visit our web site to get...</paraline>
>>          </para>
>>          <para pgftag="Body text">
>>             <paraline>The topics covered are:</paraline>
>>          </para>
>>          <para pgftag="Bulleted text">
>>             <paraline>
>>                <xref srctext="55167: 1st Section: What is
>> XXX?"><render charformat="Bold">What is XXX?</render></xref>
>>             </paraline>
>>          </para>
>>          <para pgftag="Bulleted text">
>>             <paraline>
>>                <xref srctext="55167: 1st Section: How Does XXX
>> Work?"><render charformat="Bold"> How Does XXX Work?</render></xref>
>>             </paraline>
>>          </para>
>>          <para pgftag="Chapter">
>>             <paraline>Next chapter</paraline>
>>          </para>
>>      </textflow>
>>
>> The idea is, quite obviously, grouping the relevant list items, so
>> you'd end up (ideally!) with something like e.g.:
>>
>> <book>
>>   <chapter>
>>      <title>Introduction</title>
>>      <p>This chapter explains...</p>
>>      <ul>
>>         <li>Display the online help as follows:</li>
>>         <ul>
>>            <li>To view the help for a panel, press the help (PF1)
>> key.</li>
>>            <li>To view the help for an input field or select a
>> parameter from a pop-up window, press PF1.</li>
>>        </ul>
>>         <note> If you do not specify a required parameter, or enter an
>> incorrect one, XXXX will prompt you for the correct
>> information.</note>
>>         <li>Check relevant sections of <em>XXXX</em>.</li>
>>         <li>Visit our web site to get ...</li>
>>      </ul>
>>      <p> The topics covered are:</p>
>>      <ul>
>>         <li><b>What is XXX?</b></li>
>>         <li><b>How Does XXX Work?</b></li>
>>      </ul>
>>    </chapter>
>>   <chapter>
>>      <title>Next chapter</title>
>>   </chapter>
>> </book>
>>
>>
>> As you can see, the source is a completely flat, linear sequence from
>> which I have to establish every kind of structure. Therefore, I use
>> something like
>>
>> <xsl:template match="textflow">
>>   <book><xsl:apply-templates/></book>
>> </xsl:template>
>>
>> <xsl:template match="para[@pgftag='Chapter']">
>>     <xsl:variable name="chapter-id" select="generate-id()"/>
>>     <chapter>
>>        <title><xsl:apply-templates/></title>
>>          <xsl:apply-templates
>> select="following-sibling::*[not(self::*[@pgftag='Chapter'])]
>>         [generate-id(preceding-sibling::para[@pgftag='Chapter'][1])
>>          = $chapter-id]"/>
>>     </chapter>
>> </xsl:template>
>>
>> <xsl:template match="para[@pgftag='Bulleted text']">...
>>
>> That is, I can't imagine having a single template matching textflow
>> in which I apply <xsl:for-each-group> for all kinds of different paras.
>> Instead, I use Muenchian grouping (yep, starting with XSL 1.0, but now
>> I use 2.0), but ran into serious recursion trouble when fiddling with
>> nested chapter and list structures.
>> The other principal problem is how to decide when a structure has
>> ended, because all elements are on the same sibling axis. Now, a
>> chapter ends when another <para pgftag='Chapter'> or some
>> <para pgftag='Appendix'> appears. But there is no way to decide when the
>> first bulleted list in the example really ends, since the list items
>> may include other elements such as notes or nested lists. You could
>> only use criteria such as "This list has ended when the next
>> paragraph is e.g. <para pgftag='Body text'>, or when
>> <para pgftag='Chapter'> appears".
>>
>> Now, if anyone could point me in the right direction, I'd be very
>> grateful, since it's been bugging me for some time now. And please
>> excuse the length...
>>
>> Thank you,
>> Frank
>>
>>
>> Software AG - Sitz/Registered office: Uhlandstraße 12, 64297
>> Darmstadt, Germany - Registergericht/Commercial register: Darmstadt
>> HRB 1562 - Vorstand/Management Board: Karl-Heinz Streibich
>> (Vorsitzender/Chairman), Dr. Wolfram Jost, Arnd Zinnhardt; -
>> Aufsichtsratsvorsitzender/Chairman of the Supervisory Board: Dr.
>> Andreas Bereczky - http://www.softwareag.com
>>
>>
>
>
