Re: [xsl] Move elements to preceding parent

Subject: Re: [xsl] Move elements to preceding parent
From: "G. Ken Holman" <gkholman@xxxxxxxxxxxxxxxxxxxx>
Date: Wed, 17 Jun 2009 13:37:13 -0700
At 2009-06-17 23:11 +0300, Israel Viente wrote:
> I really appreciate your code and comments, but after reading it many
> times, I can't reach to the bottom of the logic here.
> I'm a newbie so forgive my stupid questions.

As I tell my students, questions are not stupid if they are asked sincerely. I would far rather be asked questions than have working code, supplied as requested, be ignored.


> 1. Why do we need the outermost copy element:
> <xsl:template match="body">
>  <xsl:copy>

In order to preserve the body element when it comes time to group the paragraphs.


> How does it work in combination with xsl:for-each-group?

By being the parent of the elements being grouped, matching on <body> gives the stylesheet the opportunity to act on all of the children of body. The paragraphs you want to massage are children of the body, so the time to act on those children is at the time the body arrives at the stylesheet. Since we want the body element to be part of the result, we preserve it with <xsl:copy>.


> 2. Can you please explain the group-ending-with selection?

You can see from the select="*" that I have selected *all* of the children of the body. I want to act on groups of adjacent <p> elements, but other non-<p> elements could be in the data (there aren't any in your data, but how often is a web page made solely of paragraphs?), and they are pulled into the selection as well. That is intentional: I want all of the children of body to be processed in child order, and I want to engage the special handling only when I'm dealing with children that are paragraphs.


Yes, your data sample only contained paragraphs, but I try to write my stylesheets defensively, anticipating other conditions.

> Why do we need *[not(self::p)] ? Doesn't it mean all except p elements?

Indeed it does mean all except <p> elements. By putting non-<p> elements in their own group, they won't interfere with the groups that are made up of <p> elements.
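
To see where the group boundaries actually fall, one option is a small diagnostic stylesheet that wraps each group in a marker element instead of merging anything. This is only a sketch, not part of the stylesheet under discussion: the <groups> and <group> wrapper elements and the size attribute are hypothetical names chosen for illustration, while the group-ending-with pattern is the one being explained here. It assumes an XSLT 2.0 processor.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">

  <!-- Diagnostic only: show each group formed from the children of body.
       The groups/group elements and the size attribute are hypothetical
       names, not part of the original stylesheet. -->
  <xsl:template match="body">
    <groups>
      <xsl:for-each-group select="*"
                          group-ending-with="*[not(self::p)] |
                                             p[span/@class='chapter'] |
                                             p[matches(span[last()],
                                                       '[.?&#x22;]$')]">
        <!-- the last item of each group matches the pattern above,
             except possibly in the final group, which simply holds
             whatever items were left over -->
        <group size="{count(current-group())}">
          <xsl:copy-of select="current-group()"/>
        </group>
      </xsl:for-each-group>
    </groups>
  </xsl:template>

</xsl:stylesheet>
```

Run against the input document, each <group> in the result shows one set of children that the real stylesheet handles together. A group of several <p> elements whose last member ends with the desired punctuation is one whose paragraphs get merged.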


So, adding more narrative to the stylesheet:

 <xsl:template match="body">
  <xsl:copy>
    <xsl:copy-of select="@*"/>

The above preserves the body element and any attributes that might be attached to it.


    <xsl:for-each-group select="*"

The above selects all of the children of the body.


                        group-ending-with="*[not(self::p)] |
                                           p[span/@class='chapter'] |
                                           p[matches(span[last()],
                                                     '[.?&#x22;]$')]">

The above ends a group at every non-paragraph, at every chapter paragraph, and at every paragraph whose last span ends with the desired punctuation. As a result, each consecutive sequence of paragraphs forms a group that is closed by the paragraph with that punctuation.


      <!--now the information is grouped by p elements that end as
          required-->
      <xsl:choose>
        <xsl:when test="current-group()[last()]
                        [self::p][matches(span[last()],'[.?&#x22;]$')]">

The above tells me when I have encountered a group of <p> elements that ends with a paragraph with the desired punctuation.


          <!--in a group of p elements that end as required-->
          <xsl:copy>
            <xsl:copy-of select="@*"/>

The above preserves the *first* of those paragraphs, and its attributes.


            <!--preserve the content of the first of these p elements-->
            <xsl:apply-templates/>

The above preserves the content of that paragraph.


            <!--preserve only the span elements and indentation from the
                rest; (the indentation is needed because this is paragraph
                white-space)-->
            <xsl:apply-templates select="current-group()[position()>1]/
                                         (text()[not(normalize-space())] |
                                         span)"/>

The above adds only the span elements and white-space-only text nodes of the other paragraphs in the group. If there are no other paragraphs in the group, nothing else is added. If there are 15 other paragraphs in the group, the spans and indentation of all of them are added. This is the generalized nature of the result: I'm not assuming that there is only one other paragraph.


          </xsl:copy>
        </xsl:when>
        <xsl:otherwise>
          <!--in another kind of group so just copy these using identity-->
          <xsl:apply-templates select="current-group()"/>

The above preserves all of the children of <body> that are not paragraphs or are chapter paragraphs.


        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each-group>
  </xsl:copy>
 </xsl:template>
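
The <xsl:apply-templates> calls in this template only reproduce their input if the stylesheet also contains an identity rule. The comment "copy these using identity" suggests the full stylesheet has one, but it is not shown in this message. A minimal sketch of the usual identity template, assuming nothing else in the stylesheet overrides it, would be:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">

  <!-- The match="body" template discussed above would sit alongside
       this rule (assumed placement). Its default priority (0) is
       higher than that of @* and node() (-0.5), so it wins for the
       body element. -->

  <!-- Identity rule: copy every attribute and node unchanged, and
       recurse so that more specific templates can still take over -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

On its own this stylesheet copies the input unchanged. Combined with the body template, it is what lets the span elements, the white-space text nodes and the non-merged groups pass through to the result as they are.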

I hope this has helped. Working directly with the sibling axes is fraught with problems because these axes usually reach past where we want to stop looking. By looking *down* on the data, rather than left and right, one gets a different perspective on the requirement. You expressed your requirement by looking left and right from a given paragraph; I expressed it by looking down at the paragraphs from the <body> parent.


Good luck in your work with XML and XSLT! As you learn more I'm sure you'll love it more.

. . . . . . . . . . . . Ken

--
Crane Softwrights Ltd.          http://www.CraneSoftwrights.com/s/
Training tools: Comprehensive interactive XSLT/XPath 1.0/2.0 video
Video lesson:    http://www.youtube.com/watch?v=PrNjJCh7Ppg&fmt=18
Video overview:  http://www.youtube.com/watch?v=VTiodiij6gE&fmt=18
G. Ken Holman                 mailto:gkholman@xxxxxxxxxxxxxxxxxxxx
Male Cancer Awareness Nov'07  http://www.CraneSoftwrights.com/s/bc
Legal business disclaimers:  http://www.CraneSoftwrights.com/legal
