At 2009-06-17 23:11 +0300, Israel Viente wrote:
> I really appreciate your code and comments, but after reading it many
> times, I can't get to the bottom of the logic here.
> I'm a newbie so forgive my stupid questions.
As I tell my students, questions are not stupid if they are asked
sincerely. I far more appreciate the asking of questions than the
ignoring of working code that was supplied as requested.
> 1. Why do we need the outermost copy element:
> <xsl:template match="body">
> <xsl:copy>
In order to preserve the body element when it comes time to group the
paragraphs.
> How does it work in combination with xsl:for-each-group?
Because <body> is the parent of the elements being grouped, matching
on it gives the stylesheet the opportunity to act on all of its
children. The paragraphs you want to massage are children of the body,
so the time to act on those children is at the time the body arrives
at the stylesheet. Since we want the body element to be part of the
result, we preserve it with <xsl:copy>.
> 2. Can you please explain the group-ending-with selection?
You can see by the select="*" that I have selected *all* of the
children of the body. I want to act on those groups of adjacent <p>
elements. But since there are other non-<p> elements that could be
in the data (there aren't any in your data, but how often is a web
page made solely of paragraphs?) I would be pulling those into the
selection as well. After all, I want all of the children of body to
be processed in child order; I only want to engage the special
handling when I'm dealing with those children that are paragraphs.
Yes, your data sample only contained paragraphs, but I try to write my
stylesheets defensively, anticipating other conditions.
> Why do we need *[not(self::p)]? Doesn't it mean all except p elements?
Indeed it does mean all except <p> elements. Because every non-<p>
element ends a group, it can never sit in the middle of a group of
<p> elements that is to be merged.
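To make the grouping concrete, here is a small made-up input. It is not
the data from this thread; the text content and the <hr/> are
assumptions for illustration only. The comments mark where the
group-ending-with pattern would close each group:

```xml
<!-- hypothetical input, invented for illustration -->
<body>
  <!-- group 1: a chapter paragraph ends a group by itself -->
  <p><span class="chapter">Chapter 1</span></p>

  <!-- group 2: two paragraphs; the second one's last span ends with "." -->
  <p><span>It was a dark and</span></p>
  <p><span>stormy night.</span></p>

  <!-- group 3: a non-p element ends a group -->
  <hr/>

  <!-- group 4: one paragraph whose last span ends with a double quote -->
  <p><span>"Who goes there?"</span></p>
</body>
```

Of these, groups 2 and 4 end with a <p> whose last span has the
desired punctuation, so they would get the special handling in the
stylesheet; groups 1 and 3 would fall through to the plain copy.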
So, adding more narrative to the stylesheet:
<xsl:template match="body">
<xsl:copy>
<xsl:copy-of select="@*"/>
The above preserves the body element and any attributes that might be
attached to it.
<xsl:for-each-group select="*"
The above selects all of the children of the body.
group-ending-with="*[not(self::p)] |
p[span/@class='chapter'] |
p[matches(span[last()],
'[.?&quot;]$')]">
The above ends a group at every non-paragraph, at every chapter
paragraph, and at every paragraph whose last span ends with the
desired punctuation, so each run of consecutive paragraphs is closed
off by the paragraph that ends as required.
<!--now the information is grouped by p elements that end as
required-->
<xsl:choose>
<xsl:when test="current-group()[last()]
[self::p][matches(span[last()],'[.?&quot;]$')]">
The above tells me when I have encountered a group of <p> elements
that ends with a paragraph with the desired punctuation.
<!--in a group of p elements that end as required-->
<xsl:copy>
<xsl:copy-of select="@*"/>
The above preserves the *first* of those paragraphs, and its attributes.
<!--preserve the content of the first of these p elements-->
<xsl:apply-templates/>
The above preserves the content of that paragraph.
<!--preserve only the span elements and indentation from the
rest;
(the indentation is needed because this is paragraph
white-space)-->
<xsl:apply-templates select="current-group()[position()>1]/
(text()[not(normalize-space())] |
span)"/>
The above adds the content of the other paragraphs in the group
(their span elements and white-space-only text nodes) without their
<p> wrappers. If there are no other paragraphs in the group, nothing
else is added. If there are 15 other paragraphs in the group, the
content of all of them is added. This is the generalized nature of
the result: I'm not assuming that there is only one other paragraph.
</xsl:copy>
</xsl:when>
<xsl:otherwise>
<!--in another kind of group so just copy these using identity-->
<xsl:apply-templates select="current-group()"/>
The above preserves all of the children of <body> that either are not
paragraphs or are chapter paragraphs.
</xsl:otherwise>
</xsl:choose>
</xsl:for-each-group>
</xsl:copy>
</xsl:template>
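The comments in the stylesheet refer to "identity", and the
<xsl:apply-templates> calls rely on some other template to copy the
nodes they push. That template is not repeated in this message; a
minimal sketch, assuming it is the usual XSLT identity template, would
be:

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- assumed, not shown in this message: copy every node unchanged
       unless a more specific template (such as match="body") applies -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

The match="body" template would sit alongside it in the same
stylesheet and take precedence for <body> because its pattern has the
higher default priority.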
I hope this has helped. Working directly with the sibling axes is
fraught with problems because these axes usually reach past where we
want to stop looking. By looking *down* on the data, rather than left
and right, one gets a different perspective on your requirement. You
expressed your requirement by looking left and right from the given
paragraph. I expressed your requirement by looking down at the
paragraphs from the <body> parent.
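As a rough illustration of that reach (a sketch only, not code from
this thread): starting from a given paragraph, the following-sibling
axis does not stop at the end of the sentence.

```xml
<!-- with a p as the context node, this selects *every* later sibling p
     under the same parent, including those that belong to later
     sentences and later chapters; extra predicates would be needed to
     stop at the first paragraph that ends with the desired punctuation -->
<xsl:copy-of select="following-sibling::p"/>
```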
Good luck in your work with XML and XSLT! As you learn more I'm sure
you'll love it more.
. . . . . . . . . . . . Ken
--
Crane Softwrights Ltd. http://www.CraneSoftwrights.com/s/
G. Ken Holman mailto:gkholman@xxxxxxxxxxxxxxxxxxxx