Subject: Re: [xsl] Move elements to preceding parent
From: Israel Viente <israel.viente@xxxxxxxxx>
Date: Thu, 18 Jun 2009 18:15:26 +0300

Hi Ken,
I tried to test the stylesheet with non-<p> elements inside body and I
see they break the paragraph.
Input Example:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<body>
<p dir="rtl">
<span class="chapter">line1</span>
</p>
<p dir="rtl"><span class="regular">line10</span>
<span class="regular">line11</span>
</p>
<p dir="rtl"><span class="regular">line12</span>
</p>
<p dir="rtl"><span class="regular">line13.</span>
</p>
<p dir="rtl"><span class="regular">line14</span>
</p>
<p dir="rtl"><span class="regular">line15</span>
</p>
<h5>
<img src="images/test.jpg" width="35.00" height="30.00" alt="images.jpg"
/>
</h5>
<p dir="rtl"><span class="regular">line16.</span>
</p>
<p dir="rtl"><span class="regular">line17"</span>
</p>
</body>
</html>
Output:
<?xml version="1.0" encoding="UTF-8"?><html
xmlns="http://www.w3.org/1999/xhtml">
<body>
<p dir="rtl">
<span class="chapter">line1</span>
</p>
<p dir="rtl"><span class="regular">line10</span>
<span class="regular">line11</span>
<span class="regular">line12</span>
<span class="regular">line13.</span>
</p>
<p dir="rtl"><span class="regular">line14</span>
</p>
<p dir="rtl"><span class="regular">line15</span>
</p>
<h5>
<img src="images/test.jpg" width="35.00" height="30.00"
alt="images.jpg" />
</h5>
<p dir="rtl"><span class="regular">line16.</span>
</p>
<p dir="rtl"><span class="regular">line17"</span>
</p>
</body>
</html>
I thought the <h5> element should be grouped as a separate group
because of the condition group-ending-with="*[not(self::p)] ...
What should I change so the output will be:
<?xml version="1.0" encoding="UTF-8"?><html
xmlns="http://www.w3.org/1999/xhtml">
<body>
<p dir="rtl">
<span class="chapter">line1</span>
</p>
<p dir="rtl"><span class="regular">line10</span>
<span class="regular">line11</span>
<span class="regular">line12</span>
<span class="regular">line13.</span>
</p>
<p dir="rtl"><span class="regular">line14</span>
<span class="regular">line15</span>
<span class="regular">line16.</span>
</p>
<h5>
<img src="images/test.jpg" width="35.00" height="30.00"
alt="images.jpg" />
</h5>
<p dir="rtl"><span class="regular">line17"</span>
</p>
</body>
</html>
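
To check which groups are actually formed, a diagnostic sketch along these lines could be run against the input above. It reuses the group-ending-with pattern from the quoted stylesheet below and only reports group membership; it does not merge anything. The element names groups, group and member are made up for this illustration, and the use of xpath-default-namespace to reach the XHTML elements is an assumption, since the top of the original stylesheet is not shown here.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Diagnostic sketch only: lists how the children of body are grouped.
     The result element names (groups, group, member) are hypothetical. -->
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xpath-default-namespace="http://www.w3.org/1999/xhtml">

  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <groups>
      <!-- Same population and same ending pattern as the quoted stylesheet;
           the double quote in the regex is written as &quot; because the
           attribute itself is delimited by double quotes. -->
      <xsl:for-each-group select="//body/*"
          group-ending-with="*[not(self::p)] |
                             p[span/@class='chapter'] |
                             p[matches(span[last()], '[.?&quot;]$')]">
        <!-- position() is the number of the current group;
             current-group() holds every element in that group. -->
        <group number="{position()}" size="{count(current-group())}">
          <xsl:for-each select="current-group()">
            <member name="{local-name()}" text="{normalize-space(.)}"/>
          </xsl:for-each>
        </group>
      </xsl:for-each-group>
    </groups>
  </xsl:template>

</xsl:stylesheet>
```

Each group element in the report corresponds to one iteration of xsl:for-each-group, so the last member listed in a group is the element that the xsl:when test in the quoted stylesheet looks at.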
Thanks, Israel
On Wed, Jun 17, 2009 at 11:37 PM, G. Ken Holman <gkholman@xxxxxxxxxxxxxxxxxxxx> wrote:
> At 2009-06-17 23:11 +0300, Israel Viente wrote:
>>
>> I really appreciate your code and comments, but after reading it many
>> times, I still can't get to the bottom of the logic here.
>> I'm a newbie, so forgive my stupid questions.
>
> As I tell my students, questions are not stupid if they are asked
> sincerely. I far more appreciate the asking of questions than the
> ignoring of working code that was supplied as requested.
>
>> 1. Why do we need the outer most copy element:
>> > <xsl:template match="body">
>> > <xsl:copy>
>
> In order to preserve the body element when it comes time to group the
> paragraphs.
>
>> How does it work in combination with xsl:for-each-group?
>
> By being the parent of the elements being grouped, matching on <body> gives
> the stylesheet the opportunity to act on all of the children of body. The
> paragraphs you want to massage are children of the body, so the time to act
> on those children is at the time the body arrives at the stylesheet. Since
> we want the body element to be part of the result, we preserve it with
> <xsl:copy>.
>
>> 2. Can you please explain the group-ending-with selection?
>
> You can see by the select="*" that I have selected *all* of the children of
> the body. I want to act on those groups of adjacent <p> elements. But
> since there are other non-<p> elements that could be in the data (there
> aren't any in your data, but how often is a web page made solely of
> paragraphs?) I would be pulling those into the selection as well. After
> all, I want all of the children of body to be processed in child order, I
> only want to engage the special handling when I'm dealing with those
> children that are paragraphs.
>
> Yes your data sample only contained paragraphs, but I try to write my
> stylesheets defensively anticipating other conditions.
>
>> Why do we need *[not(self::p)] ? Doesn't it mean all except p elements?
>
> Indeed it does mean all except <p> elements. By putting non-<p> elements
> in their own group, they won't interfere with the groups that are
> comprised of <p> elements.
>
> So, adding more narrative to the stylesheet:
>
>> <xsl:template match="body">
>> <xsl:copy>
>> <xsl:copy-of select="@*"/>
>
> The above preserves the body element and any attributes that might be
> attached to it.
>
>> <xsl:for-each-group select="*"
>
> The above selects all of the children of the body.
>
>> group-ending-with="*[not(self::p)] |
>> p[span/@class='chapter'] |
>> p[matches(span[last()],
>> '[.?&quot;]$')]">
>
> The above creates a group for every non-paragraph, a group for every
> chapter, and a group for every consecutive sequence of paragraphs and ends
> that group with a paragraph with the desired punctuation.
>
>> <!--now the information is grouped by p elements that end as
>> required-->
>> <xsl:choose>
>> <xsl:when test="current-group()[last()]
>> [self::p][matches(span[last()],'[.?&quot;]$')]">
>
> The above tells me when I have encountered a group of <p> elements that
> ends with a paragraph with the desired punctuation.
>
>> <!--in a group of p elements that end as required-->
>> <xsl:copy>
>> <xsl:copy-of select="@*"/>
>
> The above preserves the *first* of those paragraphs, and its attributes.
>
>> <!--preserve the content of the first of these p elements-->
>> <xsl:apply-templates/>
>
> The above preserves the content of that paragraph.
>
>> <!--preserve only the span elements and indentation from the
>> rest;
>> (the indentation is needed because this is paragraph
>> white-space)-->
>> <xsl:apply-templates select="current-group()[position()>1]/
>> (text()[not(normalize-space())] |
>> span)"/>
>
> The above preserves only the content of the other paragraphs in the group.
> If there are no other paragraphs in the group, nothing else is added. If
> there are 15 other paragraphs in the group, all of the content of all of
> them is added. This is the generalized nature of the result: I'm not
> assuming that there is only one other paragraph.
>
>> </xsl:copy>
>> </xsl:when>
>> <xsl:otherwise>
>> <!--in another kind of group so just copy these using identity-->
>> <xsl:apply-templates select="current-group()"/>
>
> The above preserves all of the children of <body> that are not paragraphs
> or are chapter paragraphs.
>
>> </xsl:otherwise>
>> </xsl:choose>
>> </xsl:for-each-group>
>> </xsl:copy>
>> </xsl:template>
>
> I hope this has helped. Working directly with the sibling axes is fraught
> with problems because of the reach of these axes usually past where we want
> to stop looking. By looking *down* on the data, rather than left and
> right, one can see a different perspective of your requirement. You
> expressed your requirement by looking left and right from the given
> paragraph. I expressed your requirement by looking down at the paragraphs
> from the <body> parent.
>
> Good luck in your work with XML and XSLT! As you learn more I'm sure
> you'll love it more.
>
> . . . . . . . . . . . . Ken
>
> --
> Crane Softwrights Ltd. http://www.CraneSoftwrights.com/s/
> Training tools: Comprehensive interactive XSLT/XPath 1.0/2.0 video
> Video lesson: http://www.youtube.com/watch?v=PrNjJCh7Ppg&fmt=18
> Video overview: http://www.youtube.com/watch?v=VTiodiij6gE&fmt=18
> G. Ken Holman mailto:gkholman@xxxxxxxxxxxxxxxxxxxx
> Male Cancer Awareness Nov'07 http://www.CraneSoftwrights.com/s/bc
> Legal business disclaimers: http://www.CraneSoftwrights.com/legal