Subject: Re: [xsl] select immediately following siblings with constraints? From: Wendell Piez <wapiez@xxxxxxxxxxxxxxxx> Date: Thu, 23 Feb 2006 12:43:51 -0500 |
Cheers, Wendell
Hello,
This is a very long message, and I apologize for that. I've split this question into a short-ish version followed by a very long version. I am using XSL 2.0 draft syntax (as implemented by Saxon).
The short version of my question: Is it possible to write an XPath expression which selects the immediately adjacent following-siblings which adhere to a specific criteria? I'm using a combination of xsl:if and xsl:for-each-group (with group-adjacent), but I am wondering if there is a way to do it more directly.
Say I'm processing a list of elements and I want the 'a' elements which are adjacent to my current context:
<a id="1"/> <b id="2"/> <a id="3"/> <a id="4"/> <d id="5"/> <a id="6"/>
if current() is a[@id="1"], I want my XPath expression to evaluate to nothing since the element following current() (following-sibling::element()[1]) is '<b id="2"/>'.
If my input sequence was
<a id="1"/> <a id="3"/> <a id="4"/> <d id="5"/> <a id="6"/>
and current() is a[@id="1"], I'd want the expression to select the sequence
<a id="3"/> <a id="4"/>
I would not want it to include <a id="6"/>, since the chain was interrupted by '<d id="5"/>'. As I mentioned above, I'm currently handling this using combination of xsl:if and xsl:for-each-group. Is there another way?
If you have time to be looking over the long-winded version, I've tried to outline what I'm working on and where I'm getting confused on how to Do Things Properly.
I'm working out a stylesheet to perform a rather nasty operation: turning two sets of information selected from an SQL database into a hierarchy. I'm dealing with the aspect of performing a merge of adjacent nodes, and was hoping folks here would be willing to look my logic over and tell me if there are flaws, or if there is a much simpler way to handle the problem.
I've got the following sort of input (simplified), which as you can see is in a rather haphazard order.
<input> <sections> <!-- randomly ordered sections, we do know that all tier_num values greater than 1 will have a parent defined. --> <section id="1.2" tier_num="2" parent_id="1">Section CDG</section> <section id="1.1" tier_num="2" parent_id="1">Section ABC</section> <section id="2" tier_num="1">Section XYZ</section> <section id="1" tier_num="1">Section ZXY</section> </sections> <articles> <!-- articles are ordered as they should appear in output --> <article id="1" section="1.2"/> <article id="2" section="2"/> <article id="3" section="1.1"/> <article id="4" section="1"/> <article id="5" section="1.2"/> </articles> </input>
The real input can actually have an arbitrary number of tiered sections going to an arbitrary depth. Given the sample input, I need to output the articles in the listed order with the proper section information:
<output> <section> <heading>Section ZXY</heading> <section> <heading>Section CDG</heading> <article id="1"/> </section> </section> <section> <heading>Section XYZ</heading> <article id="2"/> </section> <section> <heading>Section ZXY</heading> <section> <heading>Section ABC</heading> <article id="3"/> </section> <article id="4"/> <section> <heading>Section CDG</heading> <article id="5"/> </section> </section> </output>
I've written a stylesheet which does this transformation, but I'm worried about both efficiency and about whether or not I'm missing a more obvious way to solve the problem.
My current working solution involves creating a series of node sets as variables, each variable annotated with more information and/or nesting than the previous, I've broken it down into the following variable creation steps:
1) generate list of sections with article children, 2) fill in any parent sections which are missing, 3) nest sections per parent_id, 4) merge sequential sections which share a section id,
The first 2 steps are dealing with lists of sections. I'm using xsl:for-each-group with the group-adjacent selector to merge adjacent sets of articles into single groups, and wrapping each group in its appropriate section. I'm then taking this flat list of sections and inserting any missing sections (ones which did not get created on the first pass because they didn't happen to have an associated article in them). I'm left with a flat list of sections:
<section id="1" tier_num="1"/> <section id="1.2" tier_num="2" parent_id="1"> <article id="1"/> </section> <section id="2" tier_num="1"> <article id="2"/> </section> <section id="1" tier_num="1"/> <section id="1.1" tier_num="2" parent_id="1"> <article id="3"/> </section> <section id="1" tier_num="1"> <article id="4"/> </section> <section id="1.2" tier_num="2" parent_id="1"> <article id="1.2"/> </section>
My next step is to nest them:
<xsl:variable name="nested_sections">
<xsl:apply-templates select="$all_sections/section[@tier_num = 1]" mode="nest_sections" />
</xsl:variable>
...
<xsl:template match="section" mode="nest_sections" as="element(section)">
<xsl:copy>
<xsl:sequence select="@*|*" />
<xsl:apply-templates
select="following-sibling::section[
(@tier_num = (current()/@tier_num + 1))
and (current() is preceding-sibling::section[@tier_num=current()/@tier_num][1])]"
mode="#current" />
</xsl:copy>
</xsl:template>
which leaves me with
<section id="1" tier_num="1"> <section id="1.1" tier_num="2" parent_id="1"> <article id="1"/> </section> </section> <section id="2" tier_num="1"> <article id="2"/> </section> <section id="1" tier_num="1"> <section id="1.1" tier_num="2" parent_id="1"> <article id="3"/> </section> </section> <section id="1" tier_num="1"> <article id="4"/> <section id="1.2" tier_num="2" parent_id="1"> <article id="1.2"/> </section> </section>
I then merge adjacent sections which share the same section id. I came up with two solutions. I believe the first one is flawed, and I'm unsure about the efficiency of the second. I'm calling both solutions with a named template which recurses if it detects that adjacent sections with identical ids remain:
<xsl:variable name="merged_sections">
<xsl:call-template name="merge_sections">
<xsl:with-param name="sections" select="$nested_sections"/>
</xsl:call-template>
</xsl:variable>
...
<xsl:template name="merge_sections">
<xsl:param name="sections" as="document-node()"/>
<xsl:choose>
<xsl:when test="exists($sections//section[@id = preceding-sibling::element()[1][self::section]/@id])">
<xsl:call-template name="merge_sections">
<xsl:with-param name="sections">
<xsl:apply-templates select="$sections" mode="merge_sections"/>
</xsl:with-param>
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<xsl:sequence select="$sections"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
As you can see, I basically tried to construct a "while" loop which tested for adjacent sections with identical ids. I used the XPath expression
//section[@id = preceding-sibling::element()[1][self::secton]/@id]
instead of
//section[@id = preceding-sibling::section[1]/@id]
to handle a case where two sections with identical ids are separated by one or more articles. For example:
<section id="1" tier_num="1"> <article id="1"/> <section id="1.2" tier_num="2" parent_id="1"> <article id="2"/> </section> <article id="3"/> <section id="1.2" tier_num="2" parent_id="1"> <article id="4"/> </section> </section>
This is a case where I do not want to perform a merge, as it would result in the articles being listed out of order (article 3 would end up coming after article 4).
To handle the actual merging, I came up with two solutions, the first has same sort of problem I was trying to deal with in my "while" loop test (using [1][self::section]). I'm at a loss at how to only select immediately adjacent following-siblings (note the plural). My 1st solution fails to account for that when calling xsl:apply-templates:
<!-- solution #1, discarded as being incorrect -->
<xsl:template match="section" mode="merge_sections" as="element(section)?">
<!--
If the immediately preceding element is a section which has the
same id, it should have already been handled. Otherwise we want
to copy it.
-->
<xsl:if test="not(preceding-sibling::element()[1][self::section][@id=current()/@id])">
<xsl:copy>
<xsl:sequence select="@*" />
<xsl:apply-templates select="*" mode="#current"/>
<!--
Merge in any adjacent sections which have the same id
-->
<xsl:apply-templates select="
following-sibling::section[
(@id = current()/@id)
and (current() is preceding-sibling::section[@id != current()/@id][1]/following-sibling::section[1])]/*"
mode="#current"/>
</xsl:copy>
</xsl:if>
</xsl:template>
My second solution combines xsl:if and xsl:for-each-group to try and ensure we only operate on the following-sibling section(s) if they are immediately adjacent and share the same section id:
<!-- solution #2, but is it inefficient? -->
<xsl:template match="section" mode="merge_sections" as="element(section)?">
<!--
If the immediately preceding element is a section which has the
same id, it should have already been handled. Otherwise we want
to copy it.
-->
<xsl:if test="not(preceding-sibling::element()[1][self::section][@id=current()/@id])">
<xsl:copy>
<xsl:sequence select="@*" />
<xsl:apply-templates select="*" mode="#current"/>
<xsl:variable name="context" select="current()"/>
<!--
Merge in any adjacent sections which have the same id
-->
<xsl:if test="following-sibling::element()[1][self::section][@id = current()/@id]">
<xsl:for-each-group select="following-sibling::section" group-adjacent="@id">
<xsl:if test="$context is current-group()[1]/preceding-sibling::element()[1]
and $context/@id = current-group()[1]/@id">
<xsl:apply-templates select="current-group()/*" mode="#current"/>
</xsl:if>
</xsl:for-each-group>
</xsl:if>
</xsl:copy>
</xsl:if>
</xsl:template>
So, if I understand what I am doing correctly, I am capturing the groups of following-sibling sections, then testing each grouping to see if the sequence is immediately adjacent to my $context, and that it has the same id.
Is there a better way, along the lines of
./following-sibling::element()[1][self::section]
where we are not limited to just the first element? Where we can select all of the consecutive following-sibling section elements? Is it wasteful to be running this grouping operation when all I really want are just the next handful of section elements?
Do you folks see a simpler way to approach the problems? Am I going about this the wrong way?
====================================================================== Wendell Piez mailto:wapiez@xxxxxxxxxxxxxxxx Mulberry Technologies, Inc. http://www.mulberrytech.com 17 West Jefferson Street Direct Phone: 301/315-9635 Suite 207 Phone: 301/315-9631 Rockville, MD 20850 Fax: 301/315-8285 ---------------------------------------------------------------------- Mulberry Technologies: A Consultancy Specializing in SGML and XML ======================================================================
Current Thread |
---|
|
<- Previous | Index | Next -> |
---|---|---|
[xsl] select immediately following , James A. Robinson | Thread | RE: [xsl] select immediately follow, Michael Kay |
Re: [xsl] Fwd: Five xdt:* types ren, Frans Englich | Date | Re: [xsl] How to prevent that XML a, Wendell Piez |
Month |