RE: [xsl] How to create a node set that excludes some descendant elements?

Subject: RE: [xsl] How to create a node set that excludes some descendant elements?
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Tue, 12 Apr 2005 08:52:30 +0100
I think you are a little confused between the terms "node-set" and
"result-tree-fragment". (Not surprising; most people are.)

You select a node-set using 

<xsl:variable name="n" select="--- some path expression ---"/>

The result is a set of nodes - I actually prefer to think of it as a set of
*references* to nodes. These are original nodes in the source document, and
they retain their original position in the source document, which means for
example that you can process each node in the set to ask how many ancestors
it has.
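
As a rough sketch of that point (the path /a/b/c//y is only an assumption
based on the document structure in the original question), the following
stylesheet selects the original <y> elements and prints how many element
ancestors each one has. Because the nodes still live in the source document,
every count includes at least <a>, <b> and <c>.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- $n holds references to the original <y> elements; nothing is copied.
       The path is an assumption taken from the question's sample. -->
  <xsl:variable name="n" select="/a/b/c//y"/>

  <xsl:template match="/">
    <xsl:for-each select="$n">
      <!-- The ancestor axis still reaches up into the source tree -->
      <xsl:value-of select="count(ancestor::*)"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```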

You create a result tree fragment (or in 2.0 terminology a temporary tree)
using

<xsl:variable name="n">
  --- some instructions ---
</xsl:variable>

The result is a new document. (The term "fragment" comes from DOM, and means
a document that isn't constrained to have a single element at the top
level.) The nodes in this tree are newly constructed nodes; they may be
copies of nodes in the source tree (or not) but they have separate identity
and have lost their relationships to other nodes in the source tree.
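
To illustrate the loss of context, here is a minimal XSLT 2.0 sketch (the
variable name "copy" and the path /a/b/c//y are assumptions made for the
example). It copies the <y> elements into a temporary tree and then counts
their element ancestors. Each copy sits directly under the new document
node, so the count is 0 regardless of how deeply the original was nested.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- A temporary tree: these <y> elements are new nodes, not the originals -->
  <xsl:variable name="copy">
    <xsl:copy-of select="/a/b/c//y"/>
  </xsl:variable>

  <xsl:template match="/">
    <xsl:for-each select="$copy/y">
      <!-- The parent is the temporary document node, so this yields 0 -->
      <xsl:value-of select="count(ancestor::*)"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

In XSLT 1.0 the path $copy/y is not allowed on a result tree fragment; you
would first convert it with an extension such as the exslt:node-set()
function mentioned in the original message.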

The example that you've given suggests that you actually want to create a
new tree that is a selective copy of the original tree. The way to do this
is to walk the original tree applying templates. If a node is to be copied
into the new tree, you apply the identity template; if it is to be removed,
you apply an empty template. You can also have templates that modify
selected elements. So it looks something like this:

<xsl:variable name="subset">
  <xsl:apply-templates select="a" mode="subset"/>
</xsl:variable>

<!-- by default, an element is copied -->

<xsl:template match="*" mode="subset">
  <xsl:copy>
    <xsl:copy-of select="@*"/>
    <xsl:apply-templates mode="subset"/>
  </xsl:copy>
</xsl:template>

<!-- I want to include only the first <y> element that is contained within
  <c>, no matter where it occurs. There may be no <y> elements present. -->

<xsl:template match="y[not(. is (ancestor::c//y)[1])]" mode="subset"/>

<!-- I want to exclude all <z> elements that are contained within <c>, no 
  matter where they occur. Again, there may be none present. -->

<xsl:template match="z" mode="subset"/>

I used the XPath 2.0 "is" operator for comparing node identity here. The 1.0
equivalent of "A is B" is generate-id(A)=generate-id(B).
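
Putting that substitution into the pattern, a complete XSLT 1.0 version
might look like the sketch below. The final template matching "/" is an
addition for illustration only, so that the contents of the variable can be
inspected; it is not part of the approach itself.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Build the filtered copy as a result tree fragment -->
  <xsl:variable name="subset">
    <xsl:apply-templates select="/a" mode="subset"/>
  </xsl:variable>

  <!-- By default, copy an element and its attributes, then its children -->
  <xsl:template match="*" mode="subset">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates mode="subset"/>
    </xsl:copy>
  </xsl:template>

  <!-- Drop every <y> that is not the first <y> below its <c> ancestor.
       generate-id() comparison stands in for the XPath 2.0 "is" operator. -->
  <xsl:template
      match="y[generate-id(.) != generate-id((ancestor::c//y)[1])]"
      mode="subset"/>

  <!-- Drop every <z> -->
  <xsl:template match="z" mode="subset"/>

  <!-- Illustration only: write out the variable so it can be inspected -->
  <xsl:template match="/">
    <xsl:copy-of select="$subset"/>
  </xsl:template>

</xsl:stylesheet>
```

To process $subset further in 1.0 (rather than just copying it), you would
still need exslt:node-set() as in your current stylesheet.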

Michael Kay
http://www.saxonica.com/



> -----Original Message-----
> From: Rush Manbert [mailto:rush@xxxxxxxxxxx] 
> Sent: 12 April 2005 00:57
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: [xsl] How to create a node set that excludes some 
> descendant elements?
> 
> Hi all,
> 
> My first post here and I want to start by saying how much I 
> appreciate 
> the big FAQ, the Jeni site, etc. It has all helped me tremendously.
> 
> I can't find an answer to this one, though, so here goes...
> 
> My XML doc has this basic structure:
> <a>
>   <b>
>     <c>
>       <!-- This is the section of interest -->
>     </c>
>   </b>
> </a>
> 
> The <c> element can contain any combination of elements <d> 
> through <z>. 
> Elements <y> and <z> have special uses.
> 
> I want to create a global variable that contains the result tree 
> fragment contained within element <c>, with the following 
> restrictions:
> I only want to include the first <y> element that is contained within 
> <c>, no matter where it occurs. There may be no <y> elements present.
> I want to exclude all <z> elements that are contained within <c>, no 
> matter where they occur. Again, there may be none present.
> 
> Later on in my stylesheet, I use exslt:node-set() on the variable and 
> process the node set.
> 
> For instance, given this source:
> <a><b><c>
>   <d>
>     <z />
>     <g />
>   </d>
>   <q>
>     <r>
>       <y />
>       <z />
>     </r>
>     <y />
>   </q>
>   <y />
> </c></b></a>
> 
> I want the selection to contain this:
> <a><b><c>
>   <d>
>     <g />
>   </d>
>   <q>
>     <r>
>       <y />
>     </r>
>   </q>
> </c></b></a>
> 
> (<z> elements are gone, only the first <y> element remains.)
> 
> I have tried many variations on the select portion of the variable 
> definition. I can filter the immediate children of <c>, OR the second 
> level children, etc., but I can't seem to come up with anything that 
> handles <y> and <z> appearing at any depth in the descendant tree.
> 
> I'm prepared to be humiliated by some obvious solution... Can anyone 
> please help?
> 
> Thanks,
> Rush
