Re: [xsl] question about identity transform

Subject: Re: [xsl] question about identity transform
From: Abel Braaksma <abel.online@xxxxxxxxx>
Date: Tue, 31 Oct 2006 21:48:44 +0100
Bill French wrote:

> I've used the identity transform many times to do useful things and have often wondered about the match pattern. Why is the match pattern
>
> "node() | @*"
>
> rather than simply "node()"? Aren't attributes returned by node()?

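As a node test, node() can match any kind of node, but a step written without an explicit axis defaults to the child axis, and attributes are never children of their element. So the above is actually:

"child::node() | attribute::*"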

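To see what goes wrong with a bare "node()", here is a minimal sketch (the sample input in the comment is made up for illustration). The attribute is still selected by the apply-templates, but the template no longer matches it, so it falls through to the built-in rule for attributes, which outputs only its string value:

```xml
<!-- Assumed input: <a x="1"><b/></a> -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes"/>

  <!-- Deliberately incomplete: @* is missing from the match pattern -->
  <xsl:template match="node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
<!-- Result: <a>1<b/></a>  (the attribute x has become a text node) -->
```
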
The identity template looks in full as follows (variations exist):

<xsl:template match="node() | @*">
  <xsl:copy>
    <xsl:apply-templates select="@* | node()"/>
  </xsl:copy>
</xsl:template>

Chopped into pieces:

<xsl:template match="node() | @*">
Order is not important here. I tend to read the `|' operator as an "or", which helps my understanding (it is in fact the same union operator that shows up in the select expression later). All it says is: if the node being processed is a child node of any kind (element, text, comment or processing instruction), go here; if it is an attribute, go here too.


What is important in this declaration is that it is as generic as possible. Because the template matching rules give a more specific pattern a higher default priority, this very generic template will be overridden whenever you declare something more specific, such as a match on an element name or an attribute name.
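
A sketch of such an override, with made-up names (the id attribute and the note and remark elements are only there for illustration): the identity template copies everything, and the two more specific templates take over for the nodes they match.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Generic identity template (default priority -0.5) -->
  <xsl:template match="node() | @*">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- More specific (default priority 0): drop every id attribute -->
  <xsl:template match="@id"/>

  <!-- More specific: rename note elements to remark, keep their content -->
  <xsl:template match="note">
    <remark>
      <xsl:apply-templates select="@* | node()"/>
    </remark>
  </xsl:template>
</xsl:stylesheet>
```

No priority attributes are needed here: the name tests win over node() and @* on default priority alone.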

<xsl:copy>
Copy the node. In effect, this means it will copy *only* the current node and not its children or attributes. It will, however, copy the namespace nodes of an element, if any. You could consider using <xsl:copy-of select="."/> instead, but that would change the identity template rather drastically: it copies the whole subtree in one go and no longer lets you override individual nodes or attributes by just declaring a more specific template.
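
The difference between the two instructions can be sketched side by side. The input document, the mode names and the wrapper elements below are assumptions made up for the illustration:

```xml
<!-- Assumed input: <a x="1"><b/></a> -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes"/>

  <xsl:template match="/">
    <result>
      <shallow><xsl:apply-templates select="a" mode="shallow"/></shallow>
      <deep><xsl:apply-templates select="a" mode="deep"/></deep>
    </result>
  </xsl:template>

  <!-- xsl:copy: the element itself only, no attributes, no children -->
  <xsl:template match="a" mode="shallow">
    <xsl:copy/>
  </xsl:template>

  <!-- xsl:copy-of: the whole subtree, with no chance to intervene -->
  <xsl:template match="a" mode="deep">
    <xsl:copy-of select="."/>
  </xsl:template>
</xsl:stylesheet>
<!-- Result: <result><shallow><a/></shallow><deep><a x="1"><b/></a></deep></result> -->
```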


<xsl:apply-templates select="@* | node()"/>
This, ultimately, says: "take all attributes of the current node and find the best matching template for each of them. Then take all child nodes (the child axis only, not all descendants) and find the best matching template for each of those." The `|' operator is the union operator and combines both node sets.


What is tricky here is that node() by itself, as a node test, can match anything. But when used without an axis in a select expression, it only selects nodes along the child axis (I peeked, it's detailed in XPath 2.0 Programmer's Reference, page 280). Attributes are never on the child axis. As such, you need to select them explicitly via the attribute axis.

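If you reversed it to, say, `node() | @*', you might expect the child nodes to be copied first and the attributes to arrive too late: you may not add an attribute to an element after children have been added to it in the result tree. In practice the reversed form behaves the same, because a union is a node set and xsl:apply-templates processes it in document order unless you sort it, and an element's attributes come before its children in document order. Writing the attributes first is still the clearer convention.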
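
The rule that attributes must be added before children is easiest to see with two separate instructions rather than a union. A sketch of what not to do (in XSLT 1.0 a processor may either signal an error or silently drop the late attributes):

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Broken on purpose: children are written before attributes -->
  <xsl:template match="node() | @*">
    <xsl:copy>
      <xsl:apply-templates select="node()"/>
      <!-- Too late for any element that already received a child -->
      <xsl:apply-templates select="@*"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

Swapping the two apply-templates instructions, or going back to the single union, restores the normal identity behaviour.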


I hope I got the bits and pieces right (no doubt someone will correct me if otherwise) and that it clears things up a bit for you.


Cheers,
-- Abel Braaksma
  http://www.nuntia.com
