Re: [xsl] Copy-per-default

Subject: Re: [xsl] Copy-per-default
From: Wendell Piez <wapiez@xxxxxxxxxxxxxxxx>
Date: Wed, 20 Apr 2011 12:04:46 -0400

Fredrik,

The behavior you describe is the normal and expected behavior for a stylesheet. It is also usually what we want -- at least once we've gotten used to it.

What you are missing is the existence of built-in templates, as described here (this is the simpler XSLT 1.0 explanation):

http://www.w3.org/TR/xslt#built-in-rule

These do two things:

1. Ensure that at no time is a node called for processing without a template to match it;

2. Ensure that by default, a traversal of the entire input tree is performed (with the results you are seeing), with text contents (at the leaves of the tree) eventually copied out.

The second is usually regarded as a good thing, since it means that a document is always processed in toto without any extra effort on the programmer's part, irrespective of variations in how the source is organized.
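
For reference, the built-in rules from the XSLT 1.0 Recommendation can be written out as if they were ordinary templates. This is only a sketch: I have left out the per-mode variants, and in a real stylesheet the built-ins have lower import precedence than anything you write yourself, so any template of yours that matches a node wins over them.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Elements and the root node: emit nothing for the node itself,
       but keep walking down into its children. -->
  <xsl:template match="*|/">
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Text and attribute nodes: copy their string value to the output.
       This is where the "unexpected" text in the result comes from. -->
  <xsl:template match="text()|@*">
    <xsl:value-of select="."/>
  </xsl:template>

  <!-- Comments and processing instructions: do nothing. -->
  <xsl:template match="processing-instruction()|comment()"/>

</xsl:stylesheet>
```

Run on its own, a stylesheet like this should behave the same as one with no templates at all: every text node in the input ends up in the output, with all of the markup stripped.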

In fact, as you're finding, you have to go to extra effort to keep elements you don't want from being processed. This is done through a combination of matching (for example, empty templates that match an element but produce no result for it) and explicit selection of the nodes to process, using xsl:apply-templates/@select or xsl:for-each/@select.

Colloquially, we describe the former means of control as "push" (since the source data is being "pushed" through a set of templates) and the latter as "pull" (since we are explicitly pulling data out from the source for processing). A little searching on these terms in the context of XSLT will teach you much.
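
Applied to your example, the two approaches might look something like the sketch below. I am assuming that your d: prefix is bound to the DocBook 5 namespace and fo: to the usual XSL-FO namespace, and the xsl:value-of inside fo:block is only a stand-in for whatever your real template produces. In practice you would probably pick one approach or the other for a given element; both are shown together here just for comparison.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format"
    xmlns:d="http://docbook.org/ns/docbook">
  <!-- Namespace URIs above are assumptions; use whatever your
       stylesheet already declares for d: and fo:. -->

  <!-- Push: an empty template matches titleabbrev and produces nothing,
       so the built-in rules never get to copy its text. -->
  <xsl:template match="d:titleabbrev"/>

  <!-- Pull: process only the children named in @select; other children
       of chapter are never selected for processing here. -->
  <xsl:template match="d:chapter">
    <xsl:apply-templates select="d:title"/>
  </xsl:template>

  <!-- Placeholder body, standing in for your actual title formatting. -->
  <xsl:template match="d:chapter/d:title">
    <fo:block><xsl:value-of select="."/></fo:block>
  </xsl:template>

</xsl:stylesheet>
```

Note that the empty template is targeted at d:titleabbrev only, rather than at match="*", so it leaves the traversal of every other element alone.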

Regards,
Wendell

On 4/20/2011 11:24 AM, Fredrik Bengtsson wrote:
Hi,

I am using FOP trunk to generate PDFs from DocBook documents on the command line. Fop.bat is doing the XSL transformation, using whatever engine FOP uses (Xalan?). I have written the XSLT entirely myself, i.e. I am not using any default DocBook transform or similar. The transform is small and under my strict control.

I am having the problem that the transform does not behave as expected in two ways:
* Contents of nodes are being copied to the output as if there were some kind of identity transform in effect by default even though I have not written one, and
* Matches far down in the document cannot fetch data that existed earlier in the document, as if select="/x" selected the x post-transform instead of pre-transform


Imagine a document like this (ignoring namespaces etc for brevity):


<book>
   <titleabbrev>THEDOC</titleabbrev>
   <chapter>
     <title>Ch. 2: The chapter</title>
     <titleabbrev>Ch. 2</titleabbrev>
   </chapter>
</book>


If I have the following transforms in place:


<xsl:template match="/d:book">
   <!-- ignoring root, page-sequence etc for brevity -->
   <xsl:apply-templates />
</xsl:template>

<xsl:template match="d:chapter">
   <xsl:apply-templates />
</xsl:template>

<xsl:template match="d:chapter/d:title">
   <fo:block>  ... ...</fo:block>
</xsl:template>


Then for some reason the titleabbrev appears in the output even though I have not made any rule explicitly matching it. It is caught along with the title inside the apply-templates under d:chapter. I thought that this would not happen, unless I really added a matching template of some sort, for example an identity transform.



I then just for fun tried to add the following template:


<xsl:template match="*" />


That got rid of the offending titleabbrev, BUT it also had the effect of breaking another template that special-cases the first chapter:


<xsl:template match="d:chapter[1]">
   <xsl:variable name="abbr">
     <xsl:value-of select="/d:book/d:titleabbrev" />
   </xsl:variable>
   <!-- note: that selects a node that is higher up in the document -->
   <!-- now do something with $abbr -->
</xsl:template>


It seems that at that point, book/titleabbrev has already been transformed, i.e. removed due to the catch-all template above, so $abbr is empty. That strikes me as extremely strange; should the select not grab nodes from the original unmodified document? If I remove the catch-all, $abbr is set properly just as expected.


This is really confusing! And again - I am not using a huge third-party transform and modifying it, but rather using a really small, custom-written and strict one under my control.

/Fredrik



-- 
======================================================================
Wendell Piez                            mailto:wapiez@xxxxxxxxxxxxxxxx
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9635
Suite 207                                          Phone: 301/315-9631
Rockville, MD 20850                                  Fax: 301/315-8285
----------------------------------------------------------------------
  Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================
