RE: [xsl] Dynamic pipelining in XSLT 2.0 w/ Saxon extensions

Subject: RE: [xsl] Dynamic pipelining in XSLT 2.0 w/ Saxon extensions
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Tue, 19 Jun 2007 08:58:27 +0100
> * This runs (tested under the current Saxon 8.9), but how 
> will it scale? In particular, Mike Kay may be able to say 
> whether compiled stylesheets are cached when this is run 
> over a set of documents. 
> If not, wouldn't compiling each stylesheet anew for each 
> input document be an impediment?

I'm not sure of the architecture you are using. If you start a new
"transformation" to process each "set of documents", then the variables
associated with the transformation (which include any compiled stylesheets)
will of course be lost. On the other hand, if you process multiple "sets of
documents" within a single transformation, then the problem is that the
documents will be held in memory unless you explicitly discard them using
saxon:discard-document().
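
As a rough sketch of the second arrangement (one transformation over many
documents), something like the following could be tried. It is untested and
rests on several assumptions: the source is a hypothetical manifest of the
form <files><file href="a.xml"/></files>, the step stylesheet name
"step1.xsl" and the "out/" location are placeholders, and the extension
functions are the ones I believe Saxon 8.9 provides as saxon:compile-stylesheet(),
saxon:transform() and saxon:discard-document().

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://saxon.sf.net/"
    exclude-result-prefixes="saxon">

  <!-- Hypothetical parameter: URI of the pipeline step to compile -->
  <xsl:param name="stylesheet" select="'step1.xsl'"/>

  <!-- Global variable: compiled once per transformation, then reused
       for every document listed in the manifest -->
  <xsl:variable name="process"
      select="saxon:compile-stylesheet(doc($stylesheet))"/>

  <!-- Assumed manifest format: <files><file href="..."/></files> -->
  <xsl:template match="/files">
    <xsl:for-each select="file">
      <xsl:result-document href="out/{@href}">
        <!-- discard-document() releases the source from the document
             pool so it need not stay in memory for the whole run -->
        <xsl:copy-of select="saxon:transform($process,
            saxon:discard-document(document(@href)))"/>
      </xsl:result-document>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Within that single run the stylesheet is compiled once, however many files
the manifest lists; starting a fresh transformation per file would compile it
again each time.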

At present certain things are never "early-evaluated" in Saxon, even if all
the arguments are known at compile time. These include the doc() function
and extension functions - the theory being that the results of these calls
might depend on external information that changes between compile time and
run-time. So a variable such as <xsl:variable name="process"
select="saxon:compile-stylesheet(doc($stylesheet))"/>, even if promoted to
be a global variable, would be evaluated anew on each transformation. It
would be nice to provide additional options in this area - not just for this
use case, but a more general capability.
> 
> * Are there any obvious pitfalls or problems with this 
> approach? (Or any not so obvious?) How does it compare to 
> other methods?

I'm inclined to think that a general purpose pipeline processor will do the
job better. It's likely to have memory management that's better adapted to
this kind of work, and debugging facilities to examine the documents at any
stage of the pipeline or to switch validation of intermediate steps on and
off, etc. If you're lucky it might even allow distributed or asynchronous
execution of the pipeline.

Michael Kay
http://www.saxonica.com/
