Re: [xsl] Q on incremental processing and count()

Subject: Re: [xsl] Q on incremental processing and count()
From: Jeni Tennison <jeni@xxxxxxxxxxxxxxxx>
Date: Tue, 19 Feb 2002 13:18:32 +0000

Hi Enke,

> Normally the XSLT machine should emit the transformed code as soon
> as possible. But I found that this breaks (at least for
> Xalan-J) if I count nodes with count("node") for nodes which were
> earlier in the stream.

In fact "normally" (given the processing model for XSLT), the XSLT
processor should construct the entire tree representing the source XML
prior to starting the transformation, and build up the entire tree
representing the result of the transformation prior to serializing
that result.

This processing model reflects the fact that the stylesheet may need
to access information about the entire source document rather than
just the part that it's already seen (for example if you wanted to
count all the nodes in the source XML). And it reflects the fact that
parts of the result might be constructed through parallel processes
and only collected together at the last moment.
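
To make that concrete, here is a minimal sketch in Java using the
standard JAXP API. The class name, the sample XML and the stylesheet
are all made up for illustration; they are not your actual files. The
template for the table element writes the row total before any row, so
the processor needs to have seen every tr before it can emit the first
piece of output.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class CountDemo {
    // Hypothetical source document: a table with three rows.
    static final String XML =
        "<table><tbody><tr>a</tr><tr>b</tr><tr>c</tr></tbody></table>";

    // Hypothetical stylesheet: the total is output *before* the rows,
    // so it depends on nodes that come later in the source.
    static final String XSL =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='table'>"
      + "rows=<xsl:value-of select='count(tbody/tr)'/>;"
      + "<xsl:apply-templates select='tbody/tr'/>"
      + "</xsl:template>"
      + "<xsl:template match='tr'>[<xsl:value-of select='.'/>]</xsl:template>"
      + "</xsl:stylesheet>";

    // Runs the stylesheet over the source with whichever JAXP processor
    // is on the classpath and returns the serialized result.
    static String transform(String xml, String xsl) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)),
                        new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(XML, XSL));
    }
}
```

The result starts with the total (rows=3) even though the third row is
the last thing in the source, which is why a processor cannot safely
start writing output for this template until the whole table has been
read.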

A clever processor might take the source as a stream and construct as
much of the result as it can given the information that it has
obtained from the source, gradually building up the result tree, and
emitting as much of the result tree as it reasonably can do at any
point. It sounds as if Xalan is managing to be this clever when you
don't include a count() function in your stylesheet.

However, from the symptoms that you describe (I'm guessing here - I
don't know how Xalan works internally), as soon as you do include a
count() function, Xalan thinks that the stylesheet involves using
information that is not yet accessible to it, and so reverts to the
more common processing model where it constructs the whole source tree
prior to transformation.

If the incremental processing is very important to you, what I suggest
you do is talk to the implementers of Xalan-J to find out the details
of what features a stylesheet must have to allow incremental
processing to occur. It might be, for example, that xsl:number is
allowed but count() isn't, or that you can count nodes before the
current node in document order but not after it (which would mean you
could count the columns in the template matching the tbody element
fine, and it was the fact you were doing the count within the template
matching the table element that was causing the problem).
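
As a sketch of what "counting backwards" could look like, the Java
snippet below numbers each row using only nodes that precede it, once
with count() on the preceding-sibling axis and once with xsl:number.
The class name, sample XML and stylesheet are invented for
illustration, and whether Xalan-J actually keeps processing
incrementally for either form is exactly the question to put to its
implementers; the sketch only shows the two formulations.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class BackwardCountDemo {
    // Hypothetical source document.
    static final String XML =
        "<table><tbody><tr>a</tr><tr>b</tr><tr>c</tr></tbody></table>";

    // Each row is labelled twice: by counting earlier siblings, and by
    // xsl:number. Neither needs any node after the current row.
    static final String XSL =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='tr'>"
      + "<xsl:value-of select='count(preceding-sibling::tr) + 1'/>"
      + "=<xsl:number/>"
      + ":<xsl:value-of select='.'/>;"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet with the default JAXP processor.
    static String transform(String xml, String xsl) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)),
                        new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(XML, XSL));
    }
}
```

Both numbers agree for every row, so if one form turns out to disable
incremental output and the other doesn't, you can swap between them
without changing the result.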

Also, you might have to turn incremental processing on explicitly,
from what I can see at:

  http://xml.apache.org/xalan-j/dtm.html#incremental

and:
  
  http://xml.apache.org/xalan-j/usagepatterns.html#incremental
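
As far as I can tell from those pages, the switch is a
TransformerFactory attribute. The sketch below is my reading of the
documentation rather than something I've verified against Xalan's
internals: the attribute URI is the one I understand the Xalan-J docs
to give, and the class and method names are made up for illustration.
Other processors will simply reject the attribute, which the helper
reports rather than treating as fatal.

```java
import javax.xml.transform.TransformerFactory;

public class IncrementalFlag {
    // Attribute name as I understand it from the Xalan-J documentation
    // (an assumption; check it against the version you are running).
    static final String FEATURE_INCREMENTAL =
        "http://xml.apache.org/xalan/features/incremental";

    // Asks the factory for incremental transformation. Returns false if
    // the underlying processor does not recognise the attribute.
    static boolean requestIncremental(TransformerFactory factory) {
        try {
            factory.setAttribute(FEATURE_INCREMENTAL, Boolean.TRUE);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        TransformerFactory factory = TransformerFactory.newInstance();
        System.out.println(factory.getClass().getName()
            + " accepted incremental attribute: "
            + requestIncremental(factory));
    }
}
```

Set the attribute before you create the Transformer from the factory,
and make sure it really is Xalan-J that TransformerFactory.newInstance()
is handing back; the class name printed by main will tell you.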

Another (complementary) approach would be to rethink the way that the
application as a whole works. Perhaps you can get your database to
generate 20 results at a time rather than 200. Perhaps there are other
things in your stylesheet that you can optimise such that the
processing time is greatly reduced and it therefore doesn't matter
whether it's generated incrementally or not.

Cheers,

Jeni

---
Jeni Tennison
http://www.jenitennison.com/


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

