Streaming XSL

Subject: Streaming XSL
From: "Oren Ben-Kiki" <oren@xxxxxxxxxxxxx>
Date: Tue, 23 Feb 1999 12:12:21 +0200
I've been thinking about using XSL on long documents (or "streams" as some
people call them :-). Currently any template may refer to any part of the
input document, so a naive XSL implementation would load the whole input
document into memory, and then start applying templates. For very long
documents this would be a disaster, of course.

On the other hand, a smarter XSL processor might be able to recognize when a
part of the input tree has been completely processed - that is, when it is
guaranteed that no other template would ever match nodes in it - and release
it from memory. In addition, the processor could start emitting results as
soon as possible instead of waiting for the whole input tree to be loaded.
Together this would produce a "streaming" effect which could handle
arbitrarily long documents.
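
To make the "streaming" idea concrete, here is a minimal sketch in Python
rather than XSL. It is not how any existing XSL processor works. It assumes
the interesting elements are direct children of the root, and that once an
element has been rendered nothing else will ever need it. The names
stream_transform, record_tag and render_item are made up for illustration.

```python
import io
import xml.etree.ElementTree as ET


def stream_transform(source, record_tag, render):
    """Yield one output chunk per record element, releasing each after use.

    Assumption: records are direct children of the root element and no later
    processing step needs a record once it has been rendered.
    """
    # iterparse builds the tree incrementally while the input is being read.
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == record_tag:
            # The element is fully parsed here, so output can be emitted
            # before the rest of the input has been read.
            yield render(elem)
            # Drop everything collected under the root so far, so memory use
            # does not grow with the length of the document.
            root.clear()


def render_item(elem):
    """Hypothetical stand-in for a template body."""
    return "<li>" + (elem.text or "") + "</li>"


doc = io.BytesIO(b"<list><item>a</item><item>b</item><item>c</item></list>")
output = list(stream_transform(doc, "item", render_item))
```

The hard part, which this sketch simply assumes away, is knowing that no
other template will want the element later.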

What I'm not certain of is whether it is possible, given XSL template
semantics, to create such an XSL implementation. I think that it is possible
given a DTD for the input document. I'm not certain about general input,
though. If this is impossible, or just very difficult to do, then it would
help if a "complete" attribute were added to <xsl:template>.


<xsl:template match="..." mode="..." complete="yes">...</xsl:template>

This would declare that if this template matches an element of the input
tree, it is the only one to do so. No other template has matched it "before",
nor will any other template match it "afterwards". This makes this template the
specification of the _complete_ transformation for this particular input
tree element.

A streaming implementation would become trivial in the presence of such an
attribute - the processor could just dump the element after any "complete"
template matched it. Note that it would also be allowed to keep the element,
and it could complain if the element was, or later is, matched by any other
template. It would also be possible to check whether a stylesheet with
"complete" templates is valid for a given input DTD, perhaps as part of a
test of whether the stylesheet converts a given input DTD into a given
output one.
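
As a rough illustration of how a processor might use such an attribute, here
is a toy dispatcher in Python. It is only a sketch under strong assumptions:
patterns are reduced to plain element names, modes are ignored, and the
"complain" check is done up front on the template list rather than against a
DTD. Template, CompletenessError, check_complete and run are hypothetical
names, not part of any XSL processor.

```python
import io
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Callable


@dataclass
class Template:
    match: str                           # element name only (simplified pattern)
    render: Callable[[ET.Element], str]  # stand-in for the template body
    complete: bool = False               # the proposed "complete" attribute


class CompletenessError(Exception):
    """Raised when a "complete" template is not the only match for an element."""


def check_complete(templates):
    """Complain if any other template matches what a "complete" template matches."""
    for t in templates:
        if not t.complete:
            continue
        if any(o is not t and o.match == t.match for o in templates):
            raise CompletenessError(
                f"element '{t.match}' is matched by more than one template"
            )


def run(source, templates):
    """Apply templates as elements finish parsing; release "complete" matches."""
    check_complete(templates)
    out = []
    for _, elem in ET.iterparse(source, events=("end",)):
        for t in templates:
            if t.match == elem.tag:
                out.append(t.render(elem))
                if t.complete:
                    # Nothing else may match this element, so its content
                    # can be dropped from memory right away.
                    elem.clear()
    return out


def render_item(elem):
    return "<li>" + (elem.text or "") + "</li>"


doc = b"<list><item>a</item><item>b</item></list>"
result = run(io.BytesIO(doc), [Template("item", render_item, complete=True)])
```

A second template matching "item" would make run() raise CompletenessError,
which corresponds to the "complain" behaviour described above.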

The downside is that it would require the designers to add such annotations
by hand. It would be better for the XSL processor to compute this
automatically.

Has anyone given thought to this problem? Is any XSL processor implemented
along these lines?

Share & Enjoy,

    Oren Ben-Kiki
