Re: [xsl] XSLT streaming: the processor "remembers" things as it descends the XML tree?

From: Wendell Piez <wapiez@xxxxxxxxxxxxxxx>
Date: Wed, 20 Nov 2013 10:25:48 -0500

I cannot resist writing color commentary to Ken's excellent explanation.

On Wed, Nov 20, 2013 at 8:38 AM, G. Ken Holman
<gkholman@xxxxxxxxxxxxxxxxxxxx> wrote:
>> If my XSLT program accesses ancestor nodes, that seems to require the XSLT
>> processor to back up. And isn't that a violation of the fundamental law of
>> streaming, "The XSLT processor shall not back up"?
> I don't think so ... you haven't finished with that ancestor yet, so you
> really haven't left it.  It is still "open" in that you are still in the
> process of acting on that ancestor.  I think "backing up" is the issue only
> once you have left the node and moved on.  In the case of ancestors, you
> haven't left them yet.

Indeed. Remember that the XPath 'preceding' axis includes only nodes
that have started *and finished* (opened and closed) before the
context node. Ancestors are still open at that point, so they are
never on the 'preceding' axis.

(As Ken also follows up to remark.)
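
To make the "still open" idea concrete, here is a minimal sketch in
Python rather than XSLT, using the standard library's SAX parser. It is
only an illustration of the bookkeeping, not how any particular XSLT
processor is implemented; the class name, the function name and the
sample document are all made up for the example.

```python
import xml.sax


class OpenElementTracker(xml.sax.ContentHandler):
    """Remembers only the elements that are currently open."""

    def __init__(self, target):
        super().__init__()
        self.target = target
        self.open_elements = []    # (name, attributes) of each open element, root first
        self.ancestors_seen = None

    def startElement(self, name, attrs):
        if name == self.target and self.ancestors_seen is None:
            # The ancestors are exactly the elements still on the stack,
            # so no backing up in the input is needed to read them.
            self.ancestors_seen = list(self.open_elements)
        self.open_elements.append((name, dict(attrs.items())))

    def endElement(self, name):
        # A closed element is forgotten; it now lies on the 'preceding' axis.
        self.open_elements.pop()


def ancestors_of_first(xml_text, target):
    """Return (name, attributes) for each ancestor of the first <target> element."""
    handler = OpenElementTracker(target)
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.ancestors_seen


doc = ('<book id="b1"><chapter n="1"><para/></chapter>'
       '<chapter n="2"><note/></chapter></book>')
print(ancestors_of_first(doc, "note"))
```

When the parser reaches the note element, the stack holds the book and
the second chapter. The first chapter has already been closed and
popped, which is the sense in which a streaming processor cannot go
back to it.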

>> Or, perhaps my XSLT program can access ancestor nodes because, as the XSLT
>> processor descends the XML tree it keeps a record of each node through which
>> it descends (the node's name, its attributes, its namespaces). Yes, that
>> must be what the XSLT processor does. Suppose that my XML tree is very deep,
>> then the XSLT processor will have to remember a lot of stuff, right? In the
>> extreme case, every node in the XML document has no siblings, just one
>> child. Thus, the XSLT processor would have to remember the entire XML
>> document, right?
> I think you are putting too much into "remembering" for ancestors.  And XML
> documents are typically flat (though my course material happens to go quite
> deep).

I think it would be very interesting to see a survey of how deep XML
documents go in the wild. Except for pathological cases, I think they
would rarely go beyond 20 levels deep. Of course this will vary a great deal
by document type.

Ken, what does "quite deep" mean in your case?
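
For anyone who wants to check their own documents, a rough sketch of
how the depth could be measured, again in Python and again in a
streaming fashion. The function name max_element_depth is hypothetical,
and the count covers element nesting only (the root element counts as
depth 1).

```python
import io
import xml.etree.ElementTree as ET


def max_element_depth(source):
    """Return the deepest element nesting level found in an XML source."""
    depth = 0
    deepest = 0
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            depth += 1
            deepest = max(deepest, depth)
        else:
            depth -= 1
            elem.clear()  # drop the finished element's content
    return deepest


sample = io.BytesIO(b"<a><b><c/></b><d/></a>")
print(max_element_depth(sample))
```

The source argument can be a file name or a binary file object, so the
same function could be pointed at real files to get a feel for typical
depths in a given document type.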

> I think streaming is a benefit for flatter XML documents ... I certainly
> would not think that a skinny and deep XML document with only one leaf node
> and the thousands of branches would be seen in the wild, nor would it
> benefit from streaming.

Or one leaf on one very long branch :-> if you prefer.

This is related to the rule of thumb that streaming will help with
memory management, but will not (perhaps contrary to expectations)
have much impact on processing speed. Streaming is useful when
branches of the document can be processed without reference to other
branches. The smaller your branches (each of which offers a discrete
processing context), the more you benefit. Note that in this way of
thinking, every "branch" goes all the way to the root: its ancestors
stay open for as long as the branch is being processed.
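
A small sketch of branch-at-a-time processing, written in Python with
ElementTree's iterparse rather than in XSLT. The orders/order/item
vocabulary and the order_totals function are invented for the example.
Each order is handled as soon as it closes, without reference to any
other order, and its content is then discarded.

```python
import io
import xml.etree.ElementTree as ET


def order_totals(source):
    """Yield (order id, total price) for each <order>, one branch at a time."""
    for event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "order":
            # Everything needed is inside this one branch.
            total = sum(float(item.get("price")) for item in elem.iter("item"))
            yield elem.get("id"), total
            elem.clear()  # release the branch's content before moving on


data = io.BytesIO(
    b'<orders>'
    b'<order id="a"><item price="2.50"/><item price="1.50"/></order>'
    b'<order id="b"><item price="10.00"/></order>'
    b'</orders>'
)
for order_id, total in order_totals(data):
    print(order_id, total)
```

The smaller each order is, the less has to be held at any one time. If
a total depended on items in a different order, this approach would no
longer work without remembering more.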

Cheers, Wendell

Wendell Piez |
XML | XSLT | electronic publishing
Eat Your Vegetables
