Re: Efficient Stylesheets for reordering

From: Joe English <jenglish@xxxxxxxxxxxxx>
Date: Thu, 09 Nov 2000 12:10:05 -0800

Jose Alberto Fernandez wrote:

> I would like to have some discussion, at the semantic level, on identifying
> a subset of the XSLT language that can be processed using a DFS approach.
> Can a useful subset of the language be identified?

One such restriction would be that:

    (1) <xsl:apply-templates>, <xsl:for-each>, and <xsl:call-template>
        may only use the self, child, descendant-or-self, descendant,
        and attribute axes in the 'select' attribute;

    (2) The 'match' attribute of <xsl:template> may only use the
        'self' and 'attribute' axes inside predicates ('match'
	patterns are already subject to restriction (1) outside
	of predicates);

    (3) The key() and id() functions are not allowed (possibly
        plus a few others that I've missed; the basic idea is that
	XPath expressions are restricted so that they only select
	nodes from the subtree of the current node).

    (4) <xsl:template>, <xsl:for-each>, and <xsl:copy> instructions
        may contain at most one <xsl:for-each>, <xsl:apply-templates>,
        <xsl:call-template>, or <xsl:copy> instruction (although
        <xsl:for-each> and <xsl:copy> could contain a (single) nested
        <xsl:for-each> or <xsl:copy>).

    (5) Some sort of restriction on <xsl:copy-of>, <xsl:with-param>,
        <xsl:variable>, et cetera -- I haven't completely thought this
        through yet :-)
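
As a rough illustration of restrictions (1) and (3), a stylesheet
could be screened with something like the Python sketch below.  It
is only a textual heuristic, not an XPath parser: it scans the
'select', 'match', and 'test' attributes for axes that leave the
current subtree, for the '..' abbreviation, and for calls to key()
or id().  It does not attempt restrictions (2), (4), or (5), and the
names FORBIDDEN, violations, and SAMPLE are invented for the example.

```python
import re
import xml.etree.ElementTree as ET

XSL = "http://www.w3.org/1999/XSL/Transform"

# Heuristic only: axes that look outside the current node's subtree,
# the ".." abbreviation, and calls to key() or id().
FORBIDDEN = re.compile(
    r"(?<![\w-])(?:parent|ancestor|ancestor-or-self|following|following-sibling|"
    r"preceding|preceding-sibling)\s*::"
    r"|\.\."
    r"|(?<![\w-])(?:key|id)\s*\("
)


def violations(stylesheet_text):
    """Return (element, attribute, expression) triples that look suspect."""
    root = ET.fromstring(stylesheet_text)
    found = []
    for elem in root.iter():
        # Only XSLT instructions are inspected, not literal result elements.
        if not elem.tag.startswith("{" + XSL + "}"):
            continue
        local = elem.tag.split("}", 1)[1]
        for attr in ("select", "match", "test"):
            expr = elem.get(attr)
            if expr is not None and FORBIDDEN.search(expr):
                found.append((local, attr, expr))
    return found


# Hypothetical stylesheet: the first template stays inside the subtree,
# the other two do not.
SAMPLE = """<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="section">
    <xsl:apply-templates select="title"/>
  </xsl:template>
  <xsl:template match="para">
    <xsl:apply-templates select="../title"/>
  </xsl:template>
  <xsl:template match="xref">
    <xsl:value-of select="id(@target)/title"/>
  </xsl:template>
</xsl:stylesheet>"""

for item in violations(SAMPLE):
    print(item)
```

Because the scan is purely textual it will also flag '..' inside a
string literal, so treat a hit as a prompt to look closer rather
than as a verdict.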

With these restrictions, it should be possible to execute an XSL
transformation keeping only the current node and its ancestors
in memory (the "SAX filter and a stack" design pattern).
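
A minimal sketch of that pattern, using Python's xml.sax module as
the SAX-like package (an arbitrary choice for illustration).  The
handler keeps nothing but a list of the names of the current element
and its ancestors.  The element names (section, title, para) and the
HeadingFilter and transform names are hypothetical.

```python
import io
import xml.sax
from xml.sax.saxutils import escape


class HeadingFilter(xml.sax.ContentHandler):
    """Turn <title> into <hN>, where N is the <section> nesting depth."""

    def __init__(self, out):
        super().__init__()
        self.out = out
        self.stack = []  # names of the current element and its ancestors

    def startElement(self, name, attrs):
        self.stack.append(name)
        if name == "title":
            self.out.write("<h%d>" % self.stack.count("section"))
        elif name == "para":
            self.out.write("<p>")

    def characters(self, content):
        # Emit text only directly inside elements we translate.
        if self.stack and self.stack[-1] in ("title", "para"):
            self.out.write(escape(content))

    def endElement(self, name):
        if name == "title":
            self.out.write("</h%d>\n" % self.stack.count("section"))
        elif name == "para":
            self.out.write("</p>\n")
        self.stack.pop()


def transform(xml_text):
    out = io.StringIO()
    xml.sax.parseString(xml_text.encode("utf-8"), HeadingFilter(out))
    return out.getvalue()


print(transform(
    "<doc><section><title>A</title><para>x &amp; y</para>"
    "<section><title>B</title></section></section></doc>"
))
```

Memory use is bounded by the document depth, not its size: each
start tag pushes one name and each end tag pops it again.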

With a bit of preprocessing you could also convert all the
<xsl:template> match patterns into a finite state automaton
and remove the need for even an ancestor stack, though I'd
guess that the automaton would require much more memory than
the ancestor list would.

With lazy evaluation, restriction (4) could be lifted
(and restriction (2) loosened) with minimal impact
on the space requirements -- laziness would ensure
that the processor only looks as far ahead as it
needs to in order to produce the output, and the garbage
collector would automatically clean up the trailing
context when it's no longer needed.
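
Python has no lazy evaluation in the functional-language sense, but
a generator gives a loose approximation of the demand-driven idea:
the consumer pulls items one at a time, and content that has been
consumed is released.  The sketch below uses ElementTree's iterparse;
the lazy_items name and the log/entry element names are made up.
Note that iterparse reads its input in chunks, so for a small
document the parser may already have read everything by the time the
first item is delivered.

```python
import io
import itertools
import xml.etree.ElementTree as ET


def lazy_items(source, tag):
    """Yield the text of each <tag> element once its end tag has been parsed."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            yield elem.text or ""
            # Drop the element's content once the consumer has moved on;
            # the emptied element itself stays attached to its parent.
            elem.clear()


data = io.BytesIO(b"<log><entry>a</entry><entry>b</entry><entry>c</entry></log>")

# The consumer decides how far to look: only two items are requested.
first_two = list(itertools.islice(lazy_items(data, "entry"), 2))
print(first_two)
```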

[ ... ]


> If XSLT is the wrong tool, are there any other XML transformation languages,
> in the works, that may be better suited for this kind of work?

Not just in the works: dozens are already implemented!
For space-efficient processing, Omnimark in particular comes
to mind.  There's also the "roll-your-own transformation
language" using <scripting language of your choice> with
<SAX-like package of your choice> approach.  Tcl/TclXML
is my personal favorite -- Tcl makes it easy to invent new
language constructs so transformations can be specified
declaratively at a higher level.
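
To give the flavor of the roll-your-own approach, here is a small
table-driven sketch.  It is written in Python with xml.sax rather
than Tcl/TclXML, purely for illustration, and the rule() helper, the
RuleDriver class, and the element names are all invented.  The
transformation itself is just the list of rule() declarations; the
driver underneath is generic.

```python
import io
import xml.sax
from xml.sax.saxutils import escape

RULES = {}  # element name -> (text written at start tag, text written at end tag)


def rule(name, before="", after=""):
    """Declare how one element type is rendered."""
    RULES[name] = (before, after)


# The "stylesheet": one declaration per element type.
rule("title", "<h1>", "</h1>\n")
rule("para", "<p>", "</p>\n")
rule("em", "<i>", "</i>")


class RuleDriver(xml.sax.ContentHandler):
    """Generic SAX handler that looks each element up in a rule table."""

    def __init__(self, rules, out):
        super().__init__()
        self.rules = rules
        self.out = out

    def startElement(self, name, attrs):
        # Elements without a rule contribute no markup of their own.
        self.out.write(self.rules.get(name, ("", ""))[0])

    def endElement(self, name):
        self.out.write(self.rules.get(name, ("", ""))[1])

    def characters(self, content):
        self.out.write(escape(content))


def render(xml_text, rules=RULES):
    out = io.StringIO()
    xml.sax.parseString(xml_text.encode("utf-8"), RuleDriver(rules, out))
    return out.getvalue()


print(render("<doc><title>Hi</title><para>a <em>b</em> c</para></doc>"))
```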


--Joe English

  jenglish@xxxxxxxxxxxxx


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

