Subject: RE: [xsl] optimizing the hell out of XSLT :-) From: "Michael Kay" <michael.h.kay@xxxxxxxxxxxx> Date: Thu, 12 Dec 2002 10:45:58 -0000 |
> Hi Michael and other XSLT engine builders, > > among all this lazy evaluation and optimization how about > piping SAXON events right into templates building only > partial trees? Suppose a transform doesn't use global forward > access with XPath but only accesses the local area in the > current node and nodes seen before. Could be a bit faster > that way, and more memory efficient? Are you guys thinking of > doing this? Xalan has a mode where the XSLT transformer and the parser work in parallel, with Xalan starting to transform the tree before Xerces has finished building it. But the real problem is knowing when it's safe to discard parts of the tree that have already been processed, knowing that they won't be needed again. In principle it's possible for an optimizer to statically recognize stylesheets that would allow this. I have always suspected, though, that not many real stylesheets would satisfy the criteria. Optimizations of this kind tend to be pretty fragile. In early releases of Saxon, there was a serial processing mode. Maintaining two very different modes of processing in one product greatly distorted the structure of the code and reduced the opportunity to implement other optimizations, so in the end I decided to get rid of it. In the light of this experience, I've argued in the past that serial transformation requires a different processing model and a different language. There's now an effort going on to develop such a language, called STX: see http://stx.sourceforge.net > > I tried the preview mode in Saxon and that is nice in some > special cases, certainly. But think this could be done > generally. I'm not telling any news I guess. > > As for a related matter, how about piping data through > variables? Consider this: > > <xsl:variable name="intermediary"> > <xsl:apply-templates mode="preprocess" select="."/> > </xsl:variable> <xsl:apply-templates select="$intermediary"/> > > basically from what I understand about Saxon internals, the > variable body is evaluated by setting the Outputter/Emitter > to the TinytreeBuilder, then the whole thing gets buffered in > a new tree. When that is finished, the second apply-templates is run. > > Now, with SAX one cound pipe events right from the result of > applying . in preprocess mode to the next apply-templates. > Again, if there is no forward searching XPath it should go > right through. > > One notch more: even if there is XPath trying to look ahead a > little bit, one could wait until enough of a tree has been > built. Even if an XPath expression does look ahead through > the whole document (e.g., select="//something") one could > apply that same lazy-evaluation principle (the SequenceIntent > class is already a start.) Once again, I think that there's no real saving, at least on a single processor, unless you can also find a way to discard parts of the tree after they have been processed. > > Finally, finally, one could have the various apply-template > jobs run in different threads on different CPUs, even through > a network on different machines. > You've got to weigh up several engineering decisions here: there's a coordination overhead as soon as you use multiple threads, and there's a cost of complexity in the code. Every optimization introduces new bugs, and optimizations involving multithreading are particularly difficult to test thoroughly (and to debug) because multithreading introduces an element of non-determinancy. One of the reasons that "one-man" efforts like xt, saxon, and jd.xslt have proved so effective is that it's much easier to enforce a "keep it simple" discipline when you are in sole charge of the project. Michael Kay Software AG home: Michael.H.Kay@xxxxxxxxxxxx work: Michael.Kay@xxxxxxxxxxxxxx XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
Current Thread |
---|
|
<- Previous | Index | Next -> |
---|---|---|
[xsl] optimizing the hell out of XS, Gunther Schadow | Thread | Re: [xsl] optimizing the hell out o, Kevin Jones |
RE: [xsl] selecting attribute name, TSchutzerWeissmann | Date | RE: [xsl] selecting attribute name, Michael Kay |
Month |