Subject: RE: [xsl] <br/> to <p> and optimization
From: "Michael Kay" <mhk@xxxxxxxxx>
Date: Wed, 2 Jul 2003 21:41:40 +0100
Yes, your analysis of this is correct. It's the kind of expression that
some optimizers are likely to handle much better than others, so it may
be worth trying a couple of different XSLT processors; but in general
it's quite likely to have O(n^2) performance with respect to the number
of sibling nodes.

The other approach to this problem is the recursive walk through the
siblings. This is more likely to have linear performance, but with a
large number of siblings it can blow the stack size. The more recent
releases of Saxon use tail call optimization on call-template and
apply-templates which can eliminate this problem. (Some other processors
use it too, but not all.)

If you need to do this repeatedly it may be worth looking at non-XSLT
solutions. This kind of problem is often fairly easy to tackle with a
SAX filter, because it's purely linear and doesn't require much analysis
of the context. You could also look at the new STX tools for doing
serial transformations.

Alternatively you could try an XSLT 2.0 solution. Essentially the code
is very simple:

  <xsl:for-each-group select="*" group-starting-at="br">
    <p>
      <xsl:copy-of select="current-group()[not(self::br)]"/>
    </p>
  </xsl:for-each-group>

But I don't know how it will perform in Saxon - the grouping facilities
haven't received very much attention from a performance perspective yet.
If you get any data on this, I would love to know.

Michael Kay

>
> Hello,
>
> On the archives of this list I have found a solution to the
> problem of putting all elements between two <br/> elements
> into a <p> element:
> http://www.biglist.com/lists/xsl-list/archives/200101/msg00865.html
>
> However, this process takes a very long time (up to two minutes)
> for "big" files (over 100k) which have lots of brs, and I am
> looking for a way to optimize it.
>
> In fact my problem is I'm not sure I correctly understand the
> following line:
>
>   <xsl:variable name="content"
>       select="preceding-sibling::node()
>               [not($br-before) or
>                generate-id(preceding-sibling::br[1]) =
>                generate-id($br-before)]" />
>
> $br-before is the preceding <br/>:
>
>   <xsl:variable name="br-before"
>       select="preceding-sibling::br[1]" />
>
> So, for setting $content, do we mean that we test _all_ nodes
> before the current <br/>, and for each of them we test that they
> are not themselves the preceding <br/> (not($br-before)) and that
> they are actually after the same <br/> as the one located by
> $br-before?
>
> In that case obviously we test the same nodes many times: for
> every new <br/>, we want to add nodes that are before the current
> <br/> and after the preceding one, but we test again the nodes
> that are before the last <br/> up to the start of the containing
> element.
>
> Therefore what we need is a way to "stop" the selection once the
> current node that is being tested is in fact $br-before?
>
> Is this correct?
>
> Regards,
> Emmanuel Bégué

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
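
The SAX filter idea mentioned in the reply can be sketched roughly as follows. This is only an illustration, not code from the thread: it uses Python's standard xml.sax module, and the class name BrToPFilter, the helper br_to_p and the default container name "body" are all assumptions made for the example. Unlike the XSLT 2.0 snippet above, it also wraps text nodes and skips empty paragraphs. It makes a single pass over the events and keeps only a depth counter and a flag, which is why the cost stays linear in the size of the input.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLFilterBase, XMLGenerator
from xml.sax.xmlreader import AttributesImpl


class BrToPFilter(XMLFilterBase):
    """Wrap each run of content between <br/> children of `container` in <p>.

    `container` is a hypothetical element name; only <br/> elements that are
    direct children of the first-level container act as separators.
    """

    def __init__(self, parent, container="body"):
        super().__init__(parent)
        self.container = container
        self.depth = 0        # 0 = outside the container, 1 = directly inside
        self.p_open = False   # is a generated <p> currently open?

    def _open_p(self):
        if not self.p_open:
            super().startElement("p", AttributesImpl({}))
            self.p_open = True

    def _close_p(self):
        if self.p_open:
            super().endElement("p")
            self.p_open = False

    def startElement(self, name, attrs):
        if self.depth == 0:
            super().startElement(name, attrs)
            if name == self.container:
                self.depth = 1
            return
        if self.depth == 1 and name == "br":
            self._close_p()   # a separator ends the current paragraph
        else:
            if self.depth == 1:
                self._open_p()
            super().startElement(name, attrs)
        self.depth += 1

    def endElement(self, name):
        if self.depth == 0:
            super().endElement(name)
            return
        self.depth -= 1
        if self.depth == 0:   # leaving the container
            self._close_p()
            super().endElement(name)
        elif not (self.depth == 1 and name == "br"):
            super().endElement(name)   # separator <br/> elements are dropped

    def characters(self, content):
        # Whitespace-only text between paragraphs does not start a new <p>.
        if self.depth == 1 and content.strip():
            self._open_p()
        super().characters(content)


def br_to_p(xml_text, container="body"):
    """Run the filter over a string and return the serialized result."""
    out = io.StringIO()
    filt = BrToPFilter(xml.sax.make_parser(), container)
    filt.setContentHandler(XMLGenerator(out, encoding="utf-8"))
    filt.parse(io.StringIO(xml_text))
    return out.getvalue()
```

For example, br_to_p("<body>one<br/>two</body>") should yield an XML declaration followed by <body><p>one</p><p>two</p></body>. Namespace handling and nested containers are deliberately left out of this sketch.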