| Subject: [xsl] Re: Re: Re: Re: Unbounded element grouping/concatenation From: "Dimitre Novatchev" <dnovatchev@xxxxxxxxx> Date: Fri, 12 Dec 2003 11:50:33 +0100 | 
"Gupta, Raman K [CI]" <raman.k.gupta@xxxxxxxxxxxxx> wrote in message news:C0CB45C72DDE2A4FA32370AB65CAEC8F059673@xxxxxxxxxxxxxxxxxxxxxxxxxx [Nice timing results omitted] > So all of the methods except recursion exhibit exponential > behavior with an increase in continuation records except > recursion. > > So if there is a way to do solve this problem recursively > that would avoid the stack overflow errors (2.5.2 doesn't > even do you the courtesy of throwing an exception -- it > just dies in the middle of the transformation), that would > still be by far the best solution. > Hi Raman, I have both bad and good news for you. The Bad News: The recursive algorithm isn't the fastest. The Good News: I am enclosing here the code implementing the fastest (that I know of) algorithm, anf it is non-recursive. The idea is to get a string with the positions of all "record" nodes with type="normal" among all "record nodes", then for a record with type="normal" find its position in the string and also the position of the next record with type="normal" The difference in these two positions - 1 is the number of intermediate record nodes with type="continuation". Here's the XSLT code: <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" > <xsl:output omit-xml-declaration="yes" indent="yes"/> <xsl:variable name="vposArray"> <xsl:value-of select="'|'"/> <xsl:for-each select="/*/record"> <xsl:if test="@type = 'normal'"> <xsl:value-of select="concat(position(), '|')"/> </xsl:if> </xsl:for-each> </xsl:variable> <xsl:template match="@* | node()" name="identity"> <xsl:copy> <xsl:apply-templates select="@* | node()"/> </xsl:copy> </xsl:template> <xsl:template match="records"> <records> <xsl:apply-templates select="record"/> </records> </xsl:template> <xsl:template match="record"> <xsl:choose> <xsl:when test="not(@type='normal')"> <xsl:call-template name="identity"/> </xsl:when> <xsl:otherwise> <xsl:variable name="vPos" select="position()"/> <xsl:variable name="vposNext" select="substring-before( substring-after($vposArray, concat('|', position(), '|' ) ), '|' )"/> <xsl:variable name="vNumNested" select="$vposNext - position() - 1"/> <xsl:copy> <xsl:copy-of select="@* | node()"/> <xsl:if test="$vNumNested > 0"> <xsl:copy-of select= "following-sibling::record [position() <= $vNumNested]"/> </xsl:if> </xsl:copy> </xsl:otherwise> </xsl:choose> </xsl:template> <xsl:template match="record[not(@type='normal')]"/> </xsl:stylesheet> I also measured the time it takes with the three different algorithms (recursive, with keys, non-recursive) with two different sorce xml documents -- one having 1000 consecutive record nodes with type="continuation", the other with 2000 such nodes. They look like this: <records> <record n="1" type="normal"/> <record n="2" type="normal"/> <record n="3" type="continuation"/> <record n="4" type="continuation"/> <record n="5" type="continuation"/> <record n="6" type="continuation"/> <record n="7" type="continuation"/> .............................................................. <record n="1000" type="continuation"/> <record n="1001" type="continuation"/> <record n="1002" type="continuation"/> <record n="1003" type="normal"/> <record n="1004" type="normal"/> </records> and <records> <record n="1" type="normal"/> <record n="2" type="normal"/> <record n="3" type="continuation"/> <record n="4" type="continuation"/> <record n="5" type="continuation"/> <record n="6" type="continuation"/> <record n="7" type="continuation"/> .............................................................. <record n="2000" type="continuation"/> <record n="2001" type="continuation"/> <record n="2002" type="continuation"/> <record n="2003" type="normal"/> <record n="2004" type="normal"/> </records> I tested the following XSLT processors: MSXML4, Saxon6.5.3, JD1.5.2, XalanJ2.4.1, XalanC1.5 I also wanted to test a forth algorithm using generate-id(). It took MSXML4 233157 milliseconds on the 1000 nodes document and I had to kill it after 30 minutes with the 2000 nodes document. I did not attempt to test this with the rest of the XSLT processors. Here are the results: Method MSXML4 Saxon JD XalanJ XalanC ============================================================= Recursive: 1000 nodes 37 Stack Ovfl. 150 Stack Ovfl. 170 2000 nodes 519 Stack Ovfl. Stack Ovfl. Stack Ovfl. 621 Keys: 1000 nodes 44 1783 791 4677 1472 2000 nodes 2211 6890 2654 17466 5818 Non-Recursive: 1000 nodes 5.9 50 100 1152 10 2000 nodes 12.09 101 120 1342 40 The non-recursive algorithm exhibits linear behaviour with MSXML4 and Saxon and sub-linear! one with JD, XalanJ and XalanC. The recursive and key-based algorithm exhibit quadratic behaviour. XalanJ is clearly not in the same company as the rest of the tested processors. I could try still speeding up the non-recursive algorithm, by using a faster search than linear to find the position of a record node in the string with positions -- this will require that all positions must have the same (some maximum) length. Or I could record the positions in a node-set, for which binary search is straight-forward. In case you are still not satisfied with the speed of the non-recursive algorithm, just let me know :o) Hope this helped. ===== Cheers, Dimitre Novatchev. http://fxsl.sourceforge.net/ -- the home of FXSL XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
| Current Thread | 
|---|
| 
 | 
| <- Previous | Index | Next -> | 
|---|---|---|
| RE: [xsl] Re: Re: Re: Unbounded ele, Gupta, Raman K [CI] | Thread | [xsl] Re: Re: Re: Re: Unbounded ele, Dimitre Novatchev | 
| Re[2]: [xsl] How do you un-escape i, Arthur Maloney | Date | RE: [xsl] how to estimate speed of , Michael Kay | 
| Month |