Subject: [xsl] Different performance of nodesets created in different ways
From: "TAYLOR Peter J (AXA-I)" <Peter.J.Taylor@xxxxxxxxxxxxxxxxxxx>
Date: Fri, 1 Feb 2008 10:29:38 -0000

We recently experienced an out-of-memory error with an XSLT 1.0 stylesheet which used the Xalan nodeset() function to convert an <xsl:variable> with a non-empty body from a result tree fragment into a nodeset.

My test case data looks like this:

    <a>
      <b>g-day<c>hello</c><c>hello</c><c>hello</c></b>
      <b>g-day<c>hello</c><c>hello</c><c>hello</c></b>
      ... several thousand more <b>...</b> elements like this ...
      <b>g-day<c>hello</c><c>hello</c><c>hello</c></b>
      <b>g-day<c>hello</c><c>hello</c><c>hello</c></b>
    </a>

My (deeply flawed) test-case stylesheet originally looked like this:

    <?xml version='1.0'?>
    <xsl:stylesheet version="1.0"
        xmlns:xalan="http://xml.apache.org/xalan"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output method="text"/>
      <xsl:template match="/">
        <xsl:call-template name="bigMemoryUsage"/>
      </xsl:template>
      <xsl:template name="bigMemoryUsage">
        <xsl:variable name="big">
          <!-- result tree fragment -->
          <xsl:copy-of select="/a/b"/>
        </xsl:variable>
        <xsl:for-each select="/a/b">
          <xsl:variable name="i" select="position()"/>
          <xsl:value-of select="xalan:nodeset($big)/b[position()=$i]"/>
        </xsl:for-each>
      </xsl:template>
    </xsl:stylesheet>

I then changed the <xsl:variable> to get its value from the select="..." attribute, i.e. to be of type node-set, and removed the call to xalan:nodeset():

    <?xml version='1.0'?>
    <xsl:stylesheet version="1.0"
        xmlns:xalan="http://xml.apache.org/xalan"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output method="text"/>
      <xsl:template match="/">
        <xsl:call-template name="smallMemoryUsage"/>
      </xsl:template>
      <xsl:template name="smallMemoryUsage">
        <!-- nodeset -->
        <xsl:variable name="small" select="/a/b"/>
        <xsl:for-each select="/a/b">
          <xsl:variable name="i" select="position()"/>
          <xsl:value-of select="$small/b[position()=$i]"/>
        </xsl:for-each>
      </xsl:template>
    </xsl:stylesheet>

With this change my test case used a quarter as much memory. The two versions of the stylesheet process the same nodes (or a copy of the same nodes) and produce the same output.

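Before comparing timings, it may help to check what each path expression selects. The sketch below models the two variables outside XSLT, using Python's xml.etree.ElementTree purely as a stand-in (it is not Xalan). It assumes the XSLT 1.0 rules that a variable bound with select="/a/b" holds the <b> elements themselves, while a variable with a body holds a fragment whose single root node has the copied <b> elements as children. The names make_doc, small, big_root and "fragment-root" are hypothetical, chosen only for illustration.

```python
import copy
import xml.etree.ElementTree as ET


def make_doc(n):
    """Build <a> holding n <b>g-day<c>hello</c>...</b> children (shape of the test data)."""
    a = ET.Element("a")
    for _ in range(n):
        b = ET.SubElement(a, "b")
        b.text = "g-day"
        for _ in range(3):
            ET.SubElement(b, "c").text = "hello"
    return a


a = make_doc(5)

# Model of <xsl:variable name="small" select="/a/b"/>: the <b> elements themselves.
small = a.findall("b")

# Model of xalan:nodeset($big): one root node whose children are copies of the <b> elements.
big_root = ET.Element("fragment-root")  # hypothetical stand-in for the fragment's root node
big_root.extend([copy.deepcopy(b) for b in small])

# $small/b : <b> children of each node in $small
small_step = [child for b in small for child in b.findall("b")]

# xalan:nodeset($big)/b : <b> children of the fragment root
big_step = big_root.findall("b")

print(len(small_step), len(big_step))
```

If this model matches the processor's behaviour, $small/b selects nothing in the stylesheet as posted, because the <b> elements have no <b> children, so the two versions may not be doing equivalent work. Under the same assumption, $small[position()=$i] would be the expression that addresses the i-th node of the node-set directly.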
Unfortunately the "small memory" version of the stylesheet ran for four times as long as the "big memory" version.

When I experimentally changed the axis in the <xsl:value-of> from child to descendant:

    <?xml version='1.0'?>
    <xsl:stylesheet version="1.0"
        xmlns:xalan="http://xml.apache.org/xalan"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output method="text"/>
      <xsl:template match="/">
        <xsl:call-template name="smallMemoryUsage"/>
      </xsl:template>
      <xsl:template name="smallMemoryUsage">
        <xsl:variable name="small" select="/a/b"/>
        <xsl:for-each select="/a/b">
          <xsl:variable name="i" select="position()"/>
          <!-- descendant axis -->
          <xsl:value-of select="$small//b[position()=$i]"/>
        </xsl:for-each>
      </xsl:template>
    </xsl:stylesheet>

the "small memory" stylesheet took 5 times as long again to run.

However, when I made the corresponding change to the "big memory" stylesheet:

    <?xml version='1.0'?>
    <xsl:stylesheet version="1.0"
        xmlns:xalan="http://xml.apache.org/xalan"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output method="text"/>
      <xsl:template match="/">
        <xsl:call-template name="bigMemoryUsage"/>
      </xsl:template>
      <xsl:template name="bigMemoryUsage">
        <xsl:variable name="big">
          <xsl:copy-of select="/a/b"/>
        </xsl:variable>
        <xsl:for-each select="/a/b">
          <xsl:variable name="i" select="position()"/>
          <!-- descendant axis -->
          <xsl:value-of select="xalan:nodeset($big)//b[position()=$i]"/>
        </xsl:for-each>
      </xsl:template>
    </xsl:stylesheet>

the "big memory" stylesheet ran in about the same time as before.

I then rewrote the "small memory" stylesheet like this:

    <?xml version='1.0'?>
    <xsl:stylesheet version="1.0"
        xmlns:xalan="http://xml.apache.org/xalan"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output method="text"/>
      <xsl:template match="/">
        <xsl:call-template name="smallMemoryUsage"/>
      </xsl:template>
      <xsl:template name="smallMemoryUsage">
        <xsl:variable name="small" select="/a/b"/>
        <xsl:for-each select="$small/b">
          <xsl:value-of select="."/>
        </xsl:for-each>
      </xsl:template>
    </xsl:stylesheet>

Having got rid of the silly position() predicate, the performance of the "small memory" stylesheet was about 100 times better. Making the same change to the "big memory" stylesheet improved performance by about 50 times. This presumably reflects the extra cost of using the Xalan nodeset() function.

I am now happy with the performance of the "small memory" stylesheet when it is written sensibly, but I do not really understand why doing silly processing against a nodeset created from a call to xalan:nodeset() seems to run about 20 times quicker than the same silly processing against a nodeset variable created by the select="..." attribute of <xsl:variable>.

Is it something to do with whether my nodeset variable uses a NodeIterator, NodeList or NodeVector under the bonnet? Am I accessing the nodes sequentially in one case and positionally in the other? Am I missing something fundamental about how predicates work?

I have been running the above through Stylus Studio using Xalan 2.7.0, and through an IBM Java 1.5 JVM also using Xalan 2.7.0. My version of XSLT is 1.0. I have allocated my JVM between 64 MB and 500 MB of memory at various stages of testing. The production IBM Java 1.5 JVM which ran out of memory had 1.5 GB and was running Java code compiled at 1.4.2.

Any help would be greatly appreciated!

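One way to picture why dropping the position() predicate can help so much is a simple cost model. The sketch below assumes that each [position()=$i] lookup walks the node list from the start until it reaches item i; whether Xalan 2.7.0 actually evaluates the predicate that way is not established here, and the function names are hypothetical. It compares the number of node visits for a positional lookup inside a loop with a single sequential pass.

```python
def nth_by_scan(nodes, i, counter):
    """Return the i-th node (1-based) by walking from the start, counting each visit."""
    for pos, node in enumerate(nodes, start=1):
        counter["visits"] += 1
        if pos == i:
            return node
    return None


def positional_loop(nodes):
    """Model of for-each + [position()=$i]: one scan from the start per iteration."""
    counter = {"visits": 0}
    out = [nth_by_scan(nodes, i, counter) for i in range(1, len(nodes) + 1)]
    return out, counter["visits"]


def single_pass(nodes):
    """Model of for-each over the node-set with select=".": each node visited once."""
    visits = 0
    out = []
    for node in nodes:
        visits += 1
        out.append(node)
    return out, visits


nodes = [f"b{i}" for i in range(1, 1001)]
out_a, visits_a = positional_loop(nodes)
out_b, visits_b = single_pass(nodes)
print(out_a == out_b, visits_a, visits_b)
```

Under this assumption the positional loop makes 500,500 visits for 1,000 nodes against 1,000 for the single pass, so the gap grows with the size of the input. It is only a model of the sequential-versus-positional question raised above, not a measurement of either stylesheet.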
Pete Taylor