Subject: RE: [xsl] Does <xsl:copy> use a lot of memory? Is there an alternative that is more efficient?
From: "Costello, Roger L." <costello@xxxxxxxxx>
Date: Mon, 3 Sep 2012 13:57:13 +0000

Michael Kay wrote:

> In Saxon, and I suspect in most processors, no memory is used for the
> result tree provided that the transformation is writing directly to a
> serializer.

So if I use xsl:copy and the results of the copy are immediately output, then there is little or no memory consumption. Yes?

However, in my situation I need to store the results of xsl:copy into a variable. Then I process the variable. That processing also uses xsl:copy. I put those results into another variable. And again and again. So in my situation is xsl:copy consuming lots of memory? In other words, if I don't immediately output the results of xsl:copy, then memory consumption grows and grows. Yes?

/Roger

-----Original Message-----
From: Michael Kay [mailto:mike@xxxxxxxxxxxx]
Sent: Sunday, September 02, 2012 10:31 AM
To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
Subject: Re: [xsl] Does <xsl:copy> use a lot of memory? Is there an alternative that is more efficient?

Memory is used for the source document and for intermediate variables. In Saxon, and I suspect in most processors, no memory is used for the result tree provided that the transformation is writing directly to a serializer. Intrinsically, all xsl:copy has to do is to send two events - startElement and endElement - to the serializer.

I would strongly suspect that the out of memory error occurs during building of the source tree, and will happen whatever transformation you run. For a 370Mb input document, you should probably allocate at least 2Gb of memory, preferably more.

Michael Kay
Saxonica

On 02/09/2012 13:47, Costello, Roger L. wrote:
> Hi Folks,
>
> Does <xsl:copy> use a lot of memory?
>
> Is there an alternative that is more efficient?
>
> Consider this problem. I have an XML document in which some elements have an id attribute and others have an idref attribute. If an element A references element B, then I want to embed B inside A.
>
> Example: I want to convert this:
>
> <Test>
>     <A idref="b" />
>     <B id="b" />
> </Test>
>
> to this:
>
> <Test>
>     <A>
>         <B id="b" />
>     </A>
>     <B id="b" />
> </Test>
>
> Notice that A references B, and after processing B is nested inside A.
>
> Here's a template that handles elements with a reference:
>
> <xsl:key name="ids" match="*[@id]" use="@id"/>
>
> <xsl:template match="*[@idref]">
>
>     <xsl:variable name="refed-element" select="key('ids', @idref)"/>
>
>     <xsl:copy>
>         <xsl:copy-of select="@* except @idref" />
>         <xsl:sequence select="$refed-element" />
>     </xsl:copy>
>
> </xsl:template>
>
> The complete program is below.
>
> It works fine if:
>
> (a) The XML document is small.
> (b) I don't have to repeat this embedding process too many times.
>
> However, such is not the case. I am dealing with an XML document that is 370 MB in size and has tens of thousands of references. And I have to repeat the embedding process multiple times.
>
> Saxon gives me an "out of memory error."
>
> I suspect the reason for this is due to the <xsl:copy> command. I believe it is making new copies, thereby consuming lots of memory. True?
>
> So, is there an alternative to <xsl:copy> that is more efficient?
>
> Is there a way to express the above template rule that is more efficient?
>
> /Roger
> -----------------------------------------------------------------------------------------
> <?xml version="1.0" encoding="UTF-8"?>
> <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>                 exclude-result-prefixes="#all"
>                 version="2.0">
>
>     <xsl:output method="xml" />
>
>     <xsl:key name="ids" match="*[@id]" use="@id"/>
>
>     <xsl:template match="*[@idref]">
>
>         <xsl:variable name="refed-element" select="key('ids', @idref)"/>
>
>         <xsl:copy>
>             <xsl:copy-of select="@* except @idref" />
>             <xsl:sequence select="$refed-element" />
>         </xsl:copy>
>
>     </xsl:template>
>
>
>     <xsl:template match="node()">
>
>         <xsl:copy>
>             <xsl:copy-of select="@*"/>
>             <xsl:apply-templates />
>         </xsl:copy>
>
>     </xsl:template>
>
> </xsl:stylesheet>
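
For illustration only, here is a minimal sketch of the multi-pass pattern described at the top of this message, where the result of one embedding pass is stored in a variable and then processed again. The mode name "embed" and the variable name "pass1" are assumptions made for this sketch and do not appear in the original stylesheet. Per the reply quoted above, memory is used for intermediate variables, so the temporary tree held in $pass1 is the part that would be expected to cost memory; the last pass is not stored and can go straight to the serializer.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                exclude-result-prefixes="#all"
                version="2.0">

    <xsl:output method="xml" />

    <xsl:key name="ids" match="*[@id]" use="@id"/>

    <xsl:template match="/">
        <!-- Pass 1: the result is kept as a temporary tree in a variable
             (hypothetical name "pass1"), so it occupies memory. -->
        <xsl:variable name="pass1">
            <xsl:apply-templates select="node()" mode="embed"/>
        </xsl:variable>

        <!-- Pass 2: re-process the temporary tree. This result is not
             stored in a variable; it is written to the output. -->
        <xsl:apply-templates select="$pass1/node()" mode="embed"/>
    </xsl:template>

    <!-- Same rule as in the original stylesheet, placed in a mode
         (hypothetical name "embed") so it can be applied per pass.
         key() searches the document that contains the context node,
         so in pass 2 it looks up ids inside the temporary tree. -->
    <xsl:template match="*[@idref]" mode="embed">
        <xsl:copy>
            <xsl:copy-of select="@* except @idref" />
            <xsl:sequence select="key('ids', @idref)" />
        </xsl:copy>
    </xsl:template>

    <!-- Identity-style rule for everything else. -->
    <xsl:template match="node()" mode="embed">
        <xsl:copy>
            <xsl:copy-of select="@*"/>
            <xsl:apply-templates mode="embed"/>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>
```

In this sketch the second pass matters when an embedded element itself carries an idref: the copy embedded in pass 1 keeps that attribute, and pass 2 resolves it against the temporary tree. Each further pass written this way would add another variable and another in-memory tree.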