Subject: Re: [xsl] An efficient XSLT program that searches a large XML document for all occurrences of a string?
From: "Bauman, Syd s.bauman@xxxxxxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Thu, 2 May 2024 22:49:52 -0000
I do not know any answer to the question, and without the data (and perhaps some further information about your system, like default heap space) I cannot reproduce the problem. But my first instinct, for no intelligent reason whatsoever, is to use template application for flow of control:

    <xsl:template match="/">
      <xsl:apply-templates select="//*[ not(*) ][. eq 'DNKK']"/>
    </xsl:template>

    <xsl:template match="*">
      <result>
        <xsl:sequence select="."/>
        <parent><xsl:value-of select="name(..)"/></parent>
      </result>
    </xsl:template>

For all I know that is worse, not better; but it is what I would try first. I also might try things like

* using <xsl:copy> instead of <xsl:sequence>
* putting the parent name on an attribute of <result> instead of as a child element
* actually selecting text nodes, rather than elements
* learning streaming and using EE (as already suggested)
* divide-and-conquer: on a first pass knock out portions of the tree that are irrelevant or divide input file into several smaller pieces

________________________________

> Hi Folks,
>
> I have an XSLT program that locates all leaf elements which have the string value 'DNKK'. My program outputs the element and the name of its parent:
>
> <xsl:template match="/">
>   <results>
>     <xsl:for-each select="//*[not(*)][. eq 'DNKK']">
>       <result>
>         <xsl:sequence select="."/>
>         <parent><xsl:value-of select="name(..)"/></parent>
>       </result>
>     </xsl:for-each>
>   </results>
> </xsl:template>
>
> The input XML document is large, nearly 5GB.
>
> When I run my program SAXON throws the OutOfMemoryError message shown below.
>
> To solve the OutOfMemoryError I could add to my heap space (-Xmx) when I invoke Java. But I wonder if there is a way to write my program so that it is more efficient (i.e., doesn't require so much memory)?
>
> Can you use Saxon EE so that it is worth pondering XSLT 3 with streaming?
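
For what it is worth, here is a rough, untested sketch of how the "select text nodes" and "streaming" ideas might be combined in XSLT 3.0. It assumes a processor that supports streaming (such as Saxon EE, as suggested in the thread), and it changes the output: the names of the matching element and of its parent are written as attributes called element and parent, which are my own invention, not part of the original program. It also matches any text node equal to 'DNKK', including one in mixed content, so it is not an exact equivalent of the leaf-element test not(*).

```xml
<xsl:stylesheet version="3.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Ask for streamed processing so the whole tree is not built in
       memory; nodes with no matching template are skipped, but their
       children are still visited. -->
  <xsl:mode streamable="yes" on-no-match="shallow-skip"/>

  <xsl:template match="/">
    <results>
      <xsl:apply-templates/>
    </results>
  </xsl:template>

  <!-- Match the text node rather than the element: a text node has no
       children, so its value can be tested without looking ahead.
       name(..) is the element holding the text; name(../..) is that
       element's parent. The attribute names are hypothetical. -->
  <xsl:template match="text()[. eq 'DNKK']">
    <result element="{name(..)}" parent="{name(../..)}">
      <xsl:value-of select="."/>
    </result>
  </xsl:template>

</xsl:stylesheet>
```

Whether a particular processor accepts this as streamable, and whether it actually cures the OutOfMemoryError on the 5GB input, would need to be checked against the real data.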