Subject: Re: [xsl] An efficient XSLT program that searches a large XML document for all occurrences of a string?
From: "Martin Honnen martin.honnen@xxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Thu, 2 May 2024 13:35:22 -0000
Hi Folks,

I have an XSLT program that locates all leaf elements which have the string value 'DNKK'. My program outputs the element and the name of its parent:

<xsl:template match="/">
  <results>
    <xsl:for-each select="//*[not(*)][. eq 'DNKK']">
      <result>
        <xsl:sequence select="."/>
        <parent><xsl:value-of select="name(..)"/></parent>
      </result>
    </xsl:for-each>
  </results>
</xsl:template>

The input XML document is large, nearly 5GB.

When I run my program SAXON throws the OutOfMemoryError message shown below.

To solve the OutOfMemoryError I could add to my heap space (-Xmx) when I invoke Java. But I wonder if there is a way to write my program so that it is more efficient (i.e., doesn't require so much memory)?

Saxon EE is the only XSLT 3 processor implementing streaming, so there you could try:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="3.0"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  exclude-result-prefixes="#all"
  expand-text="yes">

  <!-- Assumed declaration: the template below references $search-term. -->
  <xsl:param name="search-term" as="xs:string" select="'DNKK'"/>

  <xsl:mode streamable="yes" on-no-match="shallow-skip" use-accumulators="#all"/>

  <!-- Collects the text seen since the most recent element start tag. -->
  <xsl:accumulator name="string-value" as="xs:string?" initial-value="()" streamable="yes">
    <xsl:accumulator-rule match="*" select="()"/>
    <xsl:accumulator-rule match="text()" select="$value || ."/>
  </xsl:accumulator>

  <!-- After an element's content has been read, compare the accumulated text with the search term. -->
  <xsl:template match="*">
    <xsl:apply-templates/>
    <xsl:variable name="string-value" select="accumulator-after('string-value')"/>
    <xsl:if test="not(empty($string-value)) and $string-value = $search-term">
      <result>
        <xsl:copy>{$string-value}</xsl:copy>
        <parent>{node-name(..)}</parent>
      </result>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
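
The streaming stylesheet in this reply writes a flat sequence of <result> elements, whereas the original program wrapped them in a <results> root. A minimal sketch of how that wrapper might be restored, assuming the accumulator and the match="*" template from the reply are kept unchanged (the match="/" template here is an addition, not part of the posted code):

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="3.0">

  <!-- Same mode settings as in the reply; add use-accumulators="#all"
       once the accumulator declaration is included. -->
  <xsl:mode streamable="yes" on-no-match="shallow-skip"/>

  <!-- The xsl:accumulator and the match="*" template from the reply
       would be placed here unchanged. -->

  <!-- Hypothetical addition: wrap everything the streamed pass emits
       in a single root element, as the original program did. -->
  <xsl:template match="/">
    <results>
      <xsl:apply-templates/>
    </results>
  </xsl:template>

</xsl:stylesheet>
```

The template body contains a single xsl:apply-templates over the children of the document node, so it makes only one downward pass and should remain streamable. Without such a wrapper the serialized result has no single root element.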