Re: [xsl] An efficient XSLT program that searches a large XML document for all occurrences of a string?

Subject: Re: [xsl] An efficient XSLT program that searches a large XML document for all occurrences of a string?
From: "Martin Honnen martin.honnen@xxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Thu, 2 May 2024 13:38:57 -0000
On 02/05/2024 15:35, Martin Honnen martin.honnen@xxxxxx wrote:

Saxon EE is the only XSLT 3 processor that implements streaming, so there you could try:


<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="3.0"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  exclude-result-prefixes="#all"
  expand-text="yes">

  <xsl:param name="search-term" as="xs:string" select="'DNKK'"/>

  <xsl:output indent="yes"/>

  <xsl:mode streamable="yes" on-no-match="shallow-skip"
    use-accumulators="#all"/>

  <!-- Reset at each element start, then concatenate the text nodes seen -->
  <xsl:accumulator name="string-value" as="xs:string?"
    initial-value="()" streamable="yes">
    <xsl:accumulator-rule match="*" select="()"/>
    <xsl:accumulator-rule match="text()" select="$value || ."/>
  </xsl:accumulator>

  <xsl:template match="*">
    <xsl:apply-templates/>
    <!-- Read the accumulated text once the element's content has been processed -->
    <xsl:variable name="string-value"
      select="accumulator-after('string-value')"/>
    <xsl:if test="not(empty($string-value)) and $string-value = $search-term">
      <result>
        <xsl:copy>{$string-value}</xsl:copy>
        <parent>{node-name(..)}</parent>
      </result>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>

Changing the accumulator and the template to


  <xsl:accumulator name="string-value" as="xs:string?"
    initial-value="()" streamable="yes">
    <xsl:accumulator-rule match="*" select="()"/>
    <xsl:accumulator-rule match="text()" select="$value || ."/>
    <xsl:accumulator-rule phase="end" match="*"
      select="if (not(empty($value)) and $value = $search-term) then $value else ()"/>
  </xsl:accumulator>

  <xsl:template match="*">
    <xsl:apply-templates/>
    <xsl:variable name="string-value"
      select="accumulator-after('string-value')"/>
    <xsl:if test="not(empty($string-value))">
      <result>
        <xsl:copy>{$string-value}</xsl:copy>
        <parent>{node-name(..)}</parent>
      </result>
    </xsl:if>
  </xsl:template>

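might consume less memory, as the end-phase rule resets the accumulator to the empty sequence for every element whose text does not equal the search term.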
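
For comparison, on a processor without streaming (Saxon HE, for example) a non-streaming sketch along the following lines should give similar results, at the cost of building the whole input tree in memory. It is only an approximation of the streamed version: the restriction to leaf elements (the `not(*)` predicate) is my assumption about what is being searched for, and I have not measured how it behaves on a large document.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="3.0"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  exclude-result-prefixes="#all"
  expand-text="yes">

  <xsl:param name="search-term" as="xs:string" select="'DNKK'"/>

  <xsl:output indent="yes"/>

  <!-- No streamable="yes" here: the input is read into memory as a tree -->
  <xsl:mode on-no-match="shallow-skip"/>

  <!-- Assumption: only leaf elements (no element children) are compared -->
  <xsl:template match="*[not(*)][. = $search-term]">
    <result>
      <xsl:copy>{.}</xsl:copy>
      <parent>{node-name(..)}</parent>
    </result>
  </xsl:template>

</xsl:stylesheet>
```

Running both versions against the same input with the same search-term parameter is a simple way to check that the streamed stylesheet finds what you expect before relying on it for the large document.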
