Re: [xsl] An efficient XSLT program that searches a large XML document for all occurrences of a string?

Subject: Re: [xsl] An efficient XSLT program that searches a large XML document for all occurrences of a string?
From: "Michael Kay michaelkay90@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Thu, 2 May 2024 14:27:25 -0000
> On 2 May 2024, at 14:08, Roger L Costello costello@xxxxxxxxx <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
> Hi Folks,
>
> I have an XSLT program that locates all leaf elements which have the string
> value 'DNKK'. My program outputs the element and the name of its parent:
>
>    <xsl:template match="/">
>        <results>
>            <xsl:for-each select="//*[not(*)][. eq 'DNKK']">
>                <result>
>                    <xsl:sequence select="."/>
>                    <parent><xsl:value-of select="name(..)"/></parent>
>                </result>
>            </xsl:for-each>
>        </results>
>    </xsl:template>
>
> The input XML document is large, nearly 5GB.
>

Well, it's almost streamable as written.

Unless there are comments and processing instructions to worry about, I think
the following code is equivalent, and streamable:

>    <xsl:template match="/">
>        <results>
>            <xsl:for-each select="//text()[. eq 'DNKK']">
>                <result>
>                    <xsl:element name="{name(..)}">DNKK</xsl:element>
>                    <parent><xsl:value-of select="name(../..)"/></parent>
>                </result>
>            </xsl:for-each>
>        </results>
>    </xsl:template>
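
For anyone who wants to try this, here is a minimal sketch of a complete stylesheet wrapped around that template. It assumes an XSLT 3.0 processor with streaming support (Saxon-EE, for example) and declares the unnamed mode streamable with xsl:mode; the xsl:output setting is just my choice for readability. Note that the name attribute of xsl:element is an attribute value template, so the expression has to go inside curly braces. I haven't run this against a 5GB file, so treat it as a starting point rather than a tested solution.

```xml
<xsl:stylesheet version="3.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Ask the processor to evaluate the unnamed mode in streaming fashion. -->
  <xsl:mode streamable="yes"/>

  <!-- Assumption: indented output, purely for readability. -->
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <results>
      <!-- Select text nodes rather than leaf elements, so the test needs
           no look-ahead into an element's children. -->
      <xsl:for-each select="//text()[. eq 'DNKK']">
        <result>
          <!-- Rebuild the leaf element from the name of the text node's parent. -->
          <xsl:element name="{name(..)}">DNKK</xsl:element>
          <!-- The leaf's parent is the grandparent of the text node. -->
          <parent><xsl:value-of select="name(../..)"/></parent>
        </result>
      </xsl:for-each>
    </results>
  </xsl:template>

</xsl:stylesheet>
```

If the processor reports that the template is not streamable, its message should say which expression is the problem.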

Might have to do some fine tuning if there are namespaces involved.
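
As a rough sketch of what that fine tuning might look like (an assumption on my part about the input, not something tested against the real data): if the leaf elements are in a namespace, xsl:element can be given the namespace URI of the original element explicitly, so that a prefix in the name does not have to be declared in the stylesheet.

```xml
<xsl:for-each select="//text()[. eq 'DNKK']">
  <result>
    <!-- Recreate the leaf element in the same namespace as the original;
         the prefix in the lexical name is then only a hint to the serializer. -->
    <xsl:element name="{name(..)}" namespace="{namespace-uri(..)}">DNKK</xsl:element>
    <!-- name() returns the lexical QName, including any prefix used in the source. -->
    <parent><xsl:value-of select="name(../..)"/></parent>
  </result>
</xsl:for-each>
```

Unlike xsl:sequence select=".", this does not copy any extra namespace declarations that were in scope on the original element, which only matters if something downstream relies on them.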

Michael Kay
Saxonica
