Re: [xsl] How to efficiently obtain the first 10 records of a file with over 2 million records?

Subject: Re: [xsl] How to efficiently obtain the first 10 records of a file with over 2 million records?
From: "Martin Honnen martin.honnen@xxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Wed, 19 Jul 2023 16:25:57 -0000
On 19.07.2023 18:09, David Carlisle d.p.carlisle@xxxxxxxxx wrote:
>
>
> On Wed, 19 Jul 2023 at 16:15, Roger L Costello costello@xxxxxxxxx
> <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
>     Hi Folks,
>
>     I have an XML file containing over 2 million <record> elements. I
>     want to obtain the first 10 <record> elements.
>
>     Here's how I did it:
>
>     <xsl:for-each select="/Document/record[position() le 10]">
>         <xsl:sequence select="."/>
>     </xsl:for-each>
>
>     I ran it and it took a long time to complete. I am guessing that
>     the XSLT processor is iterating over all 2 million <record>
>     elements. Yes? How to write the XSLT code so that the XSLT
>     processor stops iterating upon processing the first 10 <record>
>     elements?
>
>     /Roger
>
>
>
> You may have access to a streaming processor to avoid parsing the
> whole file, but an alternative is to invoke your inner DPH


If you don't have access to Saxon EE, then instead of invoking your DPH
you can use STX with Joost (https://sourceforge.net/projects/joost/):

java -jar joost.jar -o results.xml records.xml transform.stx


<stx:transform version="1.0"
    xmlns:stx="http://stx.sourceforge.net/2002/ns">

<stx:template match="Document">
    <stx:process-children/>
</stx:template>

<stx:template match="Document/record[1]">
    <stx:process-self group="copy"/>
    <stx:process-siblings group="copy" until="record[11]"/>
</stx:template>

<stx:group name="copy">
    <stx:template match="node()">
        <stx:copy>
            <stx:process-children/>
        </stx:copy>
    </stx:template>
</stx:group>

</stx:transform>
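
For comparison, if a streaming XSLT 3.0 processor such as Saxon EE is
available, a similar result could be sketched with xsl:iterate and
xsl:break. This is an untested sketch that assumes the Document/record
structure from the question. Whether the processor actually stops
reading the input once xsl:break is reached depends on the
implementation.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<!-- Request streamed processing of the source document -->
<xsl:mode streamable="yes"/>

<xsl:template match="/">
    <!-- Visit the records one at a time, in document order -->
    <xsl:iterate select="Document/record">
        <xsl:choose>
            <!-- Leave the loop once the 11th record is reached -->
            <xsl:when test="position() gt 10">
                <xsl:break/>
            </xsl:when>
            <!-- Copy each of the first 10 records to the result -->
            <xsl:otherwise>
                <xsl:copy-of select="."/>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:iterate>
</xsl:template>

</xsl:stylesheet>
```

Like the STX version, this writes the ten record elements without a
wrapper element.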
