Re: [xsl] How to efficiently obtain the first 10 records of a file with over 2 million records?

Subject: Re: [xsl] How to efficiently obtain the first 10 records of a file with over 2 million records?
From: "David Carlisle d.p.carlisle@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Wed, 19 Jul 2023 16:09:03 -0000
On Wed, 19 Jul 2023 at 16:15, Roger L Costello costello@xxxxxxxxx <
xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:

> Hi Folks,
>
> I have an XML file containing over 2 million <record> elements. I want to
> obtain the first 10 <record> elements.
>
> Here's how I did it:
>
> <xsl:for-each select="/Document/record[position() le 10]">
>     <xsl:sequence select="."/>
> </xsl:for-each>
>
> I ran it and it took a long time to complete. I am guessing that the XSLT
> processor is iterating over all 2 million <record> elements. Yes?  How to
> write the XSLT code so that the XSLT processor stops iterating upon
> processing the first 10 <record> elements?
>
> /Roger
>
>
I would guess your issue is not the loop (the system can stop after 10
records easily enough) but parsing the initial file, since a non-streaming
processor builds the whole input tree before the stylesheet runs.
You may have access to a streaming processor to avoid parsing the whole
file, but an alternative is to invoke your inner DPH (Desperate Perl Hacker):

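use strict;
use warnings;

my $n = 0;                              # <record> start tags seen so far
while (<>) {
    $n++ if /<record>/;                 # count each line that opens a record
    print if $n > 0;                    # skip anything before the first record
    last if $n == 10 and m{</record>};  # stop once the 10th record is closed
}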

This will output the first 10 records, as long as each </record> is on a
separate line.

$ perl rrec.pl rrec.xml
 <record>
  xx
 </record>
 <record>
  xx
 </record>
 <record>
  xx
 </record>
 <record>
  xx
 </record>
 <record>
  xx
 </record>
 <record>
  xx
 </record>
 <record>
  xx
 </record>
 <record>
  xx
 </record>
 <record>
  xx
 </record>
 <record>
  xx
 </record>
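
If you would rather use a real XML parser than line matching, the same
"stop after 10" idea can be sketched with a pull parser. This is only an
illustration in Python (not XSLT and not the Perl above); the function name
first_records and its parameters are made up for the example, and it assumes
the records are elements named record, as in the question.

```python
import io
import xml.etree.ElementTree as ET


def first_records(source, tag="record", limit=10):
    """Return the first `limit` <tag> elements as strings.

    `source` is a filename or a binary file object. Parsing is incremental,
    so input beyond the parser's current read buffer is never touched.
    """
    records = []
    if limit <= 0:
        return records
    # An "end" event fires once an element has been read completely.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            elem.tail = None  # drop whitespace that follows the end tag
            records.append(ET.tostring(elem, encoding="unicode"))
            elem.clear()      # release the element's children
            if len(records) == limit:
                break         # stop here instead of reading the rest
    return records
```

Calling first_records("rrec.xml") would return the ten records as a list of
strings. Unlike the Perl version it does not care where the line breaks fall,
but it does need the start of the file to be well-formed.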



https://www.xml.com/axml/notes/OtherGoals.html
