Re: [xsl] How to use xsl:key to make my XSLT program super-efficient?

Subject: Re: [xsl] How to use xsl:key to make my XSLT program super-efficient?
From: "Martin Honnen martin.honnen@xxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Wed, 12 Mar 2025 12:01:02 -0000
> On 12.03.2025 at 12:51, Roger L Costello costello@xxxxxxxxx
> <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
> Hi Folks,
>
> I have a huge (1.2GB) XML document containing air navigation data. I need to
> convert the XML document to another XML format, i.e., it is an XML-to-XML
> translation task.
>
> Deep within the source XML document is an <ARPT> element that contains <row>
> elements, one for each airport in the world:
>
> <Air_Navigation>
>     ...
>     <ARPT>
>         <row>
>             airport 1 data (bunch of child elements)
>         </row>
>         <row>
>             airport 2 data (bunch of child elements)
>         </row>
>         ...
>     </ARPT>
>     ...
> </Air_Navigation>
>
> My XSLT needs to be super-efficient because the XML document is huge, and I
> need to perform a lot of processing.
>
> When I think of "efficiency" what comes to mind is the xsl:key/key() pair.
>
> I need to iterate over each airport record, i.e., iterate over each <row>
> element within the <ARPT> element. I figured that the following xsl:key/key()
> pair will enable me to iterate efficiently:
>
> <xsl:key name="airports" match="ARPT/row" use="''"/>
>
> <xsl:for-each select="key('airports','')"> ... </xsl:for-each>
>
> Is that correct? Is that efficient? Will the loop execute super-fast?
>
> Truthfully, I am a bit confused about the "use" attribute on the <xsl:key>
> element. As I understand it, the <xsl:key> will gather up all the <row>
> elements that are within <ARPT> and then the "use" attribute may be used to
> indicate how to select a subset of the <row> elements; is that correct? But I
> have no desire to select a subset, I want to iterate over all the elements
> gathered up by <xsl:key>. Notice that I specified the empty string for the
> value of the "use" attribute, and I specified the empty string as the second
> argument of key(). Is that the right thing to do, given that I have no desire
> to select a subset of <row> elements?
>
>

To me that doesn't make much sense. If you just use a global variable
selecting the ARPT/row elements, that should suffice, I would think, but I
haven't measured. As for performance, for xsl:for-each with Saxon EE you can
use saxon:threads for multithreading:
https://www.saxonica.com/html/documentation12/extensions/attributes/threads.html
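
A minimal sketch of the global-variable approach, not measured against the
1.2GB document. The path //ARPT/row is an assumption (the post only says that
<ARPT> sits "deep within" the source), and the output element names
<airports> and <airport> are hypothetical placeholders for the real target
format:

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Global variable holding every airport row.
       ASSUMPTION: //ARPT/row; replace with the exact path to ARPT
       if it is known, to avoid searching the whole tree. -->
  <xsl:variable name="airports" select="//ARPT/row"/>

  <xsl:template match="/">
    <!-- HYPOTHETICAL output wrapper element -->
    <airports>
      <!-- Iterate over all rows; no key is needed because
           no subset is being looked up by value. -->
      <xsl:for-each select="$airports">
        <!-- HYPOTHETICAL per-airport output: copies the row's children -->
        <airport>
          <xsl:copy-of select="*"/>
        </airport>
      </xsl:for-each>
    </airports>
  </xsl:template>

</xsl:stylesheet>
```

A key pays off when rows are looked up repeatedly by some value, for example
an airport identifier used in the "use" attribute. With use="''" every row
gets the same key value, so key('airports','') simply returns all rows, which
is what the variable already does.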
