Re: [xsl] To determine the distinct elements in a sequence of 46,656 elements takes 5 hours of XSLT processing

Subject: Re: [xsl] To determine the distinct elements in a sequence of 46,656 elements takes 5 hours of XSLT processing
From: Graydon <graydon@xxxxxxxxx>
Date: Sun, 19 Aug 2012 19:05:55 -0400
On Sun, Aug 19, 2012 at 10:43:35PM +0000, Costello, Roger L. scripsit:
> Hi Folks,
> 
> I have a sequence of 46,656 elements that I call "maps."
> 
> Here is one map:
> 
>     <map id="Planes-Enroute-to-Airports">
>         <singletonMap from="F16" to="DFW"/>
>         <singletonMap from="B707" to="ORD"/>
>         <singletonMap from="F35" to="MIA"/>
>         <singletonMap from="S340" to="LAX"/>
>         <singletonMap from="A320" to="SFO"/>
>         <singletonMap from="MD90" to="DEN"/>
>     </map>
> 
> I wrote a function to return all of the distinct maps.
> 
> Unfortunately it takes about 5 hours of XSLT processing.
> 
> Perhaps my XSLT program is inefficient.

This is normally the sort of thing where keys are considered a good approach, but I have to say that if the requirement is to produce an output file with only the unique map elements in it (and I have at all understood what you're trying to do), I'd go with:

<xsl:template match="node()|@*">
    <xsl:copy>
        <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
</xsl:template>

<xsl:template match="map[not(deep-equal(.,/maps/map[. &lt;&lt; current()]))]">
    <!-- if no map in front of me in document order is deep-equal to me, 
         keep a copy of me in the output -->
    <xsl:copy>
        <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
</xsl:template>

<xsl:template match="map[deep-equal(.,/maps/map[. &lt;&lt; current()])]">
    <!-- if some map element before me in document order is deep equal to me,
         forget I ever existed. -->
</xsl:template>

I'd expect this to be more efficient than doing any kind of binary partitioning, but as always in questions of efficiency, testing gives the only real answer.

-- Graydon

Current Thread