Re: [xsl] Which is less expensive group by or select distinct-values

From: "Tony Graham tgraham@xxxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Sat, 16 Jul 2016 11:53:50 -0000

On 15/07/2016 20:20, dvint@xxxxxxxxx wrote:
> So I have a large document that I need to pull a list of unique values
> from a given element. These are taxonomy and term tag values from a 4,000
> topic collection of DITA content.
>
> Without knowing how these are implemented, is there something I should be
> able to intuit just from the spec? This is some code that I inherited and
> it wouldn't have been how I would have attacked the problem:
>
> <xsl:variable name="TermList">
>   <xsl:value-of select="distinct-values(.//term[not(@keyref)])"
>                 separator=", " />
> </xsl:variable>
> <data type="topicreport" name="WDTermList">
>   <xsl:for-each select="tokenize(normalize-space($TermList), ', ')">
>     <xsl:sort select="." />
>     <xsl:value-of select="."/>
>     <xsl:if test="position() != last()">, </xsl:if>
>   </xsl:for-each>
> </data>

Another option that you could try to see how it affects memory usage [1]:


<xsl:key name="terms" match="term[empty(@keyref)]" use="true()" />

<data type="topicreport" name="WDTermList">
   <xsl:value-of separator=", ">
       <xsl:perform-sort select="distinct-values(key('terms', true()))">
           <xsl:sort select="." />
       </xsl:perform-sort>
   </xsl:value-of>
</data>

(borrowing the xsl:perform-sort idea from Martin Honnen, and assuming that the context node is the document element.)
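
A minimal sketch of a complete stylesheet around the fragment above, for trying it in isolation with an XSLT 2.0 processor. The match="/*" template, the xsl:output settings and the sample input in the comment are assumptions added for illustration; they are not part of the inherited code. Matching the document element makes the context-node assumption explicit.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes" />

  <!-- Index every term element that has no @keyref under the single
       key value true(), so one key() call retrieves all of them. -->
  <xsl:key name="terms" match="term[empty(@keyref)]" use="true()" />

  <!-- Assumption: the report is built once, from the document element.
       Hypothetical sample input:
         <topics>
           <topic><term>zebra</term><term keyref="k1">ignored</term></topic>
           <topic><term>apple</term><term>zebra</term></topic>
         </topics>
  -->
  <xsl:template match="/*">
    <data type="topicreport" name="WDTermList">
      <!-- separator is inserted between the sorted atomic values -->
      <xsl:value-of separator=", ">
        <!-- de-duplicate the term values, then sort them as strings -->
        <xsl:perform-sort select="distinct-values(key('terms', true()))">
          <xsl:sort select="." />
        </xsl:perform-sort>
      </xsl:value-of>
    </data>
  </xsl:template>

</xsl:stylesheet>
```

With the sample input from the comment, the data element should contain "apple, zebra": the term with a keyref is skipped, the duplicate "zebra" is collapsed, and the remaining values are sorted. Note that key() searches the whole document containing the context node, whereas the original .//term only looks below the context node, so the two agree only when the context node is the document element.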

Regards,


Tony Graham.
--
Senior Architect
XML Division
Antenna House, Inc.
----
Skerries, Ireland
tgraham@xxxxxxxxxxxxx


[1] Unless and until Michael Kay advises that using the xsl:key instead of '//' makes no difference in Saxon.

