Re: [xsl] How many passes through the document

From: Michael Kay <mike@xxxxxxxxxxxx>
Date: Sun, 23 Sep 2012 08:59:58 +0100

Firstly, it depends on the processor, and secondly it doesn't really matter.

One processor might build indexes for both keys in a single pass of the document, another might make one pass of the document for building each key. Either strategy might be faster. Why do you care which strategy is used - surely it's only the bottom-line performance that matters to you?

There does seem to be an intrinsic inefficiency in that you are evaluating

count(key('elems',name()))

twice for each element: once when building the "counts" index, and once during the apply-templates. But it's not a very big inefficiency.

Rather more importantly, your code is incorrect: because of the way it uses name(), it won't produce sensible answers when applied to a document that binds the same prefix to different namespaces (or has two different default namespaces in different regions of the document). In 2.0, use node-name() instead.
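
To make the name() problem concrete, here is a minimal sketch. The input document, the namespace URIs and the key names are all invented for the illustration; only name() and node-name() come from the discussion above. The prefix "p" is bound to two different namespaces, so the two p:item elements are different elements that happen to share a lexical name.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only. Assumed input (namespace URIs are made up):
       <root>
         <a xmlns:p="http://example.com/ns1"><p:item/></a>
         <b xmlns:p="http://example.com/ns2"><p:item/></b>
       </root>
-->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">
    <xsl:output method="text"/>

    <!-- Keyed on the lexical name: both p:item elements share the value "p:item" -->
    <xsl:key name="by-lexical-name" match="*" use="name()"/>

    <!-- Keyed on the expanded QName: the two p:item elements get different values -->
    <xsl:key name="by-qname" match="*" use="node-name(.)"/>

    <xsl:template match="/">
       <xsl:for-each select="//*[local-name() = 'item']">
          <!-- First number: matches by lexical name; second: matches by QName -->
          <xsl:value-of select="count(key('by-lexical-name', name()))"/>
          <xsl:text> </xsl:text>
          <xsl:value-of select="count(key('by-qname', node-name(.)))"/>
          <xsl:text>&#10;</xsl:text>
       </xsl:for-each>
    </xsl:template>
</xsl:stylesheet>
```

With that input, the key on name() should report two matches for each p:item while the key on node-name(.) reports one, which is the miscount described above.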

I think you are just trying to output the number of occurrences of the element whose name occurs most frequently. If that's the case, try

<xsl:template match="/">
   <xsl:for-each-group select="//*" group-by="node-name(.)">
      <xsl:sort select="count(current-group())" order="descending"/>
      <xsl:if test="position()=1"><xsl:value-of select="count(current-group())"/></xsl:if>
   </xsl:for-each-group>
</xsl:template>
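
For anyone who wants to run this directly, the template can be dropped into a minimal stylesheet along the following lines. This is a sketch under a few assumptions: the xsl:output setting is borrowed from the original stylesheet, the sample input in the comment is invented, and node-name(.) is written with an explicit argument because XPath 2.0 defines only the one-argument form of that function.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only. Assumed input (invented for the example):
       <doc><a/><b/><a/><c><a/></c></doc>
     Here "a" occurs three times and every other name once.
-->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">
    <xsl:output method="text"/>

    <xsl:template match="/">
       <!-- One group per distinct element name, compared as a QName -->
       <xsl:for-each-group select="//*" group-by="node-name(.)">
          <!-- Largest group first -->
          <xsl:sort select="count(current-group())" order="descending"/>
          <!-- Emit only the size of the largest group -->
          <xsl:if test="position()=1">
             <xsl:value-of select="count(current-group())"/>
          </xsl:if>
       </xsl:for-each-group>
    </xsl:template>
</xsl:stylesheet>
```

For the sample input in the comment this outputs 3. If two names tie for the highest count, only one group is reported, which is harmless here because only the count is printed.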


Michael Kay
Saxonica


On 23/09/2012 01:03, Ihe Onwuka wrote:
Of course if I had written it like this with the variable local
instead of global I wouldn't be asking the question.

So I guess the question is whether the global version entails an extra
pass over the data or any other performance penalty.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xs="http://www.w3.org/2001/XMLSchema"
                 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                 version="2.0"
                 exclude-result-prefixes="xs">
    <xsl:output method="text"/>
    <xsl:key name="elems" match="*" use="name()"/>
    <xsl:key name="counts" match="*" use="count(key('elems',name()))"/>

    <xsl:template match="/">
       <xsl:variable name="all" as="xs:integer+">
          <xsl:apply-templates select="*"/>
       </xsl:variable>
       <xsl:sequence select="sum(key('counts',max($all))/count(@*))"/>
    </xsl:template>

    <xsl:template match="*">
       <xsl:sequence select="count(key('elems',name()))"/>
       <xsl:apply-templates/>
    </xsl:template>

    <xsl:template match="text()"/>

</xsl:stylesheet>
