[xsl] XSLT2, collection(), and xsl:key

Subject: [xsl] XSLT2, collection(), and xsl:key
From: "James Cummings" <cummings.james@xxxxxxxxx>
Date: Fri, 1 Feb 2008 17:23:23 +0000
Hiya,

I'm using the collection() function and Saxon to produce some
statistics about how many elements of each kind, with each type
value, occur in a particular set of documents.

Let's say that document one has something like:

<p xml:id="doc1" type="hypothetical">
There is some text with <seg type="foo">some foo</seg> and
occasionally <seg type="blort">blort</seg> and <other
type="wibble">wibble</other></p>


and document two (and up to some really large number) is like:

<p xml:id="doc2">
There is another doc with <seg type="foo">some foo</seg> and
occasionally <seg type="notBlort">notBlort</seg> and <other
type="fluffy">fluffy other</other> and <some
  name="thing">someThing</some></p>

What I want to produce are tables of counts of specific elements, by
document and type. So something like the following (though using
table/row/cell xml markup):


table: other
document | fluffy | wibble | stuff
doc1 | 0 | 1 | 0
doc2 | 1 | 0 | 0
doc3 | 20 | 12 | 54

table: seg
document | blort | foo | notBlort
doc1 | 1 | 1 | 0
doc2 | 0 | 1 | 1
doc3 | 23 | 44 | 58

table: some
document | thing | else | now
doc1 | 0 | 0 | 0
doc2 | 1 | 0 | 0
doc3 | 12 | 5 | 24

I can build this manually (and for one element I have done so) by doing:

<xsl:variable name="docs" select="collection('../../working/xml/docs.xml')"/>
<xsl:template name="main">
<table><head>seg by type</head>
<row rend="label">
<cell>document</cell>
<cell>blort</cell>
<cell>foo</cell>
<cell>notBlort</cell>
</row>
<xsl:for-each select="$docs//p"> <!-- let's pretend p is the root element -->
<row>
<xsl:variable name="doc" select="@xml:id"/>
<cell><xsl:value-of select="$doc"/></cell>
<cell><xsl:value-of select="count(.//seg[@type='blort'])"/></cell>
<cell><xsl:value-of select="count(.//seg[@type='foo'])"/></cell>
<cell><xsl:value-of select="count(.//seg[@type='notBlort'])"/></cell>
</row>
</xsl:for-each>
</table>
</xsl:template>

But that isn't really the point now, is it?  I tried to use <xsl:key>
but ran into the problem that it does not accept the collection()
function as part of the match pattern.
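
For reference, a minimal sketch of how a key can be used with documents from collection() without naming the function in the pattern. A key declaration applies to whichever document is searched, and in XSLT 2.0 the third argument of key() restricts the lookup to a given subtree. The key name seg-by-type is hypothetical; the $docs variable and the seg/@type markup are taken from the examples above.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:variable name="docs" select="collection('../../working/xml/docs.xml')"/>

  <!-- The pattern names only the element; collection() is not needed here. -->
  <!-- "seg-by-type" is a hypothetical key name. -->
  <xsl:key name="seg-by-type" match="seg" use="@type"/>

  <xsl:template name="main">
    <table>
      <xsl:for-each select="$docs//p">
        <row>
          <cell><xsl:value-of select="@xml:id"/></cell>
          <!-- Third argument: search only the subtree rooted at this p. -->
          <cell><xsl:value-of select="count(key('seg-by-type', 'foo', .))"/></cell>
        </row>
      </xsl:for-each>
    </table>
  </xsl:template>

</xsl:stylesheet>
```

This still hard-codes the value 'foo', so it only addresses the key problem, not the unknown-values problem.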

What I want to do is be able to say for-each doc, build me a table of
all the (let's pretend unknown) values of this attribute on this
element.  So something like:

<xsl:for-each select="$docs//p">
<xsl:value-of select="my:function(other/@type, seg/@type, some/@name,
new/@type)"/>
</xsl:for-each>

and without knowing the values of @type in advance it makes a table
like above of them (using distinct-values()?) and counting their
occurrences.
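
A rough sketch of what such a function might look like, under several assumptions: the function name my:table, its namespace URI, and its signature (document roots, element name, attribute name) are all hypothetical, and the wrapping div is only there to give the output a single root. It collects the column headings with distinct-values() across the whole collection, then counts matches per document.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="http://example.org/my"
    exclude-result-prefixes="xs my">

  <xsl:variable name="docs" select="collection('../../working/xml/docs.xml')"/>

  <!-- Hypothetical helper: one table for one element name and one attribute name. -->
  <xsl:function name="my:table" as="element(table)">
    <xsl:param name="roots" as="element()*"/>
    <xsl:param name="elem" as="xs:string"/>
    <xsl:param name="attr" as="xs:string"/>
    <!-- Column headings: every value seen anywhere in the collection, sorted. -->
    <xsl:variable name="values" as="xs:string*">
      <xsl:perform-sort
          select="distinct-values($roots//*[local-name() = $elem]
                                  /@*[local-name() = $attr]/string(.))">
        <xsl:sort select="."/>
      </xsl:perform-sort>
    </xsl:variable>
    <table>
      <head><xsl:value-of select="$elem"/> by <xsl:value-of select="$attr"/></head>
      <row rend="label">
        <cell>document</cell>
        <xsl:for-each select="$values">
          <cell><xsl:value-of select="."/></cell>
        </xsl:for-each>
      </row>
      <xsl:for-each select="$roots">
        <xsl:variable name="root" select="."/>
        <row>
          <cell><xsl:value-of select="@xml:id"/></cell>
          <!-- One cell per heading; current() is the heading value being counted. -->
          <xsl:for-each select="$values">
            <cell><xsl:value-of
                select="count($root//*[local-name() = $elem]
                              /@*[local-name() = $attr][. = current()])"/></cell>
          </xsl:for-each>
        </row>
      </xsl:for-each>
    </table>
  </xsl:function>

  <xsl:template name="main">
    <div>
      <xsl:sequence select="my:table($docs//p, 'seg', 'type')"/>
      <xsl:sequence select="my:table($docs//p, 'other', 'type')"/>
      <xsl:sequence select="my:table($docs//p, 'some', 'name')"/>
    </div>
  </xsl:template>

</xsl:stylesheet>
```

The sketch matches on local-name() so that it does not depend on the documents' namespace, and it rescans each document once per value, which is simple but may be slow for a really large collection.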

This is a case where I know it must be possible, and I could just go
and do it manually, (in reality there are about 10 elements with a
number of attributes, with around 20 values each), but it just seems
*wrong* to do it that way. ;-)

Suggestions?

Thanks,

-James
