Re: [xsl] Muenchian grouping help - removing 'duplicates' from a nodeset
Subject: Re: [xsl] Muenchian grouping help - removing 'duplicates' from a nodeset
From: "W. Eliot Kimber" <eliot@xxxxxxxxxx>
Date: Thu, 09 Oct 2003 09:45:32 -0500
I think the way to do this is via Muenchian grouping. I know what I need to
do: group all the <text> elements by their text() content; and select only
the first one in each group. But I've followed the guidelines on Jeni
Tennison's XSLT pages and I can't seem to get my head around how keys
The way to do this is with what I call the "union trick". It took me a
long time to figure out what was going on, until I realized that my
barrier had been not fully understanding that the "|" operator is a set
union, not a logical OR. [I was trying to understand the code Jeni
Tennison had written to do back-of-the-book index processing for DocBook.]
What you do is take the current node and the first node of the current
node's entry in the key table, and then construct a set from them using
the union operator ("|"). If the result is a set of size one, then
the two nodes must be the same node, because two different nodes would
give a set of size two. The point is that a node-set, by definition,
never contains more than one copy of a given node.
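A minimal sketch of the identity test on its own, before any keys are involved. The variable name "first" and the choice of the first <text> element in the document are only assumptions for illustration:

```xml
<xsl:template match="text">
  <!-- Hypothetical comparison target: the first <text> element in the document -->
  <xsl:variable name="first" select="(//text)[1]"/>
  <xsl:choose>
    <!-- The union of a node with itself has exactly one member -->
    <xsl:when test="count(. | $first) = 1">same node</xsl:when>
    <!-- Two distinct nodes give a union with two members -->
    <xsl:otherwise>different nodes</xsl:otherwise>
  </xsl:choose>
</xsl:template>
```

Note that this compares node identity, not content: two different <text> elements with the same string value still count as two nodes. Comparing generate-id(.) with generate-id($first) is another common way to write the same test.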
So, given this key spec:
<xsl:key name="text-by-content" match="text" use="normalize-space(.)"/>
You would do something like this:
normalize-space(.))[1]) = 1]"/>
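As a minimal sketch of how the pieces could fit together in a complete XSLT 1.0 stylesheet: the xsl:for-each wrapper, the "//text" path, and the plain-text output are my assumptions for illustration, not necessarily what you would use. The "[1]" after the key() call is what picks the first node of each group.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Index every <text> element by its whitespace-normalized content -->
  <xsl:key name="text-by-content" match="text" use="normalize-space(.)"/>

  <xsl:template match="/">
    <!-- Keep only the <text> elements that are first in their key group -->
    <xsl:for-each select="//text[count(. | key('text-by-content',
                              normalize-space(.))[1]) = 1]">
      <xsl:value-of select="normalize-space(.)"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Run against a made-up input such as <doc><text>a</text><text>b</text><text> a </text></doc>, this should print "a" and "b" on separate lines: the third element normalizes to "a" and so falls into the first element's group.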
Follow this from the inside out:
1. key('text-by-content', normalize-space(.))[1]
This looks up the key table entry for each term selected by the "//term"
pattern and then selects the first item in that list, that is, the first
instance of a given term value.
2. . | key(...)[1]
This creates a set from the current node and the first node of the key
table entry that contains the current node.
3. count(...)
This gets the length of the set.
4. count(...) = 1
This returns true if the length of the set is 1, meaning that the
current <term> node is the first node in its containing key table entry.
This node will be selected and added to the result node list.
You can test the result by doing this:
<xsl:message>[<xsl:value-of select="position()"/>] = '<xsl:value-of
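A complete version of such a check might look roughly like the sketch below. It assumes the "text-by-content" key spec given earlier is declared in the same stylesheet; the template name "report-unique-text" and the choice to print normalize-space(.) as the value are hypothetical.

```xml
<xsl:template name="report-unique-text">
  <xsl:for-each select="//text[count(. | key('text-by-content',
                            normalize-space(.))[1]) = 1]">
    <!-- position() takes no argument; it is the index of the current
         node within the nodes selected by this for-each -->
    <xsl:message>[<xsl:value-of select="position()"/>] = '<xsl:value-of
        select="normalize-space(.)"/>'</xsl:message>
  </xsl:for-each>
</xsl:template>
```

You would invoke it with <xsl:call-template name="report-unique-text"/>. Each distinct value should be reported exactly once.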
When doing this type of grouping work, I find it really useful to create
a "debug" template that just constructs all the different groups and
then reports them--makes it easier to work out the details of the key
specs and lookups. If you're doing sorting, it also makes it easy to
test your collation rules.
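One possible shape for such a debug template, again only a sketch: the template name "debug-groups" and the message wording are made up, and it assumes the "text-by-content" key spec given earlier is declared in the same stylesheet.

```xml
<xsl:template name="debug-groups">
  <!-- One iteration per distinct key value (the first node of each group) -->
  <xsl:for-each select="//text[count(. | key('text-by-content',
                            normalize-space(.))[1]) = 1]">
    <!-- Sorting the groups here is one place to check collation rules -->
    <xsl:sort select="normalize-space(.)"/>
    <!-- All members of the current group, in document order -->
    <xsl:variable name="group"
        select="key('text-by-content', normalize-space(.))"/>
    <xsl:message>
      <xsl:text>group '</xsl:text>
      <xsl:value-of select="normalize-space(.)"/>
      <xsl:text>': </xsl:text>
      <xsl:value-of select="count($group)"/>
      <xsl:text> member(s)</xsl:text>
    </xsl:message>
  </xsl:for-each>
</xsl:template>
```

Calling it with <xsl:call-template name="debug-groups"/> early in the main template reports every group and its size before any real output is produced.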
W. Eliot Kimber
ISOGEN International, LLC
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list