Subject: RE: [xsl] Finding unique nodes in a non-sibling nodeset
From: "Michael Kay" <michael.h.kay@xxxxxxxxxxxx>
Date: Sun, 30 Jun 2002 20:14:45 +0100

> In a code generation transform that I am working on, I
> frequently encounter situations where I need to eliminate
> duplicate expressions or event calls. The nodes with the
> commonality to be detected are often scattered around
> different parts of a large (preprocessed) reference document
> that is loaded with a document call.
>
> Previously, I had eliminated duplicates with something of the
> form $list[not(@key1=preceding-sibling::*/@key1)]
> or
> $list[not(@key1=preceding::*/@key1)]
> ... If I wanted to look back through the whole document.
>
> In this situation however, the nodes to be duplicate-trimmed are
>
> [A] Selected out of the reference document in very specific contextual
>     ways (e.g. deep inside xsl:template / xsl:for-each usages)
> [B] Not all sibling nodes
> [C] The preceding axis can't be used since it looks at the whole
>     preceding area of the document, not just my carefully
>     selected nodes.
> [D] The definition of duplication requires use of multiple node
>     attributes. i.e. needs a composite key.
>
> Even if [D] were not true, the "preceding-sibling" axis
> approach would not work because of [B] and the "preceding"
> axis approach would not work because of [C].

Muenchian grouping should be able to cope with this, provided (a) all
the nodes are in the same document, and (b) you can code the rules for
"carefully selecting" the nodes in a match pattern. You can handle
composite keys using concatenation.

Where these conditions aren't true, the usual approach is to build a
temporary tree containing copies of the selected nodes. You can then use
Muenchian grouping on this tree, accessing it using the xx:node-set()
extension function.

> I eventually hit on a way to solve this (since I use Saxon)
> using saxon:tokenize. But I always wondered if there was a
> non-extension way to do it.
>
> What I did was build an aggregate string with delimiters from
> the nodes in the set in question (in a variable called
> "$list"), like so ...
>
> <xsl:variable name="aggregate">
>   <xsl:for-each select="$list">
>     <xsl:value-of select="concat(@key1,'/',@key2)" />
>     <xsl:if test="not(position()=last())"><xsl:text>#</xsl:text></xsl:if>
>   </xsl:for-each>
> </xsl:variable>
>
> Then use tokenize to get a node set ...
>
> <xsl:variable name="list4" select="saxon:tokenize($aggregate,'#')"/>
>
> And eliminate the duplicates the standard (?) way with
>
> <xsl:variable name="list4NoDups"
>   select="$list4[not(.=preceding-sibling::*)]"/>

Innovative, but as you say, if you're going to use extensions,
saxon:distinct() does the job more directly.

> There are features in Saxon 7.1 that we are very interested
> in, so I needed to try to find a different technique.

XPath 2.0 offers a distinct-values() function, but it's not yet
available in Saxon. What you can use, however, is <xsl:for-each-group>.
I think this should solve your problem fairly directly.

<xsl:for-each-group select="$list" group-by="concat(@key1, '/', @key2)">
  ...

This will iterate once for each distinct value of the group-by key, with
the context node being the first node in $list that has that key value.

Michael Kay
Software AG
home: Michael.H.Kay@xxxxxxxxxxxx
work: Michael.Kay@xxxxxxxxxxxxxx

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
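
For reference, the Muenchian approach with a concatenated composite key that the reply describes might look roughly like the following in XSLT 1.0. This is only a sketch under assumptions: the element name `item` and the key name `by-composite` are hypothetical stand-ins, and the `match` pattern would need to encode whatever rules "carefully select" the real nodes.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical key: 'item' stands in for the real match pattern.
       The key value is the composite of both attributes. -->
  <xsl:key name="by-composite" match="item"
           use="concat(@key1, '/', @key2)"/>

  <xsl:template match="/">
    <!-- Keep a node only if it is the first node, in document order,
         that the key returns for its own composite value. -->
    <xsl:for-each select="//item[generate-id() =
        generate-id(key('by-composite', concat(@key1, '/', @key2))[1])]">
      <xsl:value-of select="concat(@key1, '/', @key2)"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Note that key() only searches the document containing the context node, which is why condition (a) in the reply matters: when the nodes come from a document() call, the expression has to be evaluated with a node of that document as the context.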