Re: [xsl] RE: Muenchian technique, was (Keys on multiple element types)

From: Jeni Tennison <jeni@xxxxxxxxxxxxxxxx>
Date: Tue, 5 Feb 2002 13:26:40 +0000

Hi Dave,

>> The keys don't; using the Muenchian method (which uses keys for
>> efficiency) does. The duplicates are removed by the statement:
>> 
>>   *[generate-id(.) = generate-id(key('rows', name)[1])]
>> 
>> where you select all the elements that are the same element as the
>> element you get when you use the 'rows' key with that element's name
>> (i.e. selects the first element with a particular name in the
>> document).
>
> Mike, your book states that the [1] is redundant for the Muenchian
> technique, yet it keeps getting repeated.
>
> I haven't tested this with a proper test case; I just wanted Jeni
> to confirm that it's habit, and not disagreement, that keeps her
> using it.

I'd never dare to disagree with Mike, especially when he's right ;)

The reason I usually include the [1] when I'm explaining this method
of accessing unique values is that it flows naturally from the test
that you're doing. What you're doing is comparing the context node
with the first node returned by the key. If you translate that into
a set logic expression, it would be:

  count(. | key('rows', name)[1]) = 1

In comparison, if you took:

  generate-id(.) = generate-id(key('rows', name))

and naively translated it to:

  count(. | key('rows', name)) = 1

you'd get a very different result (it would return true if the context
node was the *only* node in the document with that name).
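
To make the difference concrete, here is a minimal sketch. The input
document, the <data>, <row> and <name> element names, and the
definition of the 'rows' key are all assumptions made up for
illustration; they are not taken from the original stylesheet.

```xml
<!-- Assumed (hypothetical) input document:
     <data>
       <row><name>a</name></row>
       <row><name>b</name></row>
       <row><name>a</name></row>
     </data>
-->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Assumed key: index each row by the value of its name child -->
  <xsl:key name="rows" match="row" use="name"/>

  <xsl:template match="/data">
    <!-- With [1]: true for the first row that has each name -->
    <xsl:text>with [1]: </xsl:text>
    <xsl:for-each select="row[count(. | key('rows', name)[1]) = 1]">
      <xsl:value-of select="name"/>
      <xsl:text> </xsl:text>
    </xsl:for-each>
    <xsl:text>&#10;</xsl:text>

    <!-- Without [1]: true only for a row whose name occurs once -->
    <xsl:text>without [1]: </xsl:text>
    <xsl:for-each select="row[count(. | key('rows', name)) = 1]">
      <xsl:value-of select="name"/>
      <xsl:text> </xsl:text>
    </xsl:for-each>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

With that assumed input, the first loop outputs a and b (one row per
distinct name), while the second outputs only b, because the two rows
named a each form a union of two nodes with their key result.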
  
In XPath 2.0 terms the comparison is:

  . == key('rows', name)[1]

And I *think* that if you did:

  . == key('rows', name)

then you would get an error if there was more than one node returned
by the key (but it might be that it's a recoverable error that's
covered by the fallback conversions - I don't find the XPath 2.0 WD
particularly clear on this point).

So in general if you're trying to assess whether two nodes are the
same, it's important to pull out the two nodes individually. The only
reason that you can get away with *not* using the [1] if you're using
the generate-id() method of comparing nodes is because generate-id()
automatically looks at only the first node in the node set.
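
As a small sketch of that behaviour, the stylesheet below runs the
generate-id() test both with and without the [1]. Again, the input
document, the element names and the 'rows' key definition are
hypothetical, chosen only to illustrate the point.

```xml
<!-- Assumed (hypothetical) input document:
     <data>
       <row><name>a</name></row>
       <row><name>b</name></row>
       <row><name>a</name></row>
     </data>
-->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Assumed key: index each row by the value of its name child -->
  <xsl:key name="rows" match="row" use="name"/>

  <xsl:template match="/data">
    <!-- Explicit [1]: compare with the first node from the key -->
    <xsl:text>explicit [1]: </xsl:text>
    <xsl:for-each
        select="row[generate-id(.) = generate-id(key('rows', name)[1])]">
      <xsl:value-of select="name"/>
      <xsl:text> </xsl:text>
    </xsl:for-each>
    <xsl:text>&#10;</xsl:text>

    <!-- No [1]: generate-id() itself uses only the node that is
         first in document order, so the test is the same -->
    <xsl:text>implicit: </xsl:text>
    <xsl:for-each
        select="row[generate-id(.) = generate-id(key('rows', name))]">
      <xsl:value-of select="name"/>
      <xsl:text> </xsl:text>
    </xsl:for-each>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

Both loops select the same rows here (the first a row and the b row),
which is why dropping the [1] is safe with generate-id() but not with
the count() form.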

Thus I'll usually miss [1] out in practice, but when I'm
helping/training/teaching/writing I tend to leave [1] in to make what's
happening more explicit.

Cheers,

Jeni

---
Jeni Tennison
http://www.jenitennison.com/


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

