Re: [xsl] [Summary] Three ways to express in XPath that there are no duplicates in a list of items

Subject: Re: [xsl] [Summary] Three ways to express in XPath that there are no duplicates in a list of items
From: Liam R E Quin <liam@xxxxxx>
Date: Fri, 02 Nov 2012 10:33:39 +0100
On Fri, 2012-11-02 at 09:15 +0000, Costello, Roger L. wrote:
[...]
> Here's how to implement it in XPath 1.0 and in XPath 2.0.
> 
> XPath 1.0:
> 
> 	not(Websites/*[. = preceding-sibling::*])
> 
> XPath 2.0:
> 
> 	empty(Websites/*[index-of(../*,.)[2]])
> 	count(Websites/*) = count(distinct-values(Websites/*))
> 
> The preferred XPath is the last one because it has the best
> performance. The first two take on the order of n-squared time (where
> n is the number of websites in the list) whereas the last XPath
> expression takes on the order of n log n time.

Some brief comments on this...
(1) the XPath 1 solution also works in later versions of XPath;
(2) this only works to test simple text content
(3) an XPath 2 (or later) processor can rewrite the first or second
expression if it likes (and is smart enough) into the third
(4) the last expression can be implemented in linear time, not n log n,
because it's not not necessary to sort values to see if they are
distinct (e.g. the implementation can use a hash table), although this
does use more memory. But for simple content the hash table approach
doesn't even need to use more memory, can just point into the document
tree.

So statements about performance need to be with respect to a particular
version of a specific product, in a specific environment.

And,
(5) I'm sure these aren't the only ways to see if all items are
distinct :-)

You can start getting pretty fancy,
some $w in Websites/Website satisfies count(Websites/Website[. eq $w])
gt 1,
or flwor expressions in XQuery,
or use a key or (XPath in XSLT 3) a map, or.... :)

Liam


-- 
Liam Quin - XML Activity Lead, W3C, http://www.w3.org/People/Quin/
Pictures from old books: http://fromoldbooks.org/
Ankh: irc.sorcery.net irc.gnome.org freenode/#xml

Current Thread