Re: [xsl] Get the duplicates in a list

Subject: Re: [xsl] Get the duplicates in a list
From: "Michael Kay mike@xxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Sat, 4 Jan 2025 12:01:30 -0000
First, the question asked for a solution, not for a solution that is efficient
or scaleable.

I fear that the discussion may be at cross-purposes because the word
"complexity" has different meanings. In the context of big-O notation,
"complexity" has a meaning very different from its everyday meaning. It
basically means "scaleability" - how does the time taken by the task vary with
the size of the input? It's a rather unfortunate choice of term. In
particular, it has nothing to do with either the human understandability of
the expression, or the sophistication of the internal implementation.

Michael Kay
Saxonica

> On 3 Jan 2025, at 22:13, Bauman, Syd s.bauman@xxxxxxxxxxxxxxxx
<xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
> All well and good, and thank you for that, Liam (Like some others here, I
understand what it means to be O(1) vs O(n) vs O(nB2), but am not much at
figuring it out.)
>
> That said, isnbt this post about efficiency as opposed to complexity? Not
that I think we have a very solid definition of complexity, but run time is
not it. p
>
> >
> > I would hope that the map:merge is ensured to be more efficient than
> > that count($fieldnames[. eq $f]) gt 1
>
> First, Dmitrebs solution is great.
>
> On complexity -
>
> if a map is implemented as a hash table, insertion of a single item is
> usually O(1) (which is why they are popular) although this disregards
> collisions, growing the table, garbage collection, etc.
>
> So inserting n items is O(n).
>
> count($fieldnames) has to look at every item in $fieldnames so is O(n),
> but an implementation of sequences as a finger list might store the
> length at the start, in which case thatbs O(1) too, BUT
> count($fieldnames[some expr]) likely has to evalauate the expression n
> times, once per item, and then do a count, so is O(n).
>
> I played with a solution that put the items inside elements and used
> the preceding:: axis, but that meant for each item you looked at all
> the preceding items, making it n x n, i.e. O(nB2)
>
> Note: ordered maps are likely slower than O(n) for insertion and
> deletion, but with the data structures that have been mentioned, likely
> still O(n), but webll need a way to say where the insertion happens.
>
> XSL-List info and archive <http://www.mulberrytech.com/xsl/xsl-list>
> EasyUnsubscribe <http://lists.mulberrytech.com/unsub/xsl-list/293509> (by
email <>)

Current Thread