Re: [xsl] Efficient way to check sequence membership

Subject: Re: [xsl] Efficient way to check sequence membership
From: Dimitre Novatchev <dnovatchev@xxxxxxxxx>
Date: Wed, 2 Mar 2011 19:51:17 -0800
Henry, you are right.

binSearch() is slow because the indexing, as in:

          for $mid in ($pStart + $pEnd) idiv 2,
               $sVal in $pSeq[$mid]

is not O(1) as it is in a true array. Depending on the particular XSLT
processor, this may be even O(N), thus giving a dismal O(N^2) for the
complete search.

I tried to compare a key-based search with your linear search and when
searching in 1000 words the key-based search seems to be about twice
faster on the average (when the strings are in the middle of the
sequence).

Therefore, when searching only in a smal sequence of strings, the
linear search isn't bad at all.


Cheers,
Dimitre



On Wed, Mar 2, 2011 at 3:33 PM, Henry S. Thompson <ht@xxxxxxxxxxxx> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Dimitre Novatchev writes:
>
>> One way is to have the strings sorted. Then use binary search -- this
>> is O(log(N)).
>
> Maybe that will have an advantage for much bigger sets, but for my
> test case it was considerably slower -- 18ms net vs. 9ms net for the
> (some $s . . . satisfies) and $w=$stops versions, and 2ms net for the
> best (key-based) one.
>
> I'd be very interested if you could compare binSearch with the
> key-based version I sent on your large dictionary example. . .
>
> ht
> - --
> B  B  B  Henry S. Thompson, School of Informatics, University of Edinburgh
> B  B  B 10 Crichton Street, Edinburgh EH8 9AB, SCOTLAND -- (44) 131
650-4440
> B  B  B  B  B  B  B  B Fax: (44) 131 651-1426, e-mail: ht@xxxxxxxxxxxx
> B  B  B  B  B  B  B  B  B  B  B  URL: http://www.ltg.ed.ac.uk/~ht/
> B [mail from me _always_ has a .sig like this -- mail without it is forged
spam]
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.2.6 (GNU/Linux)
>
> iD8DBQFNbtPMkjnJixAXWBoRAlT4AJ9/c0Thgmzz38+728ZYyGuexwmclACdFiNZ
> c5SlTZQDHau9553VUlABs5k=
> =aNAZ
> -----END PGP SIGNATURE-----
>
>



--
Cheers,
Dimitre Novatchev
---------------------------------------
Truly great madness cannot be achieved without significant intelligence.
---------------------------------------
To invent, you need a good imagination and a pile of junk
-------------------------------------
Never fight an inanimate object
-------------------------------------
You've achieved success in your field when you don't know whether what
you're doing is work or play
-------------------------------------
Facts do not cease to exist because they are ignored.

Current Thread