Re: O(n) notation (and character padding)

Subject: Re: O(n) notation (and character padding)
From: David Carlisle <davidc@xxxxxxxxx>
Date: Fri, 17 Nov 2000 11:19:17 GMT

> You both recommend looking at a book on algorithms: do you have any
> good ones that you recommend particularly?

Knuth is the bible, (Art of Computer Programming) but I'm not sure you
need a bible. It's been a while since I worked in a CS department.
Someone must be able to suggest a good entry level text book (or,
cheaper, web site).


> is still 0(n^2) is a comment about how the run-time of the XPath will
> increase as more nums are added: it doesn't matter how much time it
> takes or the fact it might stop half way through.

Yes O(n^2) says that after initial startup costs the algorithm
will take 10000 times longer to process input that is 100 times bigger. 
It doesn't say anything about how big that startup cost is, or how much
time each "step" takes. 

So if you have a simple algorithm for the same problem that is O(n^3)
that doesn't require setting up any initial stuff and runs 1000 times
faster (perhaps because it fits in memory rather than swapping to disk)
then it may be that on all your real problems this turns out to be
quicker than the O(n^2) algorithm, but in the end for big enough
problems the fact that increasing the input by a factor of 100 means it
takes 1000000 times as long to finish, means that at some point the
O(n^2) algorithm is going to win. If you think O(n^3)
grows fast, compare an exponential algorithm like factoring: You don't
even want to _think_ about how big 10^n gets if you change n by a factor
of 100.


>  I'm still struggling to see why:

well in the first one you first get all the elements bigger than
the current element, then you recurse. Clearly this gets there in the
end but

consider the following list with 1 being current node.

1 2 3 4 5 ....n

step 1:
compare 2 < 1, 3< 1, .... n < 1
construct list 
 2,3,4,,5....n

that's n steps done, and we have a list of length n-1
(just counting number of comparisons, ignoring any time taken to
construct node lists, bind variables, etc.

now recurse:
  step 1
   compare 
        3 > 2 , ..... n > 2
   construct list
        3, .....n

   that's another n -1 steps and we have a list of n-2 still to process.


Clearly this process is going to take n recursions, each recursion
will take 1 less step so in toto

n + (n-1) + (n-2) + ......2 + 1

which is (n^2 + n)/2

n=1  (1^2 + 1)/2 = 1 = 1
n=2  (2^2 + 2)/2 = 3 = 2 + 1
n=3  (3^2 + 3)/2 = 6 = 3 + 2 + 1
etc

now  (n^2 + n)/2 is O(n^2) because we don't care about the /2
as that means it's "twice as fast" but we never specified the time units
anyway, and we don't care about the + n because for large n, n is nothing
compared to n^2.


The other algorithm was

step 1
find maximum of rest of list (by recursion)
step 2
compare that with current element and use larger of the two.

This time, there is again a recursion of depth n, but each recursion
just uses 1 (or 2, depending how you count) steps.
So the total number of steps is

1 + 1 + ....  +1 n times. which is n which is clearly O(n).


Its probably worth noting that your first algorithm has optimally bad
behaviour in the case the list is already sorted. If the input had been
sorted in the other order, it would of course have stopped a lot quicker
as the list constructed in the first recursion would have been empty.
The "basic" notion of complexity studies worst case behaviour.
There is a lot that could be said about algorithms that work well on
"typical" input, but that;s a different subject.

Decoding unix crypt encoded passwords has a complexity that should mean
that the passwords are secure. But there is another algorithm that has
_constant_ O(1) time that works quite well in practice.
Get a dictionary of common names and commonly used passwords
and try each of those. This algorithm (for a given dictionary)
doesn't depend on the size of the input password.
This kind of trick lookup is used in practice, compare Mike's
description of xsl:number in saxon. The "obvious" thing to do is count
whatever you are supposed to be counting, but in his case the system
just has the answer to hand, so long as you do the "common" case
so it takes effectively no time at all.


David

_____________________________________________________________________
This message has been checked for all known viruses by Star Internet delivered
through the MessageLabs Virus Control Centre. For further information visit
http://www.star.net.uk/stats.asp


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list


Current Thread