Subject: Re: Aggregate From: David Carlisle <davidc@xxxxxxxxx> Date: Thu, 16 Nov 2000 12:40:29 GMT |
> 1. how can you assess an algorithm to determine its 0(n*)-ness? In general it's hard. But for the kind of simple sorting algorithms shown here it's normally fairly easy, and in any case the complexity of sorting algorithms is fairly extensively studied and so the complexity of just about any algorithm you might want to use is listed in "appropriate" books:-) > OK. I think my problem lies in the fact that I have never understood > what 0(n-squared) and so on actually means. O(n^x) means that the time taken to get the answer is proportional to the x-th power of the size of the input. some algorithms are constant time: returning the first element of a list for example. Although if badly implemented (so it reads the whole list) then it is linear time (O(n)) as you have to make one pass through the list. Finding the maximum element is linear, you _have_ to make a pass through all elements of the list to find the largest, so unlike the first case there isn't an improvement you can make here. This shows that there are two rather different things the complexity of the algorithm used in the implementation and the theoretical "best" complexity possible to solve the problem. Also note that this is essentially "limiting behaviour" for large data sets. If you have a simple algorithm that is O(n^2) (so takes four times as long given twice as much data) this might be faster in practice than an O(n log n) algotithm if the "faster" algorithm requires setting up complicated data structures that for simple cases take longer to set up than to use. so finding the maximum element is at best O(n) You make one pass through the list, each time carrying the largest-so-far with you until you get to the end. That's what the recursive walk through the node set does. Sorting the entire list and then taking the last element depends on the complexity of the sorting algorithm used but it must be slower than linear and should be not as slow as n-squared). This might end up being quicker if the system has some optimisations for xsl:sort that it would lose in the "by hand" recursion. But in the end the better complexity will win. n n log n n * n 1 0 1 2 2 4 10 10 100 100 200 10000 1000 300 1000000 10000 4000 100000000 so, if you are using an n-squared algorithm when you could be using an order n one, for lists of 10 or so the "theoretical" cost is that it should be 10 times slower, but with bigger setup costs and trick compiler optimisations it might end up being faster. But if your lists are 10000 long then the basic cost of the algorithm is that it is going to take you 10000 times longer, and chances are startup costs are going to be insignificant. The reason why in[not(parent::TIME/in > .)] is O(n^2) is basically for each node in the list you are comparing with every node in the list. so the number of comparisons goes up by the square of the list length. Not every algorithm is order a power of n. If for example you could figure out a way to factorise a number in time that was limited by a power (any power) of the number of digits in the number then you would be (a) famous and (b) getting visits from your country's spying authorities wanting you to help them crack public key coding systems which mostly work on the principle that multiplying numbers together has complexity proportional to the size of the numbers being multiplied, but reversing the process by known algorithms is exponential. But the theoretical best complexity isn't known: look in google for NP-complete or "travelling salesman" for more on that than you ever want to know:-) David _____________________________________________________________________ This message has been checked for all known viruses by Star Internet delivered through the MessageLabs Virus Control Centre. For further information visit http://www.star.net.uk/stats.asp XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
Current Thread |
---|
|
<- Previous | Index | Next -> |
---|---|---|
Re[6]: Aggregate, Jeni Tennison | Thread | RE: apply-templates Q, Heather Lindsay |
RE: today's stylesheet?, Mike Ball | Date | Re: Node set comnparison, Nick Browne |
Month |