Subject: [xsl] order-by vs xsl:sort From: David Carlisle <davidc@xxxxxxxxx> Date: Thu, 1 Sep 2005 17:21:24 +0100 |
In the interests of keeping the archives of this list accurate... Back in April under a thread of this name that I started, Michael said, quoting me: >> Not really. It models it as a sequence of integers of length M*N and >> calculates the sort key (and data needed for the result) by >> calculating >> the equivalent indexes into the original sequences using mod and idiv. >> >> given the general 2 variable case: >> >> >> for $i in $is, $j in $js >> order by f($i,$j) >> return >> g($i,$j) >> >> for some functions f and g then I think you can always replace this by >> >> let $ci :=count($is) return >> let $cj :=count($js) return >> for $n in (0 to $ci * $cj) >> let $i :=$is[$n mod $ci)+1] >> let $j := $js[($n idiv $ci) +1] >> order by f($i, $j) >> return >> g($i,$j) >> >> and once you have just a single for and order by, converting >> that to xsl >> for-each and sort is just syntax. >> > >Nice result. Perhaps I can get rid of those horrible tuples in my >implementation! Unfortunately the premise of my statement is wrong. The example is the general case for a cartesian product, but FLWOR loops actually model a dependent product where the sequence over which the inner variable is iterating can depend on the outer variable (and in particular can have varying length). This means that you can't pre-calculate the dimensions, so can't use a simple div and mod formula to generate the tuple indexes from a single integer. All is not lost though (I think) but it gets a bit more expensive. the general two variable case is for $i in $is, $j in F($i) order by f($i,$j) return g($i,$j) However I believe this may be rewritten as let $one := ( for $i at $ip in $is, $j at $jp in F($i) return encode($ip,$jp) ) return for $z in $one let $indexes:=decode($z) let $ip:=$indexes[1] let $i:= $is[$ip] let $jp:=$indexes[2] let $j:= (F($i))[$jp] order by f($i,$j) return g($i,$j) where encode is anything that encodes a sequence of integers as a single item and decode is anything that gets the sequence of integers back. Originally I had $one being a sequence of elements with the decimal values of the integers as content (<z>1 2</z>,<z>1 3</z>,... and decode() being a call to tokenize() but that seems rather expensive to generate all those temporary elements in a sequence just used once. So currently I encode the list of integers as a string using codepoints-to-string and back using string-to-codepoints. This requires a bit of offsetting to avoid banned codepoints like 0, and (since I use a single character per integers) limits me to a million or so entries per sequence, but could easily use more characters per integer. Main disadvantage of all this is that the expresions F($i) above that generate the sequnces get evaluated multiple times and might be expensive, especially in the case that they don't actually depend on the outer variable. I wonder if I could rely on the XSLT optimising complier to spot such repeated expressions and take them out of the loop, or whether an xquery to xslt conversion should do this as part of the conversion step... (the above only uses xquery notation but the point of it is to express FLWOR as xsl:for-each: once you have it rewritten as a loop with only one for-variable, writing as xslt is trivial. David ________________________________________________________________________ This e-mail has been scanned for all viruses by Star. The service is powered by MessageLabs. For more information on a proactive anti-virus service working around the clock, around the globe, visit: http://www.star.net.uk ________________________________________________________________________
Current Thread |
---|
|
<- Previous | Index | Next -> |
---|---|---|
Re: [xsl] problem passing parameter, Paula Estrella | Thread | [xsl] Template/for-each/Firefox, Brett Zamir |
Re: [xsl] stylesheet templates for , Emmanouil Batsis | Date | Re: [xsl] stylesheet templates for , David Carlisle |
Month |