Re: [xsl] Increasing sequence ?

Subject: Re: [xsl] Increasing sequence ?
From: "Dimitre Novatchev dnovatchev@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Fri, 27 Mar 2015 17:04:15 -0000
> Dimitre's recursive approach is only worth doing if the processor has
> non-constant performance for an expression of the form sequence[$N] where $N
> is an integer

I disagree -- with the word "only".

This algorithm works perfectly well in ***streaming mode***, (not to
be confused with the official XSLT 3.0 "streaming", which only handles
streaming of an XML document)  --  for very large or infinite (lazily
generated) sequences, where the count() function simply isn't
applicable and will either hang indefinitely or produce an error.

In case of such sequences, the traditional DVC (Divide and Conquer)
algorithm is also not exactly applicable -- it can be modified to a
streaming-DVC, also known as "chunking". This simply takes a chunk of
N consecutive items and uses the result of their processing in
processing the next chunk of up-to N items.

Should I say that "chunking" is also tail-recursive?

So, any non-standard XSLT implementation of sequences as array held in
memory, plus inability to treat tail-recursion properly, makes such
implementation of little use in dealing with indefinitely long or
lazily generated sequences.


Cheers,
Dimitre



On Fri, Mar 27, 2015 at 4:05 AM, Michael Kay mike@xxxxxxxxxxxx
<xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
> What we're seeing here is that the best solution is radically influenced by
> the optimization strategies within the processor.
>
> Dimitre's recursive approach is only worth doing if the processor has
> non-constant performance for an expression of the form sequence[$N] where $N
> is an integer; that is, if sequences are implemented as linked lists rather
> than arrays. Saxon will generally use an adaptive implementation where the
> sequence is held as an array as soon as you start indexing into it, so the
> non-recursive solution will work just fine. Other systems may differ.
>
> Then, if you do use the recursive approach, you run into problems if the
> processor doesn't do tail-call optimization. And if that's the case, you
> need to consider a divide-and-conquer approach instead.
>
> Michael Kay
> Saxonica
> mike@xxxxxxxxxxxx
> +44 (0) 118 946 5893
>
>
>
>
> On 27 Mar 2015, at 09:36, Leo Studer leo.studer@xxxxxxxxxxx
> <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
> Dimitre
>
> thanks, this is amazing. With Saxon EE in Oxygen 16.1 I get stack overflow
> with 10000 ;-).
> Can you compare the time with this solution?
> declare namespace my = "my:my";
> declare function my:increasing2($seq as xs:double*)as xs:boolean
> {every $v in 1 to (count($seq)-1) satisfies ($seq[$v] lt $seq[$v+1])};
> let $v:=(1 to 1000000) return (my:increasing2($v))
>
> Cheers
> Leo
>
> On 27.03.2015, at 05:24, Dimitre Novatchev dnovatchev@xxxxxxxxx
> <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
> Hi Leo,
>
> I ran this with BaseX 7.8.2:
>
> declare namespace my = "my:my";
> declare function my:increasing($seq as xs:double*) as xs:boolean
> {empty($seq[2])
> or
>  $seq[1] lt $seq[2]  and  my:increasing(subsequence($seq, 2))
> };
> let $v:=(1 to 10000)
>  return my:increasing($v)
>
>
> And here is the result (do note this below: - marking as ***tail
> call***: my:increasing(fn:subsequence($seq_0, 2))  )
>
> Total Time: 3.74ms (for 100 000 - long sequence the time was 17.77ms,
> for 1 000 000 - long sequence the time was 207.56ms)
>
>
> XSL-List info and archive
> EasyUnsubscribe (by email)
>
>
> XSL-List info and archive
> EasyUnsubscribe (by email)

Current Thread