Re: [xsl] Cheaper to prepend or append an item to a sequence?

Subject: Re: [xsl] Cheaper to prepend or append an item to a sequence?
From: Michael Kay <mike@xxxxxxxxxxxx>
Date: Tue, 22 Feb 2011 14:20:42 +0000
On 22/02/2011 13:26, PQQP5QP;P0P2 P!P5P4P>P2 wrote:
ops... is sequences and items not presented as pointers and linked
items? arrays? and what about lazy evaluation in this case? not every
time when we work with sequence we need whole sequence and in most
cases we don`t even need it as numbered

Absolutely. A good implementation will not store an entire sequence in memory unless it has to, and will use lazy evaluation and early exit.


When a sequence does have to be stored in memory, there are various strategies one can use. Saxon generally uses arrays (contiguous blocks of storage) rather than linked lists - some people have noticed that as a result $x[N] has O(1) performance in Saxon, but O(N) performance in other implementations. Of course all such decisions are trade-offs; but I think that in XSLT, indexing into a sequence is a much more common operation than appending or prepending an item.

by the way - look like "except" keyword have same limitation - all
sequence need to be reconstructed instead of moving one of pointers
from one item to second

see this Schematron rule

     <pattern name="unique-id">
         <rule context="*[@id and not(ancestor-or-self::meta)]">
             <report test="@id = (//@id[not(ancestor::meta)] except @id)">
                 same @id
             </report>
         </rule>
     </pattern>

and look like 'except' is bottleneck here

In Saxon, the expression (//@id[not(ancestor::meta)] except @id) will be evaluated by scanning the elements of the document in document order (a very fast operation on the TinyTree), accessing the @id attribute of each one, testing to see whether it has an ancestor element called meta, and then rejecting it if it is the same node as @id. This should be a very fast operation. It doesn't involve building a sequence in memory (and in fact, it doesn't seem in any way related to the question on this thread). The most inefficient part of this is the test [not(ancestor::meta)] - it would be nice if Saxon were smart enough to remember while scanning the elements whether it has passed more meta start-tags than end-tags - but sadly, it isn't.


Michael Kay
Saxonica

Current Thread