Re: [xsl] how to estimate speed of a transformation

Subject: Re: [xsl] how to estimate speed of a transformation
From: David Tolpin <dvd@xxxxxxxxxxxxxx>
Date: Thu, 11 Dec 2003 16:50:09 +0400 (AMT)
> >
> > Is this advice true? If I need the same 'select' ten (or ten thousand)
> times in a program,
> > isn't it just simpler to write that select and let the implementation
> memoize it?
> 
> Yes it would be simpler, but what if the application isn't.

What if it is, but by binding 'select' results to global variables instead
of allowing the implementation to cache results when needed, you waste
memory? You cannot deallocate a variable in XSLT, can you? Thus if something
global is bound, it is forever.

> 
> > Its value will not change since the data is the same. Why should I
> manually
> > 'cache' values if the data model allows to do it consistently and
> automatically?
> >
> 
> I agree if applications would cache it automatically. I wouldn't have to do
> it by hand, but I'm don't write parser code and I have no idea how easy or
> difficult such a thing would be. However I do know, from other threads
> started by you actually, that many possible performance optimizations are
> not implemented by all parsers. Such is true, if I understand it correctly
> for tail recursion which is optimized by Saxon and jd.xslt but not by other
> parsers.

1) For C, the result of % with negative arguments is undefined.  Even if a parcticular 
implementation gives desirable result, one should not rely on this experience if
the program is intended for use with more than one compiler or more than one version
of the same compiler.

2) For Scheme, an implementation of tail recursion is required. An interpreter
that does not turn tail recursion into a loop does not conform to the Report. 

I am trying to find out what considerations  one can rely upon when designing a
transformation. I've found out that:

+ only direct (self-) tail recursion is properly turned into a loop in most implementations
  that support it at all, thus complex loops are problematic, but simple ones should work
  well with good implementations;
+ access by position to a nodeset exhibits constant time in advanced implementations (and linear in
  others);
+ results of expressions with predicates not referencing current() are memoized in many simple cases
  (but I cannot say what the sample case is).

Exploring the source code of an implementation of XSLT is a lot of fun. The authors are great engineers.
Could I please know what features I _can_ rely upon? I cannot rely on:

- writing repeating computations with iterators, since no mutable objects;
- writing repeating computations through recursion, since anything but the simplest case 
  is not supported anywhere;
- just specifying a predicate in a non-recursive way, since while some implementations
  will optimize it, others will compute it each time they meet it;
- subsetting a nodeset, since while in some implementations it is fast, in others it
  doubles memory for each removed node and takes considerable time;
- accessing a nodeset by position, since there are implementations with access time proportional
  to the position() in a nodeset.

XSLT is a great language. Its restrictions permit implementations which can execute fast
both 'declarative' (that is, with expressions describing relations, not algorithms -- such
expressions would take long to compute in Java, but can be computed economically in XSLT)
and 'functional' (that is, with mutliple calls of chaining computations) transformations.

On the other hand,  a basic, yet conforming, implementation (of which a number is currently widely used)
makes XSLT unconvenient without language extensions and fast processors. 

What do I have -- at least with good processors? What can I use in my stylesheets without
fear to turn constant into quadratic in a next version of the same processor?

David Tolpin




 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list


Current Thread