Subject: Re: [xsl] XPath expression to check that there are no intervening elements? From: "Michael Kay mike@xxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> Date: Wed, 20 Jul 2016 15:36:29 -0000 |
> On 20 Jul 2016, at 09:36, Michael Kay mike@xxxxxxxxxxxx <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote: > > I did some timings: > > > Kay 3 - 32.18ms (ouch!) > > doc ! (let $fsm := map{ > 0: function($x) {if ($x[self::B]) then 1 else 0}, > 1: function($x) {if ($x[self::B]) then 1 else 2}, > 2: function($x) {if ($x[self::B]) then 3 else 2}, > 3: function($x) {3} > } return > (fold-left(*, 0, function($state, $node){$fsm($state)($node)}) ne 3)) > I decided to do some analysis of this disappointing result. It turns out that dynamic function calls are quite expensive, and this is almost entirely due to the very dynamic nature of the type checking that is done (so-called function coercion). Running the above in a slightly different test environment I got a baseline time of 41ms. Java profiling showed an inordinate amount of time spent computing hashcodes. It turned out that during the dynamic call of the various "function($x)" functions, Saxon was checking that the supplied item type element(C) was a subtype of the required type item()*, which it does by first checking this pair of types in a cache of known type relationships, and the lookup in the cache involved an expensive hashcode computation. Avoiding the lookup by applying the knowledge that everything is a subtype of item()* reduced the total time to 29ms. I achieved a further reduction to 22ms by replacing the dynamic function call $fsm($state) with a static call map:get($fsm, $state). I then got this down to 13ms by avoiding the inline functions for computing the state transition, changing the query to declare namespace map='http://www.w3.org/2005/xpath-functions/map'; doc ! (let $fsm := map{ 0: (1,0), 1: (1,2), 2: (3,2), 3: (3,3)} return ( fold-left(*, 0, function($state, $node){ map:get($fsm,$state)[if ($node[self::B]) then 1 else 2] }) ne 3)) Even though there is now only one dynamic function call per element in the input sequence (and this is unavoidable) the cost still seems to be dominated by dynamic function calling. Declaring the types of the arguments ($state, $node) made this worse, because it increases the amount of dynamic type checking that needs to be done. This is quite different from the usual scenario where declaring types enables the checking to be done statically. There is clearly a lot of scope for optimization in this area. It's actually the first time I've looked seriously at the performance of higher order functions in a real query. Michael Kay Saxonica
Current Thread |
---|
|
<- Previous | Index | Next -> |
---|---|---|
Re: [xsl] XPath expression to check, Michael Kay mike@xxx | Thread | [xsl] Article: Debugging complex XS, Andriy Zakharchuk az |
Re: [xsl] Saxon-CE -- Passing Data , Michael Kay mike@xxx | Date | [xsl] Article: Debugging complex XS, Andriy Zakharchuk az |
Month |