Re: [xsl] improving performance in creating ids

Subject: Re: [xsl] improving performance in creating ids
From: "Pieter Lamers pieter.lamers@xxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Wed, 24 Apr 2019 14:22:41 -0000
Hi all,

In the end I found the solution for my original numbering plan in this xsl:number expression:

<xsl:number level="any" count="*[. &gt;&gt; $ancestor-with-id][@rid]"/>

The '>>' operator performs well enough (total processing time for the test book is now 5 seconds) and was kindly brought to my attention by Erik Siegel. Thanks for all your help.
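
For readers who want to see the expression in context, here is a minimal sketch of how it might sit inside a template. It is an illustration under assumptions, not the stylesheet actually used: the match pattern, the way $ancestor-with-id is bound, and the choice to write the result into an id attribute are all hypothetical.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Copy everything that is not handled explicitly. -->
  <xsl:mode on-no-match="shallow-copy"/>

  <!-- Assumption: the elements to be labelled carry @rid and have no @id yet. -->
  <xsl:template match="*[@rid]">
    <!-- Assumption: the numbering scope is the nearest ancestor with an @id. -->
    <xsl:variable name="ancestor-with-id" select="ancestor::*[@id][1]"/>
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:attribute name="id">
        <xsl:value-of select="$ancestor-with-id/@id"/>
        <xsl:text>-</xsl:text>
        <!-- Counts the @rid elements that come after the scope element in
             document order, up to and including the current one. -->
        <xsl:number level="any"
                    count="*[. &gt;&gt; $ancestor-with-id][@rid]"/>
      </xsl:attribute>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

If there is no ancestor with an @id, the comparison against the empty sequence is never true, so nothing is counted and xsl:number emits no digits; a real stylesheet would probably want to handle that case separately.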


On 24/04/2019 07:46, Pieter Lamers pieter.lamers@xxxxxxxxxxxx wrote:
Hi Wendell,

I had not seen your subsequent replies before I signed off last night. Your solution below involves a count, which brings back my original performance problem. I think I will change my requirement for "locally" numbered ids somewhat so I can profit most from xsl:number. Still, it is sad that 'from' cannot serve my purpose (or so it seems).

Hi Liam,

You are probably right that indexing + keys should work in the XQuery solution. I'd have to dive a little further into that area before I can put it to use; my initial efforts made no difference.

Thanks and all the best,

On 23/04/2019 23:47, Wendell Piez wapiez@xxxxxxxxxxxxxxx wrote:
Okay this is my next shot --

<xsl:value-of select="ancestor::*[exists(@id)][1]/@id || '-' || local-name() ||
  count( key('elems-by-name',local-name(),ancestor::*[exists(@id)][1])[current() &gt;&gt; .] ) + 1"/>

but after having done that I'd probably go back to xsl:number.

Partly since it's probably as fast, but mainly because declarative syntax rules.

(Note: still untested. Use at your own risk!)

Cheers, Wendell
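
The expression above presupposes a key declaration that is not shown in the message. A minimal sketch of how the pieces might fit together follows. The match pattern and the $scope variable are hypothetical, and the predicate is read here as a document-order test (the current element follows the candidate), which is an assumption about the intent. Like the original, it is untested against real data.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Index every element by its local name. -->
  <xsl:key name="elems-by-name" match="*" use="local-name()"/>

  <xsl:mode on-no-match="shallow-copy"/>

  <!-- Assumption: label every element that has no @id but sits inside one that does. -->
  <xsl:template match="*[not(@id)][ancestor::*[@id]]">
    <xsl:variable name="scope" select="ancestor::*[@id][1]"/>
    <xsl:copy>
      <!-- The third argument of key() restricts the lookup to the subtree
           rooted at $scope; the predicate keeps only same-named elements
           that come before the current one, and + 1 counts the current one. -->
      <xsl:attribute name="id"
          select="$scope/@id || '-' || local-name() ||
                  count(key('elems-by-name', local-name(), $scope)
                        [current() &gt;&gt; .]) + 1"/>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Because '||' binds less tightly than '+', the count is incremented before it is concatenated. Note that the filter is still evaluated once per labelled element, so this may well show the same cost as a plain count() on large documents.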

On Tue, Apr 23, 2019 at 5:40 PM Wendell Piez wapiez@xxxxxxxxxxxxxxx <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
Oops, hit button too soon -- you'll see the error there.

I leave scoping the correct count as an exercise, but it's in there
somewhere! :-)

Cheers, Wendell

On Tue, Apr 23, 2019 at 5:39 PM Wendell Piez <wapiez@xxxxxxxxxxxxxxx> wrote:
Hi again,

Also note if we had a key we would need no variable --

<xsl:value-of select="local-name() || '-'"/>
<xsl:number level="any" from="*[@id]" count="key('elem-by-name',local-name())"/>

which suggests we could also use the third argument of key() ...

<xsl:value-of select="local-name() || '-' ||

still not tested -- but ought to work, syntax errors aside --

Cheers, Wendell

On Tue, Apr 23, 2019 at 5:31 PM Wendell Piez <wapiez@xxxxxxxxxxxxxxx> wrote:
Hey Pieter,

If performance were the issue, I might try factoring out the ID
labeling into a completely separate pass, in order (for example) to
implement it as a sibling traversal, passing parameters forward to
increment the ID values. (If your numbering is fancy, for example
scoping the increment to the element type as well as the ancestor, you
might have to pass a map forward.) I think that ought to be pretty
fast, plus it separates this logic from the other logic of the XSLT.
It's essentially like treating the XSLT engine like an overpowered SAX
parser. (Not that I would know how to make one of those.)

But this is only if xsl:number wasn't doing it, after I tried
something like what Martin H shows with plain old templates.

<xsl:variable name="ilk" select="local-name()"/>
<xsl:value-of select="$ilk || '-'"/>
<xsl:number level="any" from="*[@id]" count="*[local-name() eq $ilk]"/>

-- untested --

Cheers, Wendell

On Tue, Apr 23, 2019 at 10:57 AM Martin Honnen martin.honnen@xxxxxx
<xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
On 23.04.2019 16:28, Pieter Lamers pieter.lamers@xxxxxxxxxxxx wrote:

Thanks for your quick reply. The node identity comparison helped quite a
bit, although I am still around a minute for a full book of ids. I am
not sure how xsl:number would help here, and what kind of performance
win it would give over count(). I tried something with a nested
transformation, but what should I feed it?

    <xsl:number select="*[last()]"/>
works (given a set of preceding nodes) but it is slightly slower than a
count() in the xquery. Maybe I should be using xsl:number differently?

It is difficult for me to suggest that without knowing the XML input
structure and whether you want to generate that id based on a count or
numbering only for certain nodes or some particular element type. In
general if I wanted to delegate counting to xsl:number similar to your
function I would define a template in a mode for that e.g.

    <xsl:template match="*" mode="number">
        <xsl:number level="any" from="*[@id]"/>
    </xsl:template>

and then, where you need that number, you would use e.g.

    <xsl:apply-templates select="." mode="number"/>

Both the template and the select of the apply-templates can of
course be adapted to more particular needs.

As for being more efficient than using count, that then depends on the
implementation but I would think there is some optimization to be
expected in an XSLT processor for xsl:number.
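
As a rough illustration of how the two pieces above could be combined when building an id, here is a minimal sketch. The element name para and the id layout are hypothetical, chosen only for the example.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:mode on-no-match="shallow-copy"/>

  <!-- Counting is delegated to xsl:number in a mode of its own. -->
  <xsl:template match="*" mode="number">
    <xsl:number level="any" from="*[@id]"/>
  </xsl:template>

  <!-- Hypothetical target: para elements that have no @id yet. -->
  <xsl:template match="para[not(@id)]">
    <xsl:copy>
      <xsl:attribute name="id">
        <xsl:value-of select="ancestor::*[@id][1]/@id"/>
        <xsl:text>-</xsl:text>
        <xsl:apply-templates select="." mode="number"/>
      </xsl:attribute>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

With no count attribute, xsl:number counts nodes of the same kind and name as the current one. With level="any", the from pattern restarts the count at the nearest element with an @id that is an ancestor of, or precedes, the current node, so an intervening sibling or nested element with an @id also resets the numbering. That may or may not be what a given id scheme wants.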

...Wendell Piez... ...wendell -at- nist -dot- gov...

Pieter Lamers
John Benjamins Publishing Company
Postal Address: P.O. Box 36224, 1020 ME AMSTERDAM, The Netherlands
Visiting Address: Klaprozenweg 75G, 1033 NN AMSTERDAM, The Netherlands
Warehouse: Kelvinstraat 11-13, 1446 TK PURMEREND, The Netherlands
tel: +31 20 630 4747