Re: [xsl] Optimizing XSLT iteration

Subject: Re: [xsl] Optimizing XSLT iteration
From: "Sujata Gohad" <sgohad@xxxxxxx>
Date: Sun, 7 Oct 2007 21:28:47 -0700
Hi Mike,

A few comments inline.

> Success: I've found the cause by looking at the Java CPU profile. The
> culprit is the calls on xsl:number at line 42 and 128. There are some
> circumstances in which Saxon optimizes xsl:number, but where it can't do so,
> it does essentially what the spec describes: each call on xsl:number has to
> count all the previous elements, which immediately gives you O(n^2)
> performance.

When I run XSLT profiler, it shows me that most of the time is being
spent in the <d> element creation, which corresponds to:
...
    <xsl:attribute name="d">
            <xsl:for-each select="locus[position()=1]">
                    <xsl:text>M </xsl:text>
                    <xsl:value-of select="@ellipse_x"/>
                    <xsl:text> </xsl:text>
                    <xsl:value-of select="@ellipse_y"/>
            </xsl:for-each>
            <xsl:for-each select="locus[position()!=1]">
                    <xsl:text> L</xsl:text>
                    <xsl:value-of select="@ellipse_x"/>
                    <xsl:text> </xsl:text>
                    <xsl:value-of select="@ellipse_y"/>
            </xsl:for-each>
    </xsl:attribute>
...

And there is no <xsl:number> tag being used here.

> You're using xsl:number to generate a unique id. I don't know what the
> constraints are on this id, but you can probably solve the problem using
> generate-id() or position(). If it absolutely must be the sequential number
> of the locus element within the entire file, consider using a prepass that
> numbers the elements (attaching the number as an extra attribute).
>

Probably I am using a wrong profiler and/or missing something obvious.

Please do let me know the tools you are using.

Many thanks,
- Sujata

>
> > -----Original Message-----
> > From: Michael Kay [mailto:mike@xxxxxxxxxxxx]
> > Sent: 08 October 2007 03:03
> > To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> > Subject: RE: [xsl] Optimizing XSLT iteration
> >
> > Thanks for sending your sample data (offlist).
> >
> > I've fixed various errors in your source file to make it well-formed.
> > Running the transformation 100 times in Saxon I get
> >
> > *** Average execution time over 100 runs: 26ms
> >
> > which seems entirely reasonable.
> >
> > It's dangerous to scale up from something quite so small, but
> > 38Kb in 26ms scales up to 76Mb in 52s if the performance is
> > linear, and I can't really see why it shouldn't be. The
> > processing in this stylesheet is a bit more complex than the
> > previously posted sample (includes string-to-number
> > conversion and arithmetic) but it looks linear.
> >
> > Michael Kay
> > http://www.saxonica.com
> >
> >
> > > -----Original Message-----
> > > From: Michael Kay [mailto:mike@xxxxxxxxxxxx]
> > > Sent: 08 October 2007 01:49
> > > To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> > > Subject: RE: [xsl] Optimizing XSLT iteration
> > >
> > > > I just tried Saxon parser after increasing the initial/max
> > > Java heap
> > > > space to 512M and it worked with even a 74MB file, but took
> > > about 28
> > > > minutes to finish.
> > >
> > > That does seem quite high, given that Saxon can do an identity
> > > transform on a file of that size in something close to 28
> > seconds. I
> > > wonder if there is some other problem in your code (are there parts
> > > you haven't shown us?). Is the performance linear with
> > document size?
> > >
> > > Michael Kay
> > > http://www.saxonica.com/

Current Thread