Re: [xsl] performance of very very bad wtf-quality xslt

Subject: Re: [xsl] performance of very very bad wtf-quality xslt
From: bryan rasmussen <rasmussen.bryan@xxxxxxxxx>
Date: Fri, 30 Sep 2005 13:37:57 +0200
it's not really up to me to maintain it, what I'm discussing is the
prospect that the governmental organization which I am employed at
will publish it as an example of a way to handle this problem, if
someone else gets the xslt and uses it on our recommendation I would
consider it problematic. I've put in some suggestions for improving
it, hopefully they will be taken under consideration before it's
published.

On 9/30/05, Michael Kay <mike@xxxxxxxxxxxx> wrote:
> This is a tricky situation. I'm a great believer in rewriting bad code if
> you become responsible for its maintenance. Apart from anything else, it's
a
> great way of learning about the code. But it can be hard to persuade a
> sceptical project manager that it's a useful thing to do. One reason PMs
are
> sceptical is that no programmer ever likes the code written by another,
they
> always want to rewrite it (the NIH syndrome); and doing so takes time away
> from other activities, and introduces bugs, especially if there's no
> regression test suite.
>
> Sometimes it's best to do the rewrite incrementally without asking for
> permission, as part of other changes you are asked to make to the code. In
> general, though, doing things surreptitiously is not very professional: you
> should convice the PM of your case.
>
> If you can justify the rewrite on performance grounds then by all means do
> so: but only if you have concrete evidence that the code is causing
> performance problems. It's not enough to know that you could improve the
> performance, you also have to know that the performance needs to be
> improved. From what you've shown us, however, the big problem with this
code
> is that it's a maintenance nightmare, not that it performs badly, and you
> should try to justify the rewrite on those grounds.
>
> Michael Kay
> http://www.saxonica.com/
>
> > -----Original Message-----
> > From: bryan rasmussen [mailto:rasmussen.bryan@xxxxxxxxx]
> > Sent: 30 September 2005 10:51
> > To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> > Subject: [xsl] performance of very very bad wtf-quality xslt
> >
> > I have access to an xslt that does an xml-to-xml transformation. I did
> > not write this xslt. It is the second worse xslt I have ever seen. The
> > worst xslt I have ever seen was written by the same company and had a
> > strategy of the following - match the root node, run through 6 if
> > statements to check what namespace the document element is in, output
> > one of two document elements dependent on these input namespaces,
> > within the if statements do 25 for-each statements equivalent to the
> > 25 allowed children of the document element, on each element matched
> > by a for-each statement do something like 7 for-each statements for
> > the 7 allowed attributes, in a couple of these attributes that you
> > want to compare with another attribute save the value of the attribute
> > in a variable, go up to the parent do another 7 for-each statements
> > through the attributes to find the one you want to match, do your
> > match, after which we are in the elements, match elements by for-each
> > statements, rinse, wash repeat, it is for-each statements all the way
> > down!
> >
> >
> > now I have not tested this absolute worst quality xslt on anything,
> > according to the people who wrote it they performed tests and it
> > worked fine, but my life is too short and it would probably be shorter
> > if I had to deal too much with that.
> >
> > The problem is that my employers decided to put it out on a web site
> > as a wonderful example of translating between these formats. Well now
> > the second worst xslt ever has come into my possession. It is somewhat
> > better. I doubt that there is any nesting of for-each any lower than
> > six levels. and there are some templates that have been defined. This
> > one is to do the transformation back from format 2 to format 1.
> >
> > here is an example fragment of this xslt, somewhat fixed to keep the
> > knowledge as to formats and purposes out of the discussion (it might
> > be politically inexpedient to discuss this with full knowledge on the
> > net):
> >
> >
> >
> > <xsl:template match="/n1:doc">
> >               <doc>
> >                       <xsl:attribute
> > name="xsi:schemaLocation">http://blah.com
> > http://blah.com/blah.xsd</xsl:attribute>
> >                       <xsl:variable name="doc" select="."/>
> >                       <xsl:for-each select="n1:ID">
> >                               <n2:ID>
> >                                       <xsl:value-of select="."/>
> >                               </n2:ID>
> >                       </xsl:for-each>
> >                       <xsl:for-each select="n1:Date">
> >                               <n2:Date>
> >                                       <xsl:value-of select="."/>
> >                               </n2:IssueDate>
> >                       </xsl:for-each>
> >                       <n2:Code>
> >                               <xsl:value-of select="$CodeValue"/>
> >                       </n2:Code>
> >                       <xsl:variable name="itexists">
> >                               <xsl:variable name="results">
> >                                       <xsl:for-each
> > select="n1:Currency">
> >                                               <xsl:value-of
> > select="true()"/>
> >                                       </xsl:for-each>
> >                               </xsl:variable>
> >
> > there is not as much for-each going on, but trust me these variables
> > they've defined. each one is about 7 lines of code too long.
> >
> > Anyway, they say they've tested as well.
> >
> > That's great. One problem is we don't have a good test set of
> > documents, some possible documents could be very big, I am worried
> > about performance problems applying to the following areas:
> >
> > 1. for-each as a whole
> > 2. nested for-each
> > 3. for-each over attributes, for example if one just wants to
> > transform a commonly occuring set of attributes I can't see how the
> > AST that comes out of an xslt could be optimized for doing that with
> > for-eaches, whereas it would be relatively easy with transforms.
> > 4. performance hits on various processors for if statements and
> > choose-when statements.
> > 5.Any other performance parameters that you can think of in this
> > context (one obvious one is that these stylesheets are approximately
> > 8-10 times larger than they need to be.)
> >
> > Thanks
> > Bryan Rasmussen

Current Thread