Subject: Re: [xsl] Speeding up processing (with sablotron or saxon)
From: "TDarksword" <tdarksword@xxxxxxxxxxxx>
Date: Tue, 13 Jul 2004 15:57:44 +0100
----- Original Message -----
From: "Wendell Piez" <wapiez@xxxxxxxxxxxxxxxx>
To: <xsl-list@xxxxxxxxxxxxxxxxxxxxxx>
Sent: Tuesday, July 13, 2004 12:03 AM
Subject: Re: [xsl] Speeding up processing (with sablotron or saxon)

> Hi,
>
> At 01:33 PM 7/12/2004, you wrote:
> >ok I have a piece of XSLT that processes a large XML file into smaller
> >chunks. The problem I have is that the deeper down into the XML file I am
> >processing the longer it takes. Is this just due to the way XSLT parsers
> >work or can I tweak my XSL file so it processes faster?
> >
> >I get the same effect when I used to process the file as one pass using
> >Saxon Result:document as I do processing as separate XSL files with either
> >Saxon or Sablotron.
> >
> >
> >This is the separate file XSL file:- (Change the server[@name='Ahazi'] as
> >needed)
> ><?xml version="1.0"?>
> ><xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
> >version="1.0">
> ><xsl:output method="xml" indent='yes' encoding="utf-8"/>
> >
> ><xsl:template match="server" />
> ><xsl:template match="server[@name='Ahazi']">
> ><resources>
> ><xsl:for-each
> >select=".//resource[not(@swgcraft_id=preceding::*/@swgcraft_id)]">
>
> ... this for-each is expensive. You are traversing the entire document
> looking for 'resource' elements; each one you find is examined by looking
> at all its preceding elements and comparing their @swgcraft_id attributes.
> When you have lots of elements, lots and lots of them are compared. (n^2
> performance.)
>
> Since this happens every time the template is matched (which could itself
> be lots of times), it adds up -- especially for the later nodes in your set
> (as you noticed).
>
> An easy tweak to improve performance would be to use keys to de-duplicate
> instead of doing it by hand on the preceding:: axis.
>
> So:
>
> <xsl:key name="resource-by-id" match="resource" use="@swgcraft_id"/>
>
> <xsl:variable name="resources" select="//resource"/>
> (binding //resource to a variable $resources so we don't have to retrieve it
> every single time)
>
> then you can deduplicate in another variable declaration:
>
> <xsl:variable name="unique-resources"
>    select="$resources[count(.|key('resource-by-id',@swgcraft_id)[1])
> = 1]"/>
>
> In English: $unique-resources is the collection of all resources which,
> when counted along with the first resource with the same swgcraft_id as
> themselves, amount to a single node (which is true only of the first one
> with each swgcraft_id).
>
> This ought to help quite a bit.
>
> Cheers,
> Wendell
>

So I'd replace this:

<xsl:for-each
select=".//resource[not(@swgcraft_id=preceding::*/@swgcraft_id)]">

with

<xsl:key name="resource-by-id" match="resource" use="@swgcraft_id"/>
<xsl:variable name="resources" select="//resource"/>
<xsl:variable name="unique-resources"
select="$resources[count(.|key('resource-by-id',@swgcraft_id)[1]) = 1]"/>

but I guess I still need some form of for-each statement too?

TIA
Tony
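
On the for-each question, here is a minimal sketch of how the key could be
wired into the stylesheet quoted above. It is only an illustration under a
few assumptions: the body of the original for-each was never shown in the
thread, so the xsl:copy-of is a placeholder; only 'resource' elements are
assumed to carry @swgcraft_id; and de-duplication stays document-wide, as it
was with preceding::. Note that xsl:key has to be a top-level element, so it
cannot simply take the place of the for-each inside the template.

```xml
<?xml version="1.0"?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="xml" indent="yes" encoding="utf-8"/>

  <!-- Top-level declaration: indexes every resource by its swgcraft_id -->
  <xsl:key name="resource-by-id" match="resource" use="@swgcraft_id"/>

  <!-- Suppress all servers except the one matched below -->
  <xsl:template match="server"/>

  <xsl:template match="server[@name='Ahazi']">
    <resources>
      <!-- Keep a resource only when it is the first node in the document
           that the key returns for its swgcraft_id -->
      <xsl:for-each
          select=".//resource[count(. | key('resource-by-id', @swgcraft_id)[1]) = 1]">
        <!-- Placeholder body (assumption): put the original loop body here -->
        <xsl:copy-of select="."/>
      </xsl:for-each>
    </resources>
  </xsl:template>
</xsl:transform>
```

So the for-each stays; only its select expression changes, and the predicate
does the same job as the $unique-resources variable. If de-duplication is
wanted per server rather than across the whole document, the key's use
expression would probably need to include the server name as well.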