RE: [xsl] Speeding up processing (with sablotron or saxon)

Subject: RE: [xsl] Speeding up processing (with sablotron or saxon)
From: "Michael Kay" <mhk@xxxxxxxxx>
Date: Mon, 12 Jul 2004 19:09:19 +0100

The problem is this:

<xsl:for-each
select=".//resource[not(@swgcraft_id=preceding::*/@swgcraft_id)]">

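which, for every resource it tests, scans all the preceding elements in the
file. The further into the document a server lies, the more elements each
test has to visit, which is why the later servers take so much longer.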

This looks like a construct designed to eliminate duplicates, and it is
therefore amenable to Muenchian grouping using keys. This should improve the
performance dramatically.
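
As a rough illustration, the separate-file stylesheet could be reworked along the following lines. This is only a sketch under some assumptions: the key name "resource-by-id" is made up for the example, every resource is assumed to carry a swgcraft_id attribute, and only resource elements are assumed to carry that attribute. Because the key indexes the whole document, a resource is kept only if it is the first one in the file with its id, which mirrors the effect of the preceding::* test.

```xml
<?xml version="1.0"?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="xml" indent="yes" encoding="utf-8"/>

  <!-- Index every resource in the document by its swgcraft_id.
       The key name is hypothetical, chosen for this sketch. -->
  <xsl:key name="resource-by-id" match="resource" use="@swgcraft_id"/>

  <xsl:template match="server"/>

  <xsl:template match="server[@name='Ahazi']">
    <resources>
      <!-- Keep a resource only if it is the first node the key returns
           for its id, i.e. the first occurrence in document order. -->
      <xsl:for-each select=".//resource[generate-id() =
                            generate-id(key('resource-by-id', @swgcraft_id)[1])]">
        <xsl:sort select="."/>
        <resource swgcraft_id="{@swgcraft_id}">
          <xsl:copy-of select="name"/>
          <xsl:copy-of select="type"/>
          <!-- the remaining xsl:copy-of instructions are unchanged -->
        </resource>
      </xsl:for-each>
    </resources>
  </xsl:template>

</xsl:transform>
```

If duplicates should instead be eliminated per server, one option would be to build the key value from the server name and the id together, for example with concat().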

Alternatively, in XSLT 2.0, use <xsl:for-each-group> with a group-by
attribute.
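
A minimal sketch of the XSLT 2.0 form follows, again under assumptions. It groups only within the matched server, so unlike the preceding::* test it does not drop ids already seen under earlier servers. The space separator in planets_on is a guess, since the original separator is not visible in the posted stylesheet.

```xml
<?xml version="1.0"?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:output method="xml" indent="yes" encoding="utf-8"/>

  <xsl:template match="server"/>

  <xsl:template match="server[@name='Ahazi']">
    <resources>
      <!-- One iteration per distinct swgcraft_id within this server. -->
      <xsl:for-each-group select=".//resource" group-by="@swgcraft_id">
        <!-- The context item is the first resource of the current group. -->
        <xsl:sort select="."/>
        <resource swgcraft_id="{current-grouping-key()}">
          <xsl:copy-of select="name"/>
          <xsl:copy-of select="type"/>
          <!-- the remaining xsl:copy-of instructions are unchanged -->
          <planets_on>
            <!-- current-group() holds every resource sharing this id, so no
                 second search of the server is needed. The separator value
                 is an assumption. -->
            <xsl:value-of select="current-group()/../../@name" separator=" "/>
          </planets_on>
        </resource>
      </xsl:for-each-group>
    </resources>
  </xsl:template>

</xsl:transform>
```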

Michael Kay 

> -----Original Message-----
> From: TDarksword [mailto:tdarksword@xxxxxxxxxxxx] 
> Sent: 12 July 2004 18:34
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: [xsl] Speeding up processing (with sablotron or saxon)
> 
> OK, I have a piece of XSLT that processes a large XML file into smaller
> chunks. The problem I have is that the deeper down into the XML file I am
> processing, the longer it takes. Is this just due to the way XSLT processors
> work, or can I tweak my XSL file so it processes faster?
> 
> I got the same effect when I used to process the file in one pass using
> Saxon's xsl:result-document as I do now when processing with separate XSL
> files with either Saxon or Sablotron.
> 
> 
> This is the separate-file XSL file (change server[@name='Ahazi'] as
> needed):
> <?xml version="1.0"?>
> <xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
> version="1.0">
> <xsl:output method="xml" indent='yes' encoding="utf-8"/>
> 
> <xsl:template match="server" />
> <xsl:template match="server[@name='Ahazi']">
> <resources>
> <xsl:for-each
> select=".//resource[not(@swgcraft_id=preceding::*/@swgcraft_id)]">
> <xsl:sort select="."/>
> <resource>
> <xsl:attribute name="swgcraft_id" >
> <xsl:value-of select="./@swgcraft_id"/>
> </xsl:attribute>
> <xsl:copy-of select="name"/>
> <xsl:copy-of select="type"/>
> <xsl:copy-of select="er"/>
> <xsl:copy-of select="cr"/>
> <xsl:copy-of select="cd"/>
> <xsl:copy-of select="dr"/>
> <xsl:copy-of select="fl"/>
> <xsl:copy-of select="hr"/>
> <xsl:copy-of select="ma"/>
> <xsl:copy-of select="oq"/>
> <xsl:copy-of select="sr"/>
> <xsl:copy-of select="ut"/>
> <xsl:copy-of select="pe"/>
> <xsl:copy-of select="verified"/>
> <xsl:copy-of select="available_timestamp"/>
> <planets_on>
> <xsl:variable name="resource" select="./@swgcraft_id" />
> <xsl:for-each select="../../..//resource[@swgcraft_id=$resource]">
> <xsl:value-of select="../../@name"/>
> <xsl:if test="position()!=last()">
> </xsl:if>
> </xsl:for-each>
> </planets_on>
> </resource>
> </xsl:for-each>
> </resources>
> </xsl:template>
> 
> </xsl:transform>
> 
> The old Saxon XSL file that had the same effect is:
> 
> <?xml version="1.0"?>
> <xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
> version="2.0">
> <xsl:output method="xml" indent='yes' encoding="utf-8"/>
> 
> <xsl:template match="server">
> <xsl:variable name="servername" select="substring(@name,1,4)"/>
> <xsl:variable name="filename"
> select="concat('file:///','d:/swg/crafterscorner/','currentresources_',$servername,'.xml')"/>
> <xsl:result-document href="{$filename}">
> <resources>
> <xsl:for-each
> select=".//resource[not(@swgcraft_id=preceding::*/@swgcraft_id)]">
> <xsl:sort select="."/>
> <resource>
> <xsl:attribute name="swgcraft_id" select="./@swgcraft_id"/>
> <xsl:copy-of select="name"/>
> <xsl:copy-of select="type"/>
> <xsl:copy-of select="er"/>
> <xsl:copy-of select="cr"/>
> <xsl:copy-of select="cd"/>
> <xsl:copy-of select="dr"/>
> <xsl:copy-of select="fl"/>
> <xsl:copy-of select="hr"/>
> <xsl:copy-of select="ma"/>
> <xsl:copy-of select="oq"/>
> <xsl:copy-of select="sr"/>
> <xsl:copy-of select="ut"/>
> <xsl:copy-of select="pe"/>
> <verified>
> <xsl:if test="substring(verified,1,1)='t'">
> <xsl:value-of select="'Y'"/>
> </xsl:if>
> <xsl:if test="substring(verified,1,1)='f'">
> <xsl:value-of select="'N'"/>
> </xsl:if>
> </verified>
> <xsl:copy-of select="available_timestamp"/>
> <planets_on>
> <xsl:variable name="resource" select="./@swgcraft_id" />
> <xsl:for-each select="../../..//resource[@swgcraft_id=$resource]">
> <xsl:value-of select="../../@name"/>
> <xsl:if test="position()!=last()">
> </xsl:if>
> </xsl:for-each>
> </planets_on>
> </resource>
> </xsl:for-each>
> </resources>
> </xsl:result-document>
> </xsl:template>
> 
> </xsl:transform>
> 
> The source XML file can be found at
> http://www.swgcraft.com/sendfile.php?file=currentresources.xml.gz
> 
> Processing of the servers towards the start of the XML file (e.g. Ahazi and
> Bloodfin) takes only a few seconds, but processing the ones near the end of
> the XML file (e.g. Chimera and Infinity) takes a couple of minutes.
> Is there any way of speeding up the processing of the servers near the end
> of the XML file?
