[xsl] Speed and memory problems when transforming links

Subject: [xsl] Speed and memory problems when transforming links
From: Andreas Voegele <voegelas@xxxxxxxxxxxxxxxxxxxxx>
Date: 06 Feb 2002 07:39:05 +0100

Hi,

I'd like to merge the contents of two linked XML files into one file,
but I have speed and memory problems with Saxon as well as Xalan.

The first XML file contains syllables and links to the second file,
which contains phones:

 <syllable_file>
  <syllable id="sllbl_0">
   <link href="phone.xml#phn_0" />
   <link href="phone.xml#phn_1" />
  </syllable>
  ...
 </syllable_file>

 <phone_file>
  <phone id="phn_0" />
  <phone id="phn_1" />
  ...
 </phone_file>

I use the following stylesheet to merge both files:

 <?xml version="1.0" ?>
 <!-- Merge the syllable and the phone file -->
 <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0">

  <xsl:output method="xml" indent="yes" />

  <xsl:template match="/">
   <xsl:element name="syllable_phone_file">
    <xsl:apply-templates select="/syllable_file" />
   </xsl:element>
  </xsl:template>

  <xsl:template match="syllable">
   <!-- Copy the syllable element and its attributes -->
   <xsl:copy>
    <xsl:copy-of select="@*" />
    <!-- Follow the links -->
    <xsl:for-each select="link">
     <xsl:for-each select="document(@href)/*">
      <!-- Copy the phone element and its attributes -->
      <xsl:copy>
       <xsl:copy-of select="@*" />
      </xsl:copy>
     </xsl:for-each>
    </xsl:for-each>
   </xsl:copy>
  </xsl:template>

 </xsl:stylesheet>

The result looks like this:

 <syllable_phone_file>
  <syllable id="sllbl_0">
   <phone id="phn_0"/>
   <phone id="phn_1"/>
  </syllable>
  ...
 </syllable_phone_file>

As long as there are only a few syllables, the transformation works.
But with 1000 syllables and 2000 phones, Saxon and Xalan both
require a lot of memory and the transformation becomes very slow.

In my opinion 1000 records in an XML file aren't that much.  Both XML
files require less than 150 KBytes on disk.  It's hard to believe that
transforming 150 KBytes of data requires more than 100 MB of RAM.

Is there a better way to handle links in stylesheets?

Is there anything I can do to reduce the memory usage and to speed up
the transformation?
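
One direction that might be worth trying, sketched here as an untested
assumption rather than a known fix: load phone.xml once into a variable
and look the phones up through xsl:key instead of calling document()
with a different fragment identifier for every link. The key name
"phone-by-id", the variable name "phones", the fixed file name
'phone.xml' and the use of substring-after() to extract the id from
href are all assumptions made for illustration. Whether this actually
lowers memory use in Saxon or Xalan would have to be measured.

```xml
<?xml version="1.0"?>
<!-- Sketch: merge syllables and phones with a single document() call -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 version="1.0">

 <xsl:output method="xml" indent="yes"/>

 <!-- Index phone elements by their id attribute (hypothetical key name) -->
 <xsl:key name="phone-by-id" match="phone" use="@id"/>

 <!-- Load phone.xml once, resolved relative to the input document.
      Assumes every link points into this one file. -->
 <xsl:variable name="phones" select="document('phone.xml', /)"/>

 <xsl:template match="/">
  <syllable_phone_file>
   <xsl:apply-templates select="syllable_file/syllable"/>
  </syllable_phone_file>
 </xsl:template>

 <xsl:template match="syllable">
  <!-- Copy the syllable element and its attributes -->
  <xsl:copy>
   <xsl:copy-of select="@*"/>
   <xsl:for-each select="link">
    <!-- Assumes the text after '#' in href is the phone id -->
    <xsl:variable name="id" select="substring-after(@href, '#')"/>
    <!-- key() searches the document of the context node,
         so switch the context to the phone document first -->
    <xsl:for-each select="$phones">
     <xsl:for-each select="key('phone-by-id', $id)">
      <!-- Copy the phone element and its attributes -->
      <xsl:copy>
       <xsl:copy-of select="@*"/>
      </xsl:copy>
     </xsl:for-each>
    </xsl:for-each>
   </xsl:for-each>
  </xsl:copy>
 </xsl:template>

</xsl:stylesheet>
```

For the sample input shown above, this should produce the same
syllable/phone nesting as the original stylesheet. Unlike the original,
it does not rely on the processor resolving the "#phn_0" fragment
identifier, which XSLT 1.0 leaves to the implementation.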

I've put a small archive that contains the stylesheet, a README and
all the other files required to test the stylesheet with Saxon and
Xalan at the following place:

http://www-stud.ims.uni-stuttgart.de/~voegelas/linktest.zip
