[xsl] Recursively processing a file

Subject: [xsl] Recursively processing a file
From: "Rowan Sylvester-Bradley" <rowan@xxxxxxxxxxxxxxxxxxxxx>
Date: Thu, 10 Dec 2009 20:43:11 -0000

I'm trying to write a stylesheet that recursively processes an input file.
Let me explain the problem.

I want to create a file like this:

<root>
   <items>
      <item>
         <id>a</id>
         <type>lineitem</type>
         <name>Item A</name>
         <description>A nasty cheap one</description>
         <price>0.49</price>
      </item>
      <item>
         <id>a</id>
         <type>lineitem</type>
         <name>Item A</name>
         <description>This one's gold plated</description>
         <price>9.99</price>
      </item>
      <item>
         <id>b</id>
         <type>lineitem</type>
         <name>Item B</name>
         <description>A nasty cheap one</description>
         <price>0.49</price>
         <item>
            <id>b1</id>
            <type>delivery</type>
            <description>Second Class</description>
            <price>0.25</price>
         </item>
         <item>
            <id>b1</id>
            <type>delivery</type>
            <description>First Class</description>
            <price>0.50</price>
         </item>
         <item>
            <id>b1</id>
            <type>delivery</type>
            <description>Express</description>
            <price>1.00</price>
         </item>
      </item>
      <item>
         <id>b</id>
         <type>lineitem</type>
         <name>Item B</name>
         <description>This one's gold plated</description>
         <price>9.99</price>
         <item>
            <id>b1</id>
            <type>delivery</type>
            <description>Second Class</description>
            <price>0.25</price>
         </item>
         <item>
            <id>b1</id>
            <type>delivery</type>
            <description>First Class</description>
            <price>0.50</price>
         </item>
         <item>
            <id>b1</id>
            <type>delivery</type>
            <description>Express</description>
            <price>1.00</price>
         </item>
      </item>
      <item>
         <id>c</id>
         <type>lineitem</type>
         <name>Item C</name>
         <description>This one's even nicer</description>
         <price>4.99</price>
         <item>
            <id>c1</id>
            <type>delivery</type>
            <description>Second Class</description>
            <price>0.25</price>
         </item>
         <item>
            <id>c1</id>
            <type>delivery</type>
            <description>First Class</description>
            <price>0.50</price>
         </item>
         <item>
            <id>c1</id>
            <type>delivery</type>
            <description>Express</description>
            <price>1.00</price>
         </item>
      </item>
   </items>
</root>

I want to generate it from a file like this:

<root>
   <items>
      <item>
         <id>a</id>
         <type>lineitem</type>
         <expand>Table1</expand>
         <name>Item A</name>
      </item>
      <item>
         <id>b</id>
         <type>lineitem</type>
         <expand>Table1</expand>
         <name>Item B</name>
         <item>
            <id>b1</id>
            <type>delivery</type>
            <expand>Table2</expand>
         </item>
      </item>
      <item>
         <id>c</id>
         <type>lineitem</type>
         <name>Item C</name>
         <description>This one's even nicer</description>
         <price>4.99</price>
         <item>
            <id>c1</id>
            <type>delivery</type>
            <expand>Table2</expand>
         </item>
      </item>
   </items>
   <tables>
      <table>
         <id>Table1</id>
         <row>
            <description>A nasty cheap one</description>
            <price>0.49</price>
         </row>
         <row>
            <description>This one's gold plated</description>
            <price>9.99</price>
         </row>
      </table>
      <table>
         <id>Table2</id>
         <row>
            <description>Second Class</description>
            <price>0.25</price>
         </row>
         <row>
            <description>First Class</description>
            <price>0.50</price>
         </row>
         <row>
            <description>Express</description>
            <price>1.00</price>
         </row>
      </table>
   </tables>
</root>

The transform needs to do something like this:
1. Find all <item> elements in the source file.
2. If the item has an <expand> child element, copy the item to the output
file once for each <row> in the <table> whose <id> matches the value of the
<expand> element, adding the child elements of that <row> to each copy.
3. <item>s that don't have an <expand> element are just copied to the output
file.
4. Keep doing this recursively until everything has been expanded. Since
<item>s can be nested to any depth, in principle an arbitrary number of
nested expansions could be necessary.
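
For illustration, here is a rough sketch of how these steps might map onto
templates in a single pass. It has not been run against real data and makes
several assumptions. It targets XSLT 1.0. The key name "table-by-id" is made
up. It assumes <expand> is dropped from each expanded copy, as it is for item
a and the nested items in the desired output. It also assumes rows contain
only plain elements, with no further <expand> or <item> inside them. The
recursion comes from calling xsl:apply-templates on the nested <item>s rather
than copying them.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Hypothetical key name: find a <table> by the value of its <id> child -->
  <xsl:key name="table-by-id" match="table" use="id"/>

  <!-- Identity template: copies anything no other template matches,
       descending into children so nested <item>s are still visited -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- The lookup tables do not appear in the desired output -->
  <xsl:template match="tables"/>

  <!-- An <item> with an <expand> child: emit one copy per matching <row> -->
  <xsl:template match="item[expand]">
    <xsl:variable name="item" select="."/>
    <xsl:for-each select="key('table-by-id', expand)/row">
      <item>
        <xsl:copy-of select="$item/@*"/>
        <!-- the item's own fields, minus <expand> and nested items -->
        <xsl:apply-templates
            select="$item/*[not(self::expand or self::item)]"/>
        <!-- the fields contributed by the current <row> -->
        <xsl:copy-of select="*"/>
        <!-- recurse: nested items are expanded by this same template -->
        <xsl:apply-templates select="$item/item"/>
      </item>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

With a processor such as xsltproc this would be run as
"xsltproc expand.xsl input.xml", where both file names are placeholders. Note
that an item whose <expand> names a table that does not exist would vanish
from the output, since there are no rows to iterate over. An XSLT 1.0
processor also typically builds the whole input tree in memory, so the sketch
does nothing special for very large inputs.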

This is a very simplified example of what I'm trying to do, but I think it
encapsulates the essence of the problem.

I've got a stylesheet that will do the expansion once, but I don't know how
to make it work recursively. So with the example shown, my stylesheet would
expand the items with id=a and id=b and id=c1, but it would not expand the
inner item with id=b1.

I think with a bit of refactoring I could make it possible to run the
transform multiple times, feeding the output of one transform into the input
of the next until there are no more changes, but ideally I'd like to have
one transform handle the recursion internally, not having to be run multiple
times.

Is this possible? How?

Is there a better way of handling this problem?

Note that these files can be 100 MB or more, so I need to be careful not to
run out of memory.

Many thanks - Rowan
