Re: [xsl] Merging and sorting files from a list

Subject: Re: [xsl] Merging and sorting files from a list
From: "cking" <cking@xxxxxxxxxx>
Date: Tue, 24 Aug 2004 18:36:57 +0200
Herve,

I haven't seen any reply to your message, 
so let me throw in what I found...

Monday, August 23, 2004 4:06 PM, you wrote:
>
> Hi all,
> I've been trying to get this to work for around a week and I can't seem
> to find the solution.
> 
> I'm parsing a list of file (from list.xml) that have the same
> architecture and I want to sort and merge them.
> 
> list.xml:
> <?xml version="1.0" ?>
> <listoffile>
> <wave wavepath="1.xml" />
> <wave wavepath="2.xml" />
> <wave wavepath="3.xml" />
> <wave wavepath="4.xml" />
> </listoffile>
> 
[big snip]
> 
> The documents are not entirely merged.....
> 
> Thanks in advance if you manage to find the bug!

I did find a few bugs, but I also found that your stylesheet
only copies nodes from the input files, without really doing
anything to "merge" them (as you can see in the output you
are getting).

I went looking in the FAQ, there's a chapter on merging:
http://www.dpawson.co.uk/xsl/sect2/merge.html
but all the examples are about merging _two_ input files,
whereas you want to merge an arbitrary number of files...
so maybe the problem is more complicated than you would
think at first.

I'm far from an expert myself, so I didn't have much hope but
I wanted to give it a try - and yes! I think I found a solution.
It's not a generalized solution (only works with the organisation
and tag names of your input files) and it involves several steps,
but it does give an output like the one you wanted.

In short, the idea is to transform the input files in several passes.
The first (pass1.xsl) collects all the nodes from all the input files, 
sorts them and puts them into an intermediate output file. Then,
the second pass stylesheet (pass2.xsl) merges the nodes that share
the same key (@path for subpath, @t for time).
Because of the nested <subpath> elements I needed a third pass
(pass2 merges the component/subpath elements, then
pass3 merges component/subpath/subpath).

Here are the stylesheets:
(I posted all the files to my website so you can download them 
there; that will save you some copy & paste)
http://users.telenet.be/cking/webstuff/test/herve/

--- pass1.xsl ---

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

 <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

 <xsl:variable name="docs" select="document(/listoffile/wave/@wavepath)"/>

 <xsl:strip-space elements="*"/>

 <xsl:template match="/">
  <PreVCD>
   <component name="stack">
    <xsl:for-each select="$docs/PreVCD/component/subpath">
     <xsl:sort select="@path" data-type="text" order="ascending"/>
     <xsl:copy-of select="."/>
    </xsl:for-each>
   </component>
   <dump>
    <xsl:for-each select="$docs/PreVCD/dump/time">
     <xsl:sort select="@t" data-type="number" order="ascending"/>
     <xsl:copy-of select="."/>
    </xsl:for-each>
   </dump>
  </PreVCD>
 </xsl:template>

</xsl:stylesheet>

--- pass2.xsl ---

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

 <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

 <xsl:strip-space elements="*"/>

 <xsl:template match="/PreVCD">
  <PreVCD>
   <xsl:apply-templates select="component"/>
   <xsl:apply-templates select="dump"/>
  </PreVCD>
 </xsl:template>

 <xsl:template match="component">
  <component name="{@name}">
   <xsl:apply-templates select="subpath"/>
  </component>
 </xsl:template>

 <xsl:template match="dump">
  <dump>
   <xsl:apply-templates select="time"/>
  </dump>
 </xsl:template>

 <xsl:template match="subpath">
  <xsl:variable name="path" select="@path"/>
  <xsl:if test="not(preceding-sibling::subpath[@path=$path])">
   <subpath path="{@path}">
    <xsl:copy-of select="*"/>
    <xsl:copy-of select="following-sibling::subpath[@path=$path]/*"/>
   </subpath>
  </xsl:if>
 </xsl:template>

 <xsl:template match="time">
  <xsl:variable name="t" select="@t"/>
  <xsl:if test="not(preceding-sibling::time[@t=$t])">
   <time t="{@t}">
    <xsl:copy-of select="*"/>
    <xsl:copy-of select="following-sibling::time[@t=$t]/*"/>
   </time>
  </xsl:if>
 </xsl:template>

</xsl:stylesheet>

--- pass3.xsl ---

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

 <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

 <xsl:strip-space elements="*"/>

 <xsl:template match="/PreVCD">
  <PreVCD>
   <xsl:apply-templates select="component"/>
   <xsl:apply-templates select="dump"/>
  </PreVCD>
 </xsl:template>

 <xsl:template match="component">
  <component name="{@name}">
   <xsl:apply-templates select="subpath"/>
  </component>
 </xsl:template>

 <xsl:template match="dump">
  <xsl:copy-of select="."/>
 </xsl:template>

 <xsl:template match="subpath">
  <subpath path="{@path}">
   <xsl:apply-templates select="subpath"/>
  </subpath>
 </xsl:template>

 <xsl:template match="subpath/subpath">
  <xsl:variable name="path" select="@path"/>
  <xsl:if test="not(preceding-sibling::subpath[@path=$path])">
   <subpath path="{@path}">
    <xsl:copy-of select="*"/>
    <xsl:copy-of select="following-sibling::subpath[@path=$path]/*"/>
   </subpath>
  </xsl:if>
 </xsl:template>

</xsl:stylesheet>

Processed with Saxon 6.5.3:
  saxon -o pass1.xml  list.xml  pass1.xsl
  saxon -o pass2.xml  pass1.xml pass2.xsl
  saxon -o output.xml pass2.xml pass3.xsl
gives this output:

--- output.xml ---

<?xml version="1.0" encoding="UTF-8"?>
<PreVCD>
   <component name="stack">
      <subpath path="stack_behavior">
         <subpath path="test2">
            <variable var="ins" symbol="#" wireonbus="1"/>
         </subpath>
      </subpath>
      <subpath path="stack_environment">
         <subpath path="test">
            <variable var="ins" symbol="!" wireonbus="1"/>
            <variable var="ins" symbol="@" wireonbus="2"/>
         </subpath>
      </subpath>
   </component>
   <dump>
      <time t="5">
         <symbol sign="!" value="0"/>
         <symbol sign="@" value="0"/>
      </time>
      <time t="10">
         <symbol sign="!" value="1"/>
         <symbol sign="#" value="1"/>
      </time>
      <time t="15">
         <symbol sign="@" value="0"/>
      </time>
      <time t="25">
         <symbol sign="!" value="0"/>
         <symbol sign="#" value="0"/>
      </time>
   </dump>
</PreVCD>
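
A side note: pass2 and pass3 could probably be folded into one
stylesheet that runs on pass1.xml. The sketch below is untested and
only an assumption on my part. It uses a key (Muenchian grouping) for
the <time> elements and a recursive named template for the <subpath>
elements, so any nesting depth is handled. The names "merge.xsl",
"time-by-t" and "merge-subpaths" are mine, and it assumes pass1.xml
holds a single <dump>. Unlike pass3, it also keeps non-subpath
children of the top-level <subpath> elements, and it writes them
before the nested <subpath> elements.

```xml
<!-- merge.xsl (hypothetical name): one-pass alternative to pass2 + pass3 -->
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

 <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

 <xsl:strip-space elements="*"/>

 <!-- index every dump/time element by its t attribute -->
 <xsl:key name="time-by-t" match="dump/time" use="@t"/>

 <xsl:template match="/PreVCD">
  <PreVCD>
   <xsl:apply-templates select="component"/>
   <xsl:apply-templates select="dump"/>
  </PreVCD>
 </xsl:template>

 <xsl:template match="component">
  <component name="{@name}">
   <xsl:call-template name="merge-subpaths">
    <xsl:with-param name="nodes" select="subpath"/>
   </xsl:call-template>
  </component>
 </xsl:template>

 <!-- merge the subpath elements in $nodes that have the same @path,
      then do the same for their subpath children -->
 <xsl:template name="merge-subpaths">
  <xsl:param name="nodes" select="/.."/>
  <xsl:for-each select="$nodes">
   <xsl:variable name="path" select="@path"/>
   <xsl:variable name="group" select="$nodes[@path=$path]"/>
   <!-- write output only for the first member of each group -->
   <xsl:if test="generate-id() = generate-id($group[1])">
    <subpath path="{$path}">
     <xsl:copy-of select="$group/*[not(self::subpath)]"/>
     <xsl:call-template name="merge-subpaths">
      <xsl:with-param name="nodes" select="$group/subpath"/>
     </xsl:call-template>
    </subpath>
   </xsl:if>
  </xsl:for-each>
 </xsl:template>

 <xsl:template match="dump">
  <dump>
   <!-- Muenchian grouping: visit the first time element of each t value -->
   <xsl:for-each select="time[generate-id() = generate-id(key('time-by-t', @t)[1])]">
    <time t="{@t}">
     <xsl:copy-of select="key('time-by-t', @t)/*"/>
    </time>
   </xsl:for-each>
  </dump>
 </xsl:template>

</xsl:stylesheet>
```

If it works as I expect, the run would be two steps instead of three:
pass1.xsl on list.xml as before, then merge.xsl on pass1.xml.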

I hope this helps...

Best regards
Anton Triest
