topological sort

Subject: topological sort
From: Joerg Pietschmann <joerg.pietschmann@xxxxxx>
Date: Sat, 04 Nov 2000 14:48:13 +0100
Hello,
i want to implement a topological sort in XSLT. This is necessary
for generating program code (IDL for example) out of an XML file
which contains a specification for program structures, like the following:

<!DOCTYPE structs [
<!ELEMENT structs (struct*)>
<!ELEMENT struct (name,field*)>
<!ELEMENT field (name,type)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT type (ref|long)>
<!ELEMENT ref (#PCDATA)>
<!ELEMENT long EMPTY>
]>

<structs>
  <struct>
    <name>A</name>
    <field>
      <name>A1</name>
      <type><long/></type>
    </field>
  </struct>
  <struct>
    <name>B</name>
    <field>
      <name>B1</name>
      <type><ref>A</ref></type>
    </field>
    <field>
      <name>B2</name>
      <type><ref>C</ref></type>
    </field>
  </struct>
  <struct>
    <name>C</name>
    <field>
      <name>C1</name>
      <type><long/></type>
    </field>
  </struct>
  <struct>
    <name>D</name>
    <field>
      <name>D1</name>
      <type><ref>E</ref></type>
    </field>
  </struct>
  <struct>
    <name>E</name>
    <field>
      <name>E1</name>
      <type><ref>C</ref></type>
    </field>
  </struct>
</structs>


The data types for structure fields may refer to other structures.
Programming languages usually require the structures alredy defined
before they can be referenced. I don't want to put the burden on the
user to enter the specifications in proper order.
Preferably the algorithm should not be confused by reference loops.

The usual algorithm is to walk the dependencies while keeping a list
with already processed structure definitions. This, however requires
assignable variables for bookkeeping.
Another solution is to output the structures at levels of increasing
distance from the leaves in the dependency graph. This can be implemented
in XSLT but it is somewhat unelegant.

Some pseudo code for orientation:

processed_list=nil
template process
  generate structures which are not in the processed_list
                      and only refer to structure types not in the processed_list
  add the structure to the processed list
  call process

The recursion may be terminated either if the processed list contains all
structures (equivalently if it is exactly as long as the list with all
structures) or if no more structures has been processed in a step.
The first condition alone is not sufficient in case of loops which are
stopped by the second condition, however, the second condition alone
imposes an extra step. The following stylesheet uses only the second
condition. The concatenations with the double colons should guard against
spurious substring matches.

<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform";>
  <xsl:output method="text"/>
  
  <xsl:template match="structs">
    <xsl:call-template name="process">
      <xsl:with-param name="processed" select="':'"/>
    </xsl:call-template>
  </xsl:template>

  <xsl:template name="process">
    <xsl:param name="processed"/>
    <xsl:for-each select="/structs/struct[not(contains($processed,concat(':',name,':')))
 and count(field[count(type/ref)>0 and not(contains($processed,concat(':',./type/ref,':')))])=0]">
      <xsl:value-of select="./name"/>
    </xsl:for-each>
    <xsl:variable name="actualprocessed">
      <xsl:for-each select="/structs/struct[not(contains($processed,concat(':',name,':')))
 and count(field[count(type/ref)>0 and not(contains($processed,concat(':',./type/ref,':')))])=0]">
        <xsl:value-of select="concat(./name,':')"/>
      </xsl:for-each>
    </xsl:variable>
    <xsl:if test="string-length($actualprocessed)>0">
      <xsl:call-template name="process">
        <xsl:with-param name="processed" select="concat($processed,$actualprocessed)"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>

This outputs "ACBED" which is the correct dependency order.

Is there a better solution, or can the stylesheet be improved?
What bothers me most is that the struct elements must be processed twice,
one for output and again for updating the processed_list.

Thanks for help

J.Pietschmann

-- 

----------------------------------------------------------------------
Zuercher Kantonalbank ZKB        Internet : joerg.pietschmann@xxxxxx
Neue Hard 9                      Telefon  : ++41 01-275 85 03 
Postfach                         Fax      : ++41 01-275 80 34
CH-8010 Zuerich
----------------------------------------------------------------------


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list


Current Thread