[xsl] Checking for Cycles in a Graph

Subject: [xsl] Checking for Cycles in a Graph
From: "Darcy Parker" <darcyparker@xxxxxxxxx>
Date: Wed, 30 Apr 2008 21:28:39 -0400

I think I found a bug in some example code from XSLT 2.0 Programmer's
Reference by Michael Kay and would like to review it with Michael
and/or other XSLT programmers.  I have searched the list, checked the
errata for the book and googled to see if other's have had a
conversation on this before... but it seems to be undiscussed to date.

p.199 Checking for Cycles in a Graph:

<xsl:function name="graph:refers" as="xs:boolean">
  <xsl:param name="links" as="node()"/>
  <!-- $links is a node that represents the template to be called -->
  <xsl:param name="A" as="node()"/>
  <xsl:param name="B" as="node()"/>

  <!-- find the directly-connected nodes -->
  <xsl:variable name="direct" as="node()*">
     <xsl:apply-templates select="$links">
        <xsl:with-param name="from" select="$A"/>

  <!-- return true if B is directly or indirectly connected from A -->
  <xsl:sequence select="if (exists($direct intersect $B)) then true()
                        else if (some $C in $direct
                                 satisfies graph:refers($links,
$C,$B)) then true()
                        else false()"/>

I have defined a template for $links as required.  (But it is not
necessary to see the bug because I think the bug is in the higher
order function noted above.)

Typically cycles are checked for by evaluating:
graph:refers($something, $something).  If there is connection through
$something's direct links and their direct links and so on, then there
is a cycle.

The problem is that "graph:refers" may enter an infinite loop if
$direct contains a cycle.  Consider the case where $A does not refer
to $B but one of $A's direct links contain a cycle.  In the XPath for
xsl:sequence, the function recursively calls itself with
"graph:refers($links, $C, $B)".  $C becomes $A in the next iteration,
and its direct links are found in order to test if one of them refers
back to $B.  But notice that there is no check if $C contains a cycle!
 So it is possible that $C's direct links could be navigated
forever....without reaching $B first.  (I am actually experiencing
this problem with a large data set and haven't had time to make a
simpler use-case.  But I am hoping that people can follow the problem
generically with the thought exercise I just described.)

So it seems like $C the algorithm should check if $C contains a cycle
before testing "graph:refers($links,$C,$B)"... but after some
thinking, I am realizing that just because $C contains a cycle,
doesn't mean that $A doesn't refers to $B.  Perhaps the solution is to
test if $C contains a cycle, and if it does, remove the cycle(s) ->
creating $D.  And then test "graph:refers(l$inks,$D,$B)".

This seems complicated though... and I thought I should discuss this
issue with others before I continue to hack at this algorithm.  It
seems like this type of algorithm/problem would be well-known for

Has anyone explored this problem?  Any suggestions?


Current Thread