Subject: RE: [xsl] Spotting "cousin marriages" in a tree From: "Michael Kay" <mhk@xxxxxxxxx> Date: Wed, 28 Jul 2004 23:57:33 +0100 |
You may be hoping for too much. Graph algorithms such as looking for cycles often have complexity of O(n^2) or worse, whatever language they are implemented in. Michael Kay > -----Original Message----- > From: Phil Endecott [mailto:spam_from_xslt_list@xxxxxxxxxxxx] > Sent: 28 July 2004 22:19 > To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx > Subject: [xsl] Spotting "cousin marriages" in a tree > > Dear XSLT experts, > > I have a flat input like this: > > <things> > <thing id="1"> <child idref="2"/> <child idref="3"/> </thing> > <thing id="2"/> > <thing id="3"/> > </things> > > I want to match up the ids with the idrefs to build a tree that looks > like this: > > <tree id="1"> > <tree id="2"/> > <tree id="3"/> > </tree> > > I could do it like this: > > <xsl:template match="thing"> > <tree> > <xsl:for-each select="child"> > <xsl:apply-templates > select="/things/thing[@id=current()/@idref]"/> > </xsl:for-each> > </tree> > </xsl:template> > > But that is O(n^2), so I'd instead do it like this: > > <xsl:key name="things-by-id" match="thing" use="@id"/> > <xsl:template match="thing"> > <tree> > <xsl:for-each select="child"> > <xsl:apply-templates select="key('things-by-id',@idref)"/> > </xsl:for-each> > </tree> > </xsl:template> > > which is O(n log n). > > This is all fine. But very rarely my input may not describe a pure > tree; it may contain "cousin marriages" or more "incestuous" > relationships like this: > > <things> > <thing id="1"> <child idref="2"/> <child idref="3"/> </thing> > <thing id="2"> <child idref="4"/> </thing> > <thing id="3"> <child idref="4"/> </thing> > <thing id="4"/> > </things> > > If you apply either of the above templates to this, thing 4 > appears in > the output twice: > > <tree id="1"> > <tree id="2"> > <tree id="4"/> > </tree> > <tree id="3"> > <tree id="4"> > </tree> > </tree> > > which is very bad in my application. What I need is > something like this: > > <tree id="1"> > <tree id="2"> > <tree id="4"/> > </tree> > <tree id="3"> > <error/> > </tree> > </tree> > > I can't afford for this to become an O(n^2) operation. > > The obvious thought is that I need to check at the point of recursion > whether the thing I am about to process has already been output. But > there is no "preceding-in-the-output" axis or other way I can > think of > to note which things have already been output. > > An alternative is to apply a test to the other input elements > to see if > other things have the same thing as this one as a child, but > that fails > because my input may legitimately contain redundant nodes like this: > > <things> > <thing id="1"> <child idref="2"/> <child idref="3"/> </thing> > <thing id="2"/> > <thing id="3"/> > <thing id="9"> <child idref="2"/> </thing> > </things> > > In this case I want the same output as the very first example, i.e. > thing 9 doesn't appear in the output at all. > > I can do a multi-pass process using exsl:node-set(), > generating the tree > with the duplicates in it and then deleting them if they are > duplicates, > i.e. something like this: > > <xsl:template match="/"> > <xsl:variable name="n"> > <xsl:apply-templates select="things/thing[1]"/> (as above) > </xsl:variable> > <xsl:apply-templates select="exsl:node-set($n)" > mode="remove-dupes"/> > </xsl:template> > > <xsl:template match="tree" mode="remove-dupes"> > <xsl:choose> > <xsl:when test="preceding::tree[@id=current/@id]"> > <error/> > </xsl:when> > <xsl:otherwise> > <tree id="{@id}"> > <xsl:apply-templates mode="remove-dupes"/> > </tree> > </xsl:otherwise> > </xsl:choose> > </xsl:template> > > There are two problems with this. First, generating the intermediate > result could run forever if the input is seriously malformed (e.g. a > loop, child refers to parent); checking as it is generated > would avoid > this. Second, the test "preceding::tree[@id=current/@id]" is O(n^2). > Can I use a key to avoid this? How can a key be applied to an > exsl:node-set() result? > > XSLT often needs some lateral thinking to come up with a good > solution. > I hope that someone who's read this far has a trick to > solve this for me. > > Thanks in advance! > > --Phil.
Current Thread |
---|
|
<- Previous | Index | Next -> |
---|---|---|
[xsl] Spotting "cousin marriages" i, Phil Endecott | Thread | Re: [xsl] Spotting "cousin marriage, Jeni Tennison |
Re: [xsl] position= and blocks (was, Eliot Kimber | Date | Re: [xsl] position= and blocks (was, john farrow |
Month |