[xsl] The Solution (Was: Re: extracting sequences)

Subject: [xsl] The Solution (Was: Re: extracting sequences)
From: "Dimitre Novatchev" <dnovatchev@xxxxxxxxx>
Date: Thu, 19 Feb 2004 22:19:23 +0100
"Saverio Perugini" <sperugin@xxxxxxxxxxxxxxxx> wrote in message
news:Pine.LNX.4.50.0402190914020.997-100000@xxxxxxxxxxxxxxxxxxx
> On Thu, 19 Feb 2004, Dimitre Novatchev wrote:
>
> > I don't see any graph described in your xml document.
>
> > Seems quite straightforward to me -- especially if you explain the
problem
> > properly.
> >
> > Could you draw the graph and explain how its vertices and arcs are
> > specified in your xml document? What is the meaning of a "crosslink"?
>
> Apologies for not being clear in my original post.
>
> The following is a textual depiction of the dag (directed acyclic graph)
> modeled by my data, where numbers represent nodes, letters represent
> link labels, and indentation models parent-child relationships.
>
> 1
>  -a- 2
>       -d- 5
>       -e- 6
>
>  -b- 3
>       -g- 7
>       -f- 6
>  -c- 4
>       -h- 7
>
> In the above graph, links f and h are crosslinks as their target nodes
> already exist in the graph.  They are like symbolic links (e.g., links
> prefaced which the '@' character in the Yahoo! taxonomy).  In other
> words, leaf nodes 6 and 7 are not in the graph twice. Rather they just
> each have two in-coming edges (and two parents) while every other node,
> with the exception of the root (node 1), has only 1 in-coming edge (and
> thus only one parent).  In my data crosslinks are modeled with the
> <crosslink> tag while all other links are modeled with the <link> tag.
>
> It can be seen that the paths through this dag (as I have defined path
> in my original post) are the following:
>
> --
> a d
> a e
> b g
> b f
> c h
> --
>
> and this is what I'd like to extract from the XML data.
>
> I hope that helps.

Yes, it helps.

First a remark that the structure of your xml document is very ugly and not
supporting an efficient and intuitive representation of a graph.

A good solution, therefore, would be in two passes:
  1. Transform the source xml to a standard graph representation (e.g.
graphML)
  2. Apply a simple transformation as in:

              http://lists.xml.org/archives/xml-dev/200401/msg00444.html

The direct (one pass) solution will necessarily be itself ugly, reflecting
its input ... :(


Here I provide a direct solution to your problem. Do note that it works for
any DAG (independent of its dept):

I have two xslt stylesheets:

getPaths.xsl
========
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform";
 xmlns:vendor="http://exslt.org/common";
 xmlns:r="http://www.w3.org/TR/RDF/";
 xmlns:d="http://purl.org/dc/elements/1.0/";
 >

 <xsl:output method="text"/>

 <xsl:key name="kNodeById" match="Node"
  use="@r:id"/>

  <xsl:template match="/">

  </xsl:template>

  <xsl:template name="getPaths">
    <xsl:param name="prdfGraph" select="/*"/>
    <xsl:param name="pNode1" select="/.."/>
    <xsl:param name="ppathsToNode1" select="/.."/>
    <xsl:param name="pNode2" select="/.."/>

    <xsl:choose>
      <xsl:when
       test="not($pNode1/*[self::link
                         or
                          self::crosslink
                          ])">
       <xsl:if test="generate-id($pNode1) = generate-id($pNode2)">
         <xsl:for-each select="$ppathsToNode1">
           <xsl:value-of select="concat(., '&#xA;')"/>
         </xsl:for-each>
       </xsl:if>
      </xsl:when>
      <xsl:otherwise>
        <xsl:for-each
        select="$pNode1/*[self::link
                         or
                          self::crosslink
                          ]">
          <xsl:variable name="vthisLink" select="."/>

          <xsl:variable name="varcName">
            <xsl:choose>
              <xsl:when test="self::link">
                <xsl:value-of select=
                "substring-after(@r:resource,
                                 concat(../@r:id,
                                        '/'
                                        )
                                 )"/>
              </xsl:when>
              <xsl:otherwise>
                <xsl:value-of select=
                "substring-before(@r:resource,
                                 ':')"/>
              </xsl:otherwise>
            </xsl:choose>
          </xsl:variable>

          <xsl:variable name="vrtfNewPaths">
            <xsl:call-template name="appendPaths">
              <xsl:with-param name="pPaths"
               select="$ppathsToNode1"/>
               <xsl:with-param name="pArcID"
                select="$varcName"/>
            </xsl:call-template>
          </xsl:variable>

          <xsl:variable name="vnewPaths"
           select="vendor:node-set($vrtfNewPaths)/*"/>

          <xsl:for-each select="$prdfGraph[1]">
            <xsl:call-template name="getPaths">
              <xsl:with-param name="prdfGraph"
               select="$prdfGraph"/>
              <xsl:with-param name="pNode1" select=
              "key('kNodeById', $vthisLink/@r:resource)
                   |
                    key('kNodeById',
                        substring-after($vthisLink/@r:resource,
                                       ':')
                        )"/>
              <xsl:with-param name="ppathsToNode1"
               select="$vnewPaths"/>
              <xsl:with-param name="pNode2" select="$pNode2"/>
            </xsl:call-template>
          </xsl:for-each>
        </xsl:for-each>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <xsl:template name="appendPaths">
    <xsl:param name="pPaths" select="/.."/>
    <xsl:param name="pArcID" select="''"/>

    <xsl:for-each select="$pPaths">
      <xsl:copy>
        <xsl:value-of select="concat(., $pArcID, ' ')"/>
      </xsl:copy>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>

and

testGetPaths.xsl
===========
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform";
 xmlns:r="http://www.w3.org/TR/RDF/";
 >
  <xsl:import href="getPaths.xsl"/>

  <r:myParam>
    <path/>
  </r:myParam>

  <xsl:template match="/">
    <xsl:for-each select="/*/Node[not(link)
                                and
                                 not(crosslink)]">
      <xsl:call-template name="getPaths">
        <xsl:with-param name="pNode1"
                        select="/*/Node[@r:id='Root']"/>
        <xsl:with-param name="ppathsToNode1"
                        select="document('')/*/r:*/*"/>
        <xsl:with-param name="pNode2" select="."/>
      </xsl:call-template>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>

When the above transformation is applied on your source.xml (corrected to be
well-formed!):

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform";
 xmlns:r="http://www.w3.org/TR/RDF/";
 >
  <xsl:import href="getPaths.xsl"/>

  <r:myParam>
    <path/>
  </r:myParam>

  <xsl:template match="/">
    <xsl:for-each select="/*/Node[not(link)
                                and
                                 not(crosslink)]">
      <xsl:call-template name="getPaths">
        <xsl:with-param name="pNode1"
                        select="/*/Node[@r:id='Root']"/>
        <xsl:with-param name="ppathsToNode1"
                        select="document('')/*/r:*/*"/>
        <xsl:with-param name="pNode2" select="."/>
      </xsl:call-template>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>

the wanted result is produced:

a d
a e
b f
b g
c h


Hope this helped.


Cheers,

Dimitre Novatchev
FXSL developer,

http://fxsl.sourceforge.net/ -- the home of FXSL
Resume: http://fxsl.sf.net/DNovatchev/Resume/Res.html




 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list


Current Thread