Re: [xsl] Locating arbitrary duplicate structure

Subject: Re: [xsl] Locating arbitrary duplicate structure
From: Jeni Tennison <mail@xxxxxxxxxxxxxxxx>
Date: Thu, 11 Jan 2001 21:54:56 +0000

Hi Daniel,

> Let's say I have XML with a well defined schema, but arbitrary
> hierarchy. Using XSLT, I need to identify branches that are
> identical, or that differ by a set number of attributes. I don't
> mind if the approach depends on extension script.

This approach starts from the bottom (the elements with no children)
and works up the hierarchy from there to find matches. For each pair
of same-named elements it reports attributes that are present on only
one of them or that have different values, and it keeps climbing as
long as the two parents are distinct elements with the same name.

With your input, it gives the output:

<match first="/LinearFeatureModel[1]/Composite[1]/OffsetPath[1]/RegPopLinear[1]/Point[1]"
       second="/LinearFeatureModel[1]/Composite[1]/OffsetPath[2]/RegPopLinear[1]/Point[1]">
   <match first="/LinearFeatureModel[1]/Composite[1]/OffsetPath[1]/RegPopLinear[1]"
          second="/LinearFeatureModel[1]/Composite[1]/OffsetPath[2]/RegPopLinear[1]">
      <match first="/LinearFeatureModel[1]/Composite[1]/OffsetPath[1]"
             second="/LinearFeatureModel[1]/Composite[1]/OffsetPath[2]">
         <exception attribute="name"
                    first="Offset to the Left"
                    second="Offset to the Right"/>
         <exception attribute="offset"
                    first="-3.6"
                    second="3.6"/>
      </match>
   </match>
</match>

I think that this encodes most of the information that you wanted.
Each match element compares two nodes.  For that pair, it then goes on
up the node tree, comparing their parent nodes and so on, so the
nesting shows how far up two branches keep matching.  The exception
elements record the differences between two elements whose names are
the same but whose attributes aren't.
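
The input document isn't repeated in this message. As a guess at its
shape, a minimal hypothetical input like the one below should give the
same nesting of match and exception elements. The element names and
the name/offset values are taken from the output above; everything
else (no other attributes, a single Point per RegPopLinear) is assumed.

```xml
<?xml version="1.0"?>
<!-- Hypothetical input: only the element names and the name/offset
     values come from the output shown above -->
<LinearFeatureModel>
   <Composite>
      <OffsetPath name="Offset to the Left" offset="-3.6">
         <RegPopLinear>
            <Point />
         </RegPopLinear>
      </OffsetPath>
      <OffsetPath name="Offset to the Right" offset="3.6">
         <RegPopLinear>
            <Point />
         </RegPopLinear>
      </OffsetPath>
   </Composite>
</LinearFeatureModel>
```

The two Point leaves pair up first, then their RegPopLinear parents,
then the OffsetPath elements, where the name and offset attributes
differ. The walk stops there because both OffsetPath elements share
the same Composite parent.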

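You could generate this result as the value of a variable and then
analyse it in a separate step to get the output you want, especially
if you need to do further analysis of it.  In XSLT 1.0 the variable
holds a result tree fragment, so the second step needs a node-set()
extension function to turn it back into a node set.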
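
A minimal sketch of that two-step idea, under some assumptions: the
stylesheet from this message is saved as compare.xsl (a made-up file
name), and your processor supports the EXSLT exsl:node-set() function
(other processors have their own equivalent). The differences and pair
elements are invented for illustration only.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:exsl="http://exslt.org/common"
                exclude-result-prefixes="exsl">

<!-- Assumption: the stylesheet from this message, saved as compare.xsl -->
<xsl:import href="compare.xsl" />

<xsl:output method="xml" indent="yes" />

<xsl:template match="/">
   <!-- Step 1: run the imported root template and capture its
        match/exception tree as a result tree fragment -->
   <xsl:variable name="report">
      <xsl:apply-imports />
   </xsl:variable>
   <!-- Step 2: convert the fragment to a node set and keep only the
        pairs that actually differ in their attributes -->
   <differences>
      <xsl:for-each select="exsl:node-set($report)//match[exception]">
         <pair first="{@first}" second="{@second}"
               exceptions="{count(exception)}" />
      </xsl:for-each>
   </differences>
</xsl:template>

</xsl:stylesheet>
```

Importing rather than editing the original keeps the comparison logic
in one place; the second step can then be changed freely.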

I hope that helps,

Jeni

---
Jeni Tennison
http://www.jenitennison.com/

----
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="xml" indent="yes" />

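<!-- Index every leaf element (one with no element children) by name -->
<xsl:key name="leaf-elements" match="*[not(*)]" use="name()" />

<!-- Start from the first leaf element with each distinct name -->
<xsl:template match="/">
   <xsl:apply-templates
      select="//*[not(*) and
              count(.|key('leaf-elements', name())[1]) = 1]" />
</xsl:template>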

<!-- Compare this first leaf with every later leaf of the same name -->
<xsl:template match="*">
   <xsl:variable name="node" select="." />
   <xsl:variable name="matches"
                 select="key('leaf-elements', name())[position() != 1]" />
   <xsl:for-each select="$matches">
      <xsl:call-template name="compare">
         <xsl:with-param name="node1" select="$node" />
         <xsl:with-param name="node2" select="." />
      </xsl:call-template>
   </xsl:for-each>
</xsl:template>

<xsl:template name="compare">
   <xsl:param name="node1" />
   <xsl:param name="node2" />
   <match>
      <xsl:attribute name="first">
         <xsl:apply-templates select="$node1" mode="xpath" />
      </xsl:attribute>
      <xsl:attribute name="second">
         <xsl:apply-templates select="$node2" mode="xpath" />
      </xsl:attribute>
      <!-- Attributes present only on the first node -->
      <xsl:for-each select="$node1/@*">
         <xsl:if test="not($node2/@*[name() = name(current())])">
            <exception attribute="{name()}" first="{.}" />
         </xsl:if>
      </xsl:for-each>
      <!-- Attributes present only on the second node -->
      <xsl:for-each select="$node2/@*">
         <xsl:if test="not($node1/@*[name() = name(current())])">
            <exception attribute="{name()}" second="{.}" />
         </xsl:if>
      </xsl:for-each>
      <!-- Attributes present on both nodes but with different values -->
      <xsl:for-each select="$node1/@*">
         <xsl:variable name="val"
                       select="$node2/@*[name() = name(current())]" />
         <xsl:if test="$val and . != $val">
            <exception attribute="{name()}"
                       first="{.}"
                       second="{$val}" />
         </xsl:if>
      </xsl:for-each>
      <!-- Climb one level if the parents are two distinct elements
           with the same name -->
      <xsl:if test="count($node1/parent::* | $node2/parent::*) = 2 and
                    name($node1/parent::*) = name($node2/parent::*)">
         <xsl:call-template name="compare">
            <xsl:with-param name="node1" select="$node1/parent::*" />
            <xsl:with-param name="node2" select="$node2/parent::*" />
         </xsl:call-template>
      </xsl:if>
   </match>
</xsl:template>

<!-- Build a positional path such as /a[1]/b[2] for an element -->
<xsl:template match="*" mode="xpath">
   <xsl:for-each select="(ancestor::* | .)">
      <xsl:variable name="name" select="name()" />
      <xsl:text />/<xsl:value-of select="$name" />
      <xsl:text />[<xsl:value-of select="count(preceding-sibling::*[name() = $name]) + 1" />]<xsl:text />
   </xsl:for-each>
</xsl:template>

</xsl:stylesheet>
----
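
As posted, the stylesheet only compares the first leaf element with
each name against the later ones. If you also want the later leaves
compared with each other, a variation along these lines might do it.
This is only a sketch: it assumes the stylesheet above is saved as
compare.xsl (a made-up file name), and the output grows with the
square of the number of same-named leaves.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<!-- Assumption: the stylesheet above, saved as compare.xsl; it
     supplies the "compare" and mode="xpath" templates -->
<xsl:import href="compare.xsl" />

<xsl:output method="xml" indent="yes" />

<!-- Visit every leaf element, not just the first one with each name -->
<xsl:template match="/">
   <xsl:apply-templates select="//*[not(*)]" />
</xsl:template>

<!-- Compare the current leaf with each later leaf of the same name,
     so that every pair is reported exactly once -->
<xsl:template match="*">
   <xsl:variable name="node" select="." />
   <xsl:for-each select="following::*[not(*)][name() = name($node)]">
      <xsl:call-template name="compare">
         <xsl:with-param name="node1" select="$node" />
         <xsl:with-param name="node2" select="." />
      </xsl:call-template>
   </xsl:for-each>
</xsl:template>

</xsl:stylesheet>
```

The two templates here take precedence over the imported ones, and the
following:: axis is enough because a leaf has no descendants to skip.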



 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

