RE: [xsl] Testing 2 XML documents for equality - a solution

From: Pieter Reint Siegers Kort <pieter.siegers@xxxxxxxxxxx>
Date: Wed, 30 Mar 2005 18:12:56 -0600

Also see http://apps.gotdotnet.com/xmltools/xmldiff/

I am using it to compare outputs and performance results from different
processors running in their native independent background processes,
encapsulated with C#.

There are others as well, among them these:

XML Diff and Merge Tool
https://secure.alphaworks.ibm.com/tech/xmldiffmerge

ExamXML
http://www.a7soft.com/index.html

Cheers,
<prs/> 

-----Original Message-----
From: Mukul Gandhi [mailto:mukul_gandhi@xxxxxxxxx] 
Sent: Wednesday, March 30, 2005 9:29 AM
To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
Subject: [xsl] Testing 2 XML documents for equality - a solution

Hello,
  I was playing with XSLT and wondered whether there could be a nice way
(with XSLT 1.0) to test two XML documents for equality. Two XML documents
are considered equal if all their nodes are identical (i.e. element, text,
attribute, namespace, etc.).

I found a few approaches to this in the FAQ
(http://www.dpawson.co.uk/xsl/sect2/N1777.html). They are good work indeed,
but I think I have come up with an elegant way that uses no extension
functions. The XSLT is below.

<?xml version="1.0"?>
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0">

  <xsl:output method="text" />

  <!-- parameter for ignoring white-space-only text nodes during comparison;
       if iws='y', white-space-only text nodes are not considered -->
  <xsl:param name="iws" />

  <xsl:variable name="doc1" select="document('file1.xml')" />
  <xsl:variable name="doc2" select="document('file2.xml')" />

  <xsl:template match="/">

    <!-- store hash of 1st document into a variable;
         it is the concatenation of the names and values of all nodes -->
    <xsl:variable name="one">
      <xsl:for-each select="$doc1//@*">
        <xsl:value-of select="name()" /><xsl:value-of select="." />
      </xsl:for-each>
      <xsl:choose>
        <xsl:when test="$iws='y'">
          <!-- skip white-space-only text nodes but keep every other node
               (the earlier predicate, not(normalize-space(self::text()) = ''),
               also dropped elements, comments and PIs); elements contribute
               their name only, because their string value would bring the
               skipped white space back in -->
          <xsl:for-each select="$doc1//node()[not(self::text()) or normalize-space() != '']">
            <xsl:value-of select="name()" /><xsl:value-of select="self::node()[not(self::*)]" />
          </xsl:for-each>
        </xsl:when>
        <xsl:otherwise>
          <xsl:for-each select="$doc1//node()">
            <xsl:value-of select="name()" /><xsl:value-of select="." />
          </xsl:for-each>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:variable>

    <!-- store hash of 2nd document into a variable;
         built exactly like the hash of the 1st document -->
    <xsl:variable name="two">
      <xsl:for-each select="$doc2//@*">
        <xsl:value-of select="name()" /><xsl:value-of select="." />
      </xsl:for-each>
      <xsl:choose>
        <xsl:when test="$iws='y'">
          <xsl:for-each select="$doc2//node()[not(self::text()) or normalize-space() != '']">
            <xsl:value-of select="name()" /><xsl:value-of select="self::node()[not(self::*)]" />
          </xsl:for-each>
        </xsl:when>
        <xsl:otherwise>
          <xsl:for-each select="$doc2//node()">
            <xsl:value-of select="name()" /><xsl:value-of select="." />
          </xsl:for-each>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:variable>

    <!-- the documents are reported equal when the two hashes match -->
    <xsl:choose>
      <xsl:when test="$one = $two">
        Equal
      </xsl:when>
      <xsl:otherwise>
        Not equal
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>

In this stylesheet I am relying on two features: the node() node test and
@*. As used here, node() matches any node other than attribute nodes,
namespace nodes and the root node, while @* matches any attribute. So I
guess this XSLT can cater to most cases ;) (namespace nodes are the
exception). I have done limited testing with element, text and attribute
nodes only, and have got favourable results.
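
As a small worked illustration of how the hash is assembled, assume a
hypothetical file1.xml (made up for this example, not one of the tested
inputs) with the following content:

```xml
<a x="1"><b>hi</b></a>
```

With iws not supplied, $doc1//@* contributes "x1" (attribute name, then
value). $doc1//node() then visits the element a, the element b and the text
node in document order, contributing "ahi", "bhi" and "hi" (a text node has
an empty name). The variable $one therefore holds "x1ahibhihi", and the two
documents are reported equal only if the same string is produced for
file2.xml.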

Another feature I have incorporated in the stylesheet is control over
whether white-space-only text nodes are considered during comparison. This
is done with the stylesheet parameter iws. If it is "y", white-space-only
text nodes are ignored during comparison. If it has any other value, or is
not supplied, white-space-only text nodes count as differences between the
two documents.
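
To make the effect of iws concrete, here is a sketch with two hypothetical
inputs (invented for illustration) that differ only in indentation. Assume
file1.xml is:

```xml
<a><b>1</b><c>2</c></a>
```

and file2.xml is:

```xml
<a>
  <b>1</b>
  <c>2</c>
</a>
```

Without iws, the indentation in file2.xml shows up as white-space-only text
nodes (and in the string value of a), so the two hashes differ and the
result is "Not equal". With iws set to "y", those text nodes are left out
and the result is "Equal". How the parameter is passed depends on the
processor; with xsltproc, for instance, it should be possible to use
--stringparam iws y.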

If anybody cares to test this stylesheet and report any observations, I'll
be happy!

Regards,
Mukul



		