Re: [xsl] Testing 2 XML documents for equality - a solution

Subject: Re: [xsl] Testing 2 XML documents for equality - a solution
From: Mukul Gandhi <mukul_gandhi@xxxxxxxxx>
Date: Wed, 30 Mar 2005 08:40:58 -0800 (PST)
Hi David,
Thanks a lot for your observations.

Please see my responses below your comments.

> I don't think the stylesheet really works.
> For example for attribute nodes you just concatenate
> the names and
> values so even if you could be sure that the order
> of attribute nodes
> was preserved (you can't be sure of this) then
> x="2" and x2="" would be considered equal.

Thanks a lot for pointing out this bug! To correct it, I
propose this alternative code (to be applied to both
documents):
<xsl:for-each select="$doc1//@*">
  <xsl:value-of select="name()"/>
  <xsl:text>&#xa;</xsl:text>
  <xsl:value-of select="."/>
</xsl:for-each>

(i.e. introducing a separator character between the
attribute name and value that is unlikely to occur in
an attribute value, e.g. a newline character)

> Also your ignore white space test ignores far too
> much:
> 
> <xsl:for-each
>   select="$doc1//node()[not(normalize-space(self::text()) = '')]">
>   <xsl:value-of select="name()"/><xsl:value-of select="."/>
> 
> consider the 2 document fragments
> 
> <x>
>  <a/>
> </x>
> 
> 
> <y>
>  <b/>
> </y>
> 
> in the first document the nodes x and a and both the
> text nodes all
> satisfy
> normalize-space(self::text())= ''
> so the for-each will be empty.
> Similarly in the second fragment.
> 
> so presumably these documents will compare equal,
> which seems strange.

These documents are reported as not equal, so I think
I am right here. For this example, the $doc1//node()
path expression returns 4 nodes (2 element nodes and 2
white-space-only text nodes). The white-space-only
text nodes are filtered out by the predicate
[not(normalize-space(self::text()) = '')].
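
For comparison, a predicate that drops only white-space-only text nodes could be sketched as below. It tests text nodes inside a nested predicate, so element nodes are never subjected to the string comparison. This is an illustrative variant and not the code from the original stylesheet; $doc1 is assumed to be bound to the first document, as in the snippets above.

```xml
<!-- Sketch: keep every node except text nodes that contain only white space. -->
<!-- For an element, self::text() selects nothing, so not(...) is true and    -->
<!-- the element is kept.                                                      -->
<xsl:for-each select="$doc1//node()[not(self::text()[normalize-space(.) = ''])]">
  <xsl:value-of select="name()"/>
  <xsl:value-of select="."/>
</xsl:for-each>
```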

> Conversely you can not be sure that
> <x a="2" b="3"/> will compare equal to
> <x a="2" b="3"/>
> as the attributes may be reported in one order for
> doc1 and the other
> order for doc2.

I agree that an XML parser is not required to report
attribute nodes in any particular order. But I guess
we can reasonably assume that a specific XML parser
reports attributes in a consistent order. It must use
a specific algorithm for this, whose outcome is
predictable. I know I cannot prove this theoretically,
but can you provide any practical evidence of an XML
parser reporting attributes in different orders? Since
both documents are processed by the same parser, the
outcome will always be predictable!
I have tested the same example with a single product
multiple times, and I always get the same result.
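
If one prefers not to depend on the order in which the parser reports attributes, a possible approach is to sort each element's attributes by name before writing them out. The following is only a sketch under that assumption, not part of the original stylesheet; it reuses $doc1 and the newline separator from the snippet above, and it sorts on name(), which assumes that both documents use the same namespace prefixes.

```xml
<!-- Sketch: emit attributes element by element, sorted by name, so the -->
<!-- output does not depend on the order the parser reports them in.    -->
<xsl:for-each select="$doc1//*">
  <xsl:for-each select="@*">
    <xsl:sort select="name()"/>
    <xsl:value-of select="name()"/>
    <xsl:text>&#xa;</xsl:text>
    <xsl:value-of select="."/>
    <!-- a second separator keeps one attribute's value apart from the next name -->
    <xsl:text>&#xa;</xsl:text>
  </xsl:for-each>
</xsl:for-each>
```

The same block would be repeated for $doc2 so that both documents are serialized in the same way before the strings are compared.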

Regards,
Mukul

> David