Re: [xsl] Testing 2 XML documents for equality - a solution

Subject: Re: [xsl] Testing 2 XML documents for equality - a solution
From: Dimitre Novatchev <dnovatchev@xxxxxxxxx>
Date: Thu, 31 Mar 2005 22:05:03 +1000
On Wed, 30 Mar 2005 18:36:23 -0800 (PST), Mukul Gandhi
<mukul_gandhi@xxxxxxxxx> wrote:
> Hi Dimitre,
>  Please read my response below your comments..
> > > Two XML documents will be considered equal
> > > if all their nodes are identical (i.e. element, text,
> > > attribute, namespace etc).
> >
> > This is not a precise definition of "document equality".
> oh! Please don't take my definition of "document
> equality" from a pure mathematical view point. It's not
> as in "is 2=3 ?" . Did you get that impression from my
> definition? I meant that 2 XML documents will be equal
> if they have identical node structure, i.e. the abstract
> structure of the 2 documents should be identical, and not
> at byte stream level (this was not my goal).

We are in a vicious circle here. You explain one undefined notion
("document equality") with two other undefined terms ("identical node
structure" and "abstract structure")...

Whenever one defines "equality", this means a reflexive, symmetric and
transitive relation on a set X, that is, a subset of the set X^2 of
pairs of values from X.

One *must* define a breakdown of X into classes of equivalence
(non-intersecting subsets of X that cover X completely; to put it
in other words: whose union is X). Then, by definition, every
pair of elements belonging to the same class of equivalence are
considered equal.
Without doing this, one cannot speak about "equality" at all. There
are cases when more than one breakdown into classes of equivalence may
exist on the same set (e.g. the classes of equivalence on the set of
natural numbers N may be defined as the k sets of numbers {x : x mod k =
r}, where r = 0, 1, ..., k-1. In this case there are an infinite number
of different equivalence relations on N, just let k vary from 2 to
infinity). This example shows clearly that if you haven't defined
precisely which equivalence relation you are speaking about, then you
have no equivalence relation at all.
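
The "mod k" example can be checked mechanically. What follows is only a
minimal sketch in Python, on a finite slice of N; the helper names
(equivalent_mod, is_equivalence, classes) are made up for this
illustration. It shows that the same two numbers are "equal" under one
relation and not under another.

```python
from itertools import product

def equivalent_mod(k):
    """Return the relation 'x ~ y iff x mod k == y mod k' as a predicate."""
    return lambda x, y: x % k == y % k

def is_equivalence(rel, xs):
    """Check reflexivity, symmetry and transitivity of rel on the finite set xs."""
    xs = list(xs)
    reflexive = all(rel(x, x) for x in xs)
    symmetric = all(rel(y, x) for x, y in product(xs, repeat=2) if rel(x, y))
    transitive = all(
        rel(x, z)
        for x, y, z in product(xs, repeat=3)
        if rel(x, y) and rel(y, z)
    )
    return reflexive and symmetric and transitive

def classes(rel, xs):
    """Partition xs into the classes of equivalence induced by rel."""
    result = []
    for x in xs:
        for cls in result:
            if rel(x, cls[0]):  # one representative is enough for an equivalence
                cls.append(x)
                break
        else:
            result.append([x])  # x starts a new class
    return result

sample = range(12)  # a finite slice of N, enough for illustration
mod2, mod3 = equivalent_mod(2), equivalent_mod(3)

print(classes(mod2, sample))  # 2 classes: evens and odds
print(classes(mod3, sample))  # 3 classes: remainders 0, 1 and 2
print(mod2(1, 3), mod3(1, 3))  # 1 and 3 are "equal" mod 2, but not mod 3
```

Asking whether 1 "equals" 3 has no answer until one of the relations is
chosen, which is exactly the situation with "document equality".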

In this concrete case "document equality" remains undefined.
Therefore, the problem based on it is also undefined. Any activity to
solve an undefined problem is groundless and imaginary -- something
like hallucination.


> Another definition for the problem I am trying to
> solve would be, XML documents will be same if they
> *look similar* in a text editor like Notepad..
> so, document
> <x>
> <a i="1"/>
> </x>
> will be equivalent to
> <x>
> <a i="1"/>
> </x>
> but not to
> <x>
> <a i="2"/>
> </x>
> Yet another definition that applies to my problem
> would be: 2 documents will be equal if they produce
> the same output by an XSLT identity transform..
> The "same, equivalent" are better words than "equal"
> to the problem I was trying to solve..
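
One way such a definition could be made precise is to compare a
canonical serialization of both documents. The sketch below is only an
illustration in Python, not the XSLT identity transform mentioned above:
it swaps in Canonical XML (C14N 2.0) via the standard library function
xml.etree.ElementTree.canonicalize, and it assumes that whitespace
around text content should be ignored (strip_text=True). The function
name same_document is hypothetical.

```python
from xml.etree.ElementTree import canonicalize

def same_document(xml_a: str, xml_b: str) -> bool:
    """Assumed definition: two documents are 'the same' iff their C14N 2.0
    canonical forms, with whitespace around text content stripped, are
    identical strings."""
    return (canonicalize(xml_a, strip_text=True)
            == canonicalize(xml_b, strip_text=True))

a = '<x>\n  <a i="1"/>\n</x>'   # indented, empty-element tag
b = '<x><a i="1"></a></x>'      # one line, explicit end tag
c = '<x><a i="2"/></x>'         # different attribute value

print(same_document(a, b))  # whitespace and tag style do not matter here
print(same_document(a, c))  # the attribute value does
```

Changing the assumption (for example, keeping whitespace, or ignoring
comments) gives a different equivalence relation on the same set of
documents, so the choice has to be stated explicitly.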
> > Trying to solve an imprecisely formulated problem is not a
> > well-founded and understood activity.
> True!
> > Generally, there is no solution to incorrectly formulated problems,
> > therefore let's return to solving real problems.
> >
> :) Of course
> Regards,
> Mukul
> > Cheers,
> > Dimitre Novatchev
