Subject: RE: [xsl] Testing 2 XML documents for equality - a solution
From: Mukul Gandhi <mukul_gandhi@xxxxxxxxx>
Date: Sun, 3 Apr 2005 09:43:20 -0700 (PDT)

Hi Mike (and all),

I have attempted to define the problem (of comparing 2 XML documents). It is in PDF form. Anyone may access the file at http://gandhimukul.tripod.com/comparing_xml_documents.pdf (size approx. 198 KB).

Some spelling and grammar errors are expected. They are unintentional, as is any disrespect such errors may convey.

I have kept various messages of this thread intact below, to help anyone who wishes to understand the background of the problem.

I have not yet modified the XSLT I posted earlier. I'll do so after receiving feedback on this work.

All suggestions, debates and corrections are welcome. I am keeping my fingers crossed.

Regards,
Mukul

--- Michael Kay <mike@xxxxxxxxxxxx> wrote:
> You're still struggling a bit.
>
> Let's start with requirements. What is this for? This is part of the difficulty: there are many reasons for wanting to compare two XML documents, and the different requirements don't necessarily lead to the same specification. If you describe some use cases this will help you on the way. For example, it will tell you whether it's enough to give a boolean answer, or whether you need to pinpoint where the two trees differ.
>
> The next step is specification. This doesn't have to be mathematical, but it does have to be rigorous. Specifying it in terms of a comparison of two drawings of the trees being alike isn't going to be helpful. I know what you're getting at: you're trying to say that there's a one-to-one correspondence between the nodes and arcs in one tree and the nodes and arcs in the other. But you haven't said which properties of the nodes are important (namespace prefix? base uri? type annotation?), you haven't said how you will compare values (string comparison, with or without Unicode normalization? Collations? typed value comparison?), and you haven't said how you will handle the significance of ordering.
>
> Finally, implementation (which is where you started). Before you embark on an implementation you should have an idea of the use cases (see above) and their performance requirements. For example, is the algorithm to be optimized for comparing trees that are probably the same or very similar, or for comparing trees that are likely to be wildly different?
>
> Sorry if this is a bit severe: but you did ask for help.
>
> Michael Kay
> http://www.saxonica.com/
>
> > -----Original Message-----
> > From: Mukul Gandhi [mailto:mukul_gandhi@xxxxxxxxx]
> > Sent: 31 March 2005 22:49
> > To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> > Subject: Re: [xsl] Testing 2 XML documents for equality - a solution
> >
> > Hi Dimitre,
> > Below is the "scope" of my solution. My definition of equality of XML documents consists of 2 parts:
> >
> > Part 1) Node types, to which the stylesheet does comparison
> > -------
> > "XPath 1.0" trees define 7 kinds of nodes. These are listed below. I have marked yes or no against node types, indicating whether my stylesheet has logic to compare these nodes. If XML documents have nodes of kind which are marked "no", then my stylesheet may give wrong result (I have not done any testing for no marked nodes)..
> >
> > root nodes - yes
> > element nodes - yes
> > text nodes - yes
> > attribute nodes - yes
> > namespace nodes - no
> > processing instruction nodes - no
> > comment nodes - no
> >
> > Part 2) My notion of equality of 2 XML documents
> > -------
> > Imagine that the XPath tree of 2 documents are *drawn on paper*. The diagram is just similar to the XPath tree diagram in Mike's book (XSLT 2nd Edition, Programmer's Reference) page 57 (section "The Tree Model").
> >
> > If XPath tree of 2 XML documents will "look same" on paper (as in Mike's book's page 57), the documents will be considered equal by my stylesheet.
> >
> > The scope of my stylesheet presently covers only these 2 points.
> >
> > I don't claim any other capability from my stylesheet.
> >
> > I have not attempted to equate the XML documents in mathematical terms (like relations as you mentioned; a subject I don't understand well) or canonical terms (as defined by the canonical XML spec).
> >
> > So considering the above scope of my work, can my stylesheet be evaluated for correctness?
> >
> > I have deep regard for people who participated on this thread. They surely have deep knowledge of the subject.
> >
> > Regards,
> > Mukul
> >
> > --- Dimitre Novatchev <dnovatchev@xxxxxxxxx> wrote:
> > > Hi Mukul,
> > >
> > > On Thu, 31 Mar 2005 04:36:32 -0800 (PST), Mukul Gandhi <mukul_gandhi@xxxxxxxxx> wrote:
> > > > Hi Dimitre,
> > > > I am really not good at mathematics at this level. I did study relations like "symmetric, reflexive and transitive" some time back. But I did so just to score grades. I had no idea then of their practical use. It is indeed enlightening for me to know they have real practical use (in XML & XSLT!). I cannot define my problem in these terms, as my knowledge is limited.
> > >
> > > This confirms the conclusion that here we see attempts at offering a solution to a problem that is not well defined.
> > >
> > > How can we then judge the solution?
> > >
> > > > I would be happy if you can define in these precise terms the problem I am trying to solve (based on my earlier posts to this thread).
> > >
> > > Impossible.
> > >
> > > > I'll keep it as a reference for future use. I defined the problem (I am trying to solve) from an average programmer's point of view. And I think that it is quite understandable to an average programmer ;)
> > >
> > > A number of very wise people already explained why this is difficult to define -- they also found holes in your definition (and understanding) of the problem. These people obviously are not average programmers.
> > >
> > > Cheers,
> > > Dimitre Novatchev.
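
The open questions raised in the thread above (which node kinds take part in the comparison, how string values are compared, whether sibling order is significant) can be made concrete by turning each one into an explicit parameter. What follows is only a rough sketch in Python, not the XSLT stylesheet under discussion. The names `CompareOptions` and `deep_equal` and all three option names are hypothetical, chosen for illustration. It relies on the fact that Python's standard `xml.etree.ElementTree` parser drops comments and processing instructions by default and resolves namespace prefixes to URIs, so those node kinds and the prefixes never take part in the comparison.

```python
import unicodedata
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass(frozen=True)
class CompareOptions:
    """Hypothetical switches, one per open question in the thread."""
    normalize_unicode: bool = False       # compare strings after NFC normalization?
    ignore_whitespace_text: bool = False  # treat whitespace-only text as absent?
    ordered_children: bool = True         # is sibling order significant?


def _norm(value, opts):
    # Plain string comparison, optionally after Unicode normalization.
    return unicodedata.normalize("NFC", value) if opts.normalize_unicode else value


def _text(value, opts):
    # ElementTree reports missing text as None; treat it as the empty string.
    value = value or ""
    if opts.ignore_whitespace_text and not value.strip():
        value = ""
    return _norm(value, opts)


def deep_equal(a, b, opts=CompareOptions()):
    """Compare two elements by name, attributes, text and children."""
    # Tags look like "{namespace-uri}local-name", so prefixes are ignored.
    if a.tag != b.tag:
        return False
    # Attribute order is never significant: compare as dictionaries.
    attrs_a = {k: _norm(v, opts) for k, v in a.attrib.items()}
    attrs_b = {k: _norm(v, opts) for k, v in b.attrib.items()}
    if attrs_a != attrs_b:
        return False
    if _text(a.text, opts) != _text(b.text, opts):
        return False
    kids_a, kids_b = list(a), list(b)
    if len(kids_a) != len(kids_b):
        return False

    def same_child(x, y):
        # A child's "tail" is the text that follows it inside the parent.
        return _text(x.tail, opts) == _text(y.tail, opts) and deep_equal(x, y, opts)

    if opts.ordered_children:
        return all(same_child(x, y) for x, y in zip(kids_a, kids_b))

    # Unordered: pair every child of a with a distinct, equal child of b.
    unmatched = list(kids_b)
    for x in kids_a:
        for i, y in enumerate(unmatched):
            if same_child(x, y):
                del unmatched[i]
                break
        else:
            return False
    return True


doc1 = ET.fromstring("<r><a/><b/></r>")
doc2 = ET.fromstring("<r><b/><a/></r>")
print(deep_equal(doc1, doc2))                                          # False
print(deep_equal(doc1, doc2, CompareOptions(ordered_children=False)))  # True
```

The same pair of documents is "equal" under one set of options and "different" under another, which is why the thread insists that the definition be settled before any implementation is judged. The sketch returns only a boolean; reporting where two trees differ would need a different design.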