Subject: Re: [xsl] Testing 2 XML documents for equality - a solution
From: Mukul Gandhi <mukul_gandhi@xxxxxxxxx>
Date: Mon, 4 Apr 2005 09:19:15 -0700 (PDT)
--- David Carlisle <davidc@xxxxxxxxx> wrote:

> For the vast majority of nodes this is still a) very expensive way of
> comparing them and b) doesn't help with the comparison.

I agree! I understand that generating the string hash of the entire XML document is an expensive operation. On deeper reflection, I imagine that even if two XML documents are different, they may generate the same concatenated string representation, so my algorithm will probably fail in some cases. But I have no proof of this new view. The XML examples I ran my stylesheet against gave the answers I expected. I'll test more to see whether it fails for some cases.

> For a given element node if you calculate an XPath to the current node,
> and then use that XPath to find a node in the other document, you have
> two nodes, you then need to compare whether they are equal, but that is
> _exactly_ the problem you are trying to solve. The earlier stylesheet
> just took the string value of the node but that is just the
> concatenation of all the element content so loses most of the markup
> information.

I think you are right! (as always :) )

> What is wrong with the much simpler alternative of just writing out the
> string corresponding to a specific "canonical" linearisation, and then
> just comparing those two strings?

I think I should explore this option. But I believe that converting an XML document to canonical form is not a trivial task. For example, we need to convert documents to UTF-8: if an XML document has encoding ISO-8859-1, its canonical representation will have UTF-8 encoding. (I think this cannot be easily accomplished with XSLT; in fact, I think it may be impossible with XSLT?) I think there are also other canonicalization rules which cannot be easily implemented in XSLT. By using a SAX parser, it is probably easier to convert XML to canonical form (of course, one must know all the rules as well!).
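A rough sketch of both points discussed above, written outside XSLT. It assumes Python 3.8 or later, whose `xml.etree.ElementTree.canonicalize` function implements C14N 2.0; the sample documents and the helper names `string_value` and `canonical_equal` are made up purely for illustration. It shows two different documents sharing one concatenated string value, and a comparison of canonical serialisations that ignores attribute order, quote style and source encoding.

```python
import xml.etree.ElementTree as ET


def string_value(xml_text):
    """Concatenate all text nodes, like the XPath string value of the root."""
    return "".join(ET.fromstring(xml_text).itertext())


def canonical_equal(xml_a, xml_b):
    """Compare two documents (str or bytes) via their C14N 2.0 serialisations."""
    return ET.canonicalize(xml_a) == ET.canonicalize(xml_b)


# Hypothetical sample documents: the text is split differently between
# the <a> and <b> elements, so the documents are not equal.
doc_a = "<r><a>foo</a><b>bar</b></r>"
doc_b = "<r><a>foob</a><b>ar</b></r>"

# Same content, but with different attribute order and quote style.
doc_c = '<r><a x="1" y="2">foo</a><b>bar</b></r>'
doc_d = "<r><a y='2' x='1'>foo</a><b>bar</b></r>"

# Same content, but stored in different encodings.
latin1 = b'<?xml version="1.0" encoding="ISO-8859-1"?><r>\xe9</r>'
utf8 = "<r>\u00e9</r>".encode("utf-8")

if __name__ == "__main__":
    # The string values collide even though the markup differs.
    print(string_value(doc_a) == string_value(doc_b))
    # The canonical forms keep the markup, so the difference shows up.
    print(canonical_equal(doc_a, doc_b))
    # Attribute order and quoting are normalised away.
    print(canonical_equal(doc_c, doc_d))
    # The parser decodes both encodings before serialisation.
    print(canonical_equal(latin1, utf8))
```

Whether whitespace-only text and comments should count as differences is a separate decision; `canonicalize` exposes options such as `strip_text` and `with_comments` for that, and this sketch leaves them at their defaults.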
Regards,
Mukul