Subject: Re: [xsl] Testing 2 XML documents for equality - a solution
From: Mukul Gandhi <mukul_gandhi@xxxxxxxxx>
Date: Thu, 31 Mar 2005 09:32:05 -0800 (PST)

Hi David,

I fixed the bug you pointed out below. The modified stylesheet follows. The new features in this version are:

1) A named template calculates the XPath expression (as a string) for element nodes, so the XPath of the node is now included in the document hash. This helps ensure that each node is unique within the hash.

2) I am also concatenating the count of "ancestor-or-self & preceding" nodes (i.e. the union of the two axes) into the hash. This adds a further distinguishing value. It was necessary because:
   2.1 The XPath expression alone was not sufficient, and
   2.2 Counting along the ancestor-or-self axis alone was not sufficient.

My algorithm is not namespace aware. I'll try that case later.

I tested with these XML documents (which you posted below):

<x>
  <y a="2"/>
  <y/>
</x>

<x>
  <y/>
  <y a="2"/>
</x>

They are reported as "Not equal", while identical documents are reported as "Equal".

I hope this version is better (and probably bug free).

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

<xsl:output method="text" />

<!-- parameter for "ignoring white-space only text nodes" during comparison -->
<!-- if iws='y', "white-space only text nodes" will not be considered during comparison -->
<xsl:param name="iws" />

<xsl:variable name="doc1" select="document('file1.xml')" />
<xsl:variable name="doc2" select="document('file2.xml')" />

<xsl:template match="/">
  <!-- store hash of 1st document into a variable; it is the concatenation of names and values of all nodes -->
  <xsl:variable name="one">
    <xsl:for-each select="$doc1//@*">
      <xsl:sort select="name()" />
      <xsl:variable name="expr">
        <xsl:call-template name="constructXPathExpr">
          <xsl:with-param name="node" select=".." />
          <xsl:with-param name="xpath" select="name(..)" />
        </xsl:call-template>
      </xsl:variable>
      <xsl:value-of select="concat($expr,'/@',name(),':',.)" />:<xsl:value-of select="count(../ancestor-or-self::node() | ../preceding::node())" />
    </xsl:for-each>
    <xsl:choose>
      <xsl:when test="$iws='y'">
        <xsl:for-each select="$doc1//node()[not(normalize-space(self::text()) = '')]">
          <xsl:variable name="expr">
            <xsl:call-template name="constructXPathExpr">
              <xsl:with-param name="node" select="ancestor-or-self::*[1]" />
              <xsl:with-param name="xpath" select="name(ancestor-or-self::*[1])" />
            </xsl:call-template>
          </xsl:variable>
          <xsl:value-of select="concat($expr,'/',name(),':',.)" />:<xsl:value-of select="count(ancestor-or-self::node() | preceding::node())" />
        </xsl:for-each>
      </xsl:when>
      <xsl:otherwise>
        <xsl:for-each select="$doc1//node()">
          <xsl:variable name="expr">
            <xsl:call-template name="constructXPathExpr">
              <xsl:with-param name="node" select="ancestor-or-self::*[1]" />
              <xsl:with-param name="xpath" select="name(ancestor-or-self::*[1])" />
            </xsl:call-template>
          </xsl:variable>
          <xsl:value-of select="concat($expr,'/',name(),':',.)" />:<xsl:value-of select="count(ancestor-or-self::node() | preceding::node())" />
        </xsl:for-each>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:variable>

  <!-- store hash of 2nd document into a variable; it is the concatenation of names and values of all nodes -->
  <xsl:variable name="two">
    <xsl:for-each select="$doc2//@*">
      <xsl:sort select="name()" />
      <xsl:variable name="expr">
        <xsl:call-template name="constructXPathExpr">
          <xsl:with-param name="node" select=".." />
          <xsl:with-param name="xpath" select="name(..)" />
        </xsl:call-template>
      </xsl:variable>
      <xsl:value-of select="concat($expr,'/@',name(),':',.)" />:<xsl:value-of select="count(../ancestor-or-self::node() | ../preceding::node())" />
    </xsl:for-each>
    <xsl:choose>
      <xsl:when test="$iws='y'">
        <xsl:for-each select="$doc2//node()[not(normalize-space(self::text()) = '')]">
          <xsl:variable name="expr">
            <xsl:call-template name="constructXPathExpr">
              <xsl:with-param name="node" select="ancestor-or-self::*[1]" />
              <xsl:with-param name="xpath" select="name(ancestor-or-self::*[1])" />
            </xsl:call-template>
          </xsl:variable>
          <xsl:value-of select="concat($expr,'/',name(),':',.)" />:<xsl:value-of select="count(ancestor-or-self::node() | preceding::node())" />
        </xsl:for-each>
      </xsl:when>
      <xsl:otherwise>
        <xsl:for-each select="$doc2//node()">
          <xsl:variable name="expr">
            <xsl:call-template name="constructXPathExpr">
              <xsl:with-param name="node" select="ancestor-or-self::*[1]" />
              <xsl:with-param name="xpath" select="name(ancestor-or-self::*[1])" />
            </xsl:call-template>
          </xsl:variable>
          <xsl:value-of select="concat($expr,'/',name(),':',.)" />:<xsl:value-of select="count(ancestor-or-self::node() | preceding::node())" />
        </xsl:for-each>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:variable>

  <!-- the documents are considered equal when the two hash strings match -->
  <xsl:choose>
    <xsl:when test="$one = $two"> Equal </xsl:when>
    <xsl:otherwise> Not equal </xsl:otherwise>
  </xsl:choose>
</xsl:template>

<!-- a template to construct an XPath expression, for a given element node -->
<xsl:template name="constructXPathExpr">
  <xsl:param name="node" />
  <xsl:param name="xpath" />
  <xsl:choose>
    <xsl:when test="$node/parent::*">
      <!-- prepend the parent's name and recurse towards the root -->
      <xsl:call-template name="constructXPathExpr">
        <xsl:with-param name="node" select="$node/parent::*" />
        <xsl:with-param name="xpath" select="concat(name($node/parent::*),'/',$xpath)" />
      </xsl:call-template>
    </xsl:when>
    <xsl:otherwise>
      <xsl:value-of select="concat('/',$xpath)" />
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>

</xsl:stylesheet>

I'll also explain a few other things that should make the algorithm easier to understand.

1) For attribute nodes, I construct the XPath expression of their parent element.

2) The named template constructXPathExpr accepts 2 arguments: the element node itself and the element node's name.

   2.1 For non-attribute nodes, the parameters are written like this:

       <xsl:with-param name="node" select="ancestor-or-self::*[1]" />
       <xsl:with-param name="xpath" select="name(ancestor-or-self::*[1])" />

       (So we get the nearest element node along the ancestor-or-self axis.)

   2.2 For attribute nodes, they are written like this:

       <xsl:with-param name="node" select=".." />
       <xsl:with-param name="xpath" select="name(..)" />

       (This points to the attribute's parent element.)

I'll be happy if you (or others!) can test the stylesheet further and report any defects. I'd be obliged.

Regards,
Mukul

> --- David Carlisle <davidc@xxxxxxxxx> wrote:
> >
> >   <xsl:for-each select="$doc1//@*">
> >     <xsl:sort select="name()" />
> >     <xsl:value-of select="name()" />:<xsl:value-of select="." />:<xsl:value-of select="name(..)" />:<xsl:value-of select="count(../ancestor-or-self::node())" />
> >   </xsl:for-each>
> >
> > No. You can't use //@* for this at all.
> > You have to normalise the attributes for each element separately, ie
> > inline the string from each attribute along with the string for each
> > element.
> >
> > <x>
> >   <y a="2"/>
> >   <y/>
> > </x>
> >
> > is equal to
> >
> > <x>
> >   <y/>
> >   <y a="2"/>
> > </x>
> >
> > by the above as you only record that the a attribute is on a level 2 y
> > element, you don't record which element it is on.
> >
> > What is your definition of equality that you are trying to implement?
> >
> > This definition (even if corrected) is not namespace aware so
> > <x:foo xmlns:x="a"/> would be different from <y:foo xmlns:y="a"/>
> > but equal to <x:foo xmlns:x="b"/>
> > so the definition of equality wouldn't be much use for any XPath use,
> > two "equal" inputs would behave differently as input to a stylesheet.
> >
> > David
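
David's two suggestions quoted above (inline each element's attribute strings with the string for that element, and compare namespace URIs rather than prefixes) can be sketched roughly as follows. This is only an illustrative sketch, not the stylesheet posted in this message: the mode name "sig" and the delimiter characters are assumptions made here, while file1.xml, file2.xml and the iws parameter are carried over from the stylesheet in the post. Comments and processing instructions are ignored, and values that themselves contain the delimiter characters could in principle produce colliding signatures.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <xsl:output method="text" />

  <!-- iws='y' skips whitespace-only text nodes (same parameter name as in the post) -->
  <xsl:param name="iws" />

  <xsl:variable name="doc1" select="document('file1.xml')" />
  <xsl:variable name="doc2" select="document('file2.xml')" />

  <xsl:template match="/">
    <!-- build one signature string per document, walking each tree in document order -->
    <xsl:variable name="one">
      <xsl:apply-templates select="$doc1/node()" mode="sig" />
    </xsl:variable>
    <xsl:variable name="two">
      <xsl:apply-templates select="$doc2/node()" mode="sig" />
    </xsl:variable>
    <xsl:choose>
      <xsl:when test="string($one) = string($two)">Equal</xsl:when>
      <xsl:otherwise>Not equal</xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- element: expanded name, then its own attributes (sorted), then its children -->
  <xsl:template match="*" mode="sig">
    <xsl:text>&lt;</xsl:text>
    <xsl:value-of select="concat('{', namespace-uri(), '}', local-name())" />
    <xsl:for-each select="@*">
      <!-- attribute order is not significant, so sort by namespace URI, then local name -->
      <xsl:sort select="namespace-uri()" />
      <xsl:sort select="local-name()" />
      <xsl:value-of select="concat(' {', namespace-uri(), '}', local-name(), '=[', ., ']')" />
    </xsl:for-each>
    <xsl:text>&gt;</xsl:text>
    <xsl:apply-templates select="node()" mode="sig" />
    <xsl:text>&lt;/&gt;</xsl:text>
  </xsl:template>

  <!-- text: emitted in brackets, optionally dropping whitespace-only nodes -->
  <xsl:template match="text()" mode="sig">
    <xsl:if test="not($iws = 'y' and normalize-space() = '')">
      <xsl:value-of select="concat('[', ., ']')" />
    </xsl:if>
  </xsl:template>

  <!-- comments and processing instructions fall through to the built-in rule, which outputs nothing -->

</xsl:stylesheet>
```

Because each attribute is written inside the signature of the element that carries it, the two x/y documents from the thread should come out as "Not equal" without needing an XPath string or a node count. Using namespace-uri() and local-name() instead of name() means that <x:foo xmlns:x="a"/> and <y:foo xmlns:y="a"/> should produce the same signature, while <x:foo xmlns:x="b"/> should not.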