Re: [xsl] Friday challenge: XSLT thats creates XPaths for meaningfully equivalent comparisons of XML files

Subject: Re: [xsl] Friday challenge: XSLT thats creates XPaths for meaningfully equivalent comparisons of XML files
From: "Dimitre Novatchev" <dnovatchev@xxxxxxxxx>
Date: Fri, 13 Apr 2007 08:28:44 -0700
Hi Andrew,

> > 2. What is even more important, even if all issues described in 1.
> > above have been solved/agreed-upon, the fact that the result of an
> > XSLT 2.0 transformation is the same as the result of an XSLT 1.0
> > transformation of a given document *does not guarantee* that the two
> > transformations will have the same result when applied on another xml
> > document.
> >
> > To put it in other words, the proposed tool will be effective in
> > showing that two transformations do not produce the same results, but
> > it cannot be used in ascertaining that two transformations will always
> > produce the same result.
>
> I agree, but I'm talking about infoset equivalence rather than lexical
> equivalence.
>
> cheers
> andrew


Point 2. above has nothing to do with lexical equivalence. It just says that we cannot conclude that two transformations will produce equal/equivalent results on *any* document merely because they produce equal/equivalent results on all the documents we have tested them with so far.


To put it even more simply, this may be a good tool for saying "No", but it cannot be used to say "Yes".
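
A minimal sketch of the "No, but never Yes" point, written in Python rather than XSLT so that it runs without an XSLT processor. The two transform functions are hypothetical stand-ins for an XSLT 1.0 stylesheet and its 2.0 upgrade; nothing in them comes from the actual stylesheets under discussion.

```python
def transform_v1(value: int) -> int:
    # Hypothetical stand-in for the original transformation.
    return abs(value)


def transform_v2(value: int) -> int:
    # Hypothetical stand-in for the upgraded transformation.
    # It agrees with transform_v1 on every non-negative input only.
    return value


def first_disagreement(inputs):
    """Return the first input on which the two transforms differ, else None."""
    for item in inputs:
        if transform_v1(item) != transform_v2(item):
            return item
    return None


tested = [0, 1, 2, 50]
# None here only means "no difference found on these inputs".
print(first_disagreement(tested))
# One counterexample is enough for a definite "No".
print(first_disagreement(tested + [-3]))
```

The first call finds nothing because the test set happens to contain no negative number, which is exactly why a clean run cannot be read as "Yes".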


--
Cheers,
Dimitre Novatchev
---------------------------------------
Truly great madness cannot be achieved without significant intelligence.
---------------------------------------
To invent, you need a good imagination and a pile of junk
-------------------------------------
You've achieved success in your field when you don't know whether what
you're doing is work or play



On 4/13/07, Andrew Welch <andrew.j.welch@xxxxxxxxx> wrote:
On 4/13/07, Dimitre Novatchev <dnovatchev@xxxxxxxxx> wrote:
> Some quick thoughts:
>
> > <checkXML>
> >   <xml src="file:/C:/test.xml">
> >      <check>/root[1]/foo[1]/text[1] = 'foo'</check>
> >      <check>/root[1]/foo[1]/@fooatt = 'att'</check>
> >      <check>/root[1]/bar[1]/text[1] = 'bar'</check>
> >      <check>/root[1]/bar[2]/text[1] = 'baz'</check>
> >   </xml>
> > </checkXML>
>
>
>   1. Checking if the above XPath expressions all evaluate to true() is
> not a guarantee that the two documents are the same. One of them could
> be a prefix (has all of the first N nodes in document order of the
> other document, but the other document has still more nodes after the
> "first N nodes").  Therefore, an essential XPath expression that is
> missing is:
>      count(//node() | //@* | //namespace::*)  = N
>
> This XPath expression illustrates also that according to our
> definition of "document equality" some of its subexpressions and the
> right-hand-side of the equality test above may differ when "equality"
> is defined in a different way  -- for example, do all attribute and
> namespace nodes matter, do we take into account comment nodes and/or
> processing instructions, ..., etc.
>
>  There are even such people, according to whom the following are different:
>
>     <someElement/>
>
> and
>
>     <someElement></someElement>
>
> and a lot of similar purely lexical differences (escaped text or
> CDATA, double or single quotes, explicit declaration of a namespace
> node inherited from the parent, order of attributes, ..., etc.)
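
The "prefix" problem quoted above can be sketched as follows, assuming Python's standard xml.etree.ElementTree in place of a real XPath engine. The two sample documents and the helper names are made up for illustration, and node_count only approximates the quoted count(...) expression: ElementTree drops comments and processing instructions by default and has no namespace nodes.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample documents: PREFIX lacks the last <bar> of FULL.
FULL = "<root><foo fooatt='att'>foo</foo><bar>bar</bar><bar>baz</bar></root>"
PREFIX = "<root><foo fooatt='att'>foo</foo><bar>bar</bar></root>"


def node_count(root):
    """Count elements, attributes and non-blank text nodes in the tree."""
    total = 0
    for elem in root.iter():
        total += 1 + len(elem.attrib)
        total += sum(1 for t in (elem.text, elem.tail) if t and t.strip())
    return total


def prefix_checks_pass(root):
    """Value checks generated from PREFIX, using ElementTree's XPath subset."""
    foo = root.find("./foo[1]")
    bar = root.find("./bar[1]")
    return (
        foo is not None
        and foo.text == "foo"
        and foo.get("fooatt") == "att"
        and bar is not None
        and bar.text == "bar"
    )


full, prefix = ET.fromstring(FULL), ET.fromstring(PREFIX)
print(prefix_checks_pass(full), prefix_checks_pass(prefix))  # both pass
print(node_count(full), node_count(prefix))                  # counts differ
```

Every value check passes on both documents; only the count separates them. Which node kinds go into the count is the "definition of equality" decision described in the quoted text.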

Hi Dimitre,

The point of my exercise is not to guarantee that xml documents are
canonically identical, but that they are meaningfully equivalent.
I'm not talking about lexical similarity but infoset similarity;
think HTML output.
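
As a rough illustration of comparing parsed trees instead of text, here is a sketch in Python using the standard xml.etree.ElementTree. The function names are hypothetical, and the comparison is only one possible reading of "meaningfully equivalent": it ignores empty-tag style, attribute order, quote style and CDATA versus escaping, but it still treats whitespace differences as significant and never sees comments or processing instructions.

```python
import xml.etree.ElementTree as ET


def same_tree(a, b):
    """Compare two elements by name, attributes, text and children."""
    if a.tag != b.tag or a.attrib != b.attrib:
        return False
    if (a.text or "") != (b.text or "") or (a.tail or "") != (b.tail or ""):
        return False
    if len(a) != len(b):
        return False
    return all(same_tree(x, y) for x, y in zip(a, b))


def equivalent(xml_a, xml_b):
    """Parse both strings and compare the resulting trees."""
    return same_tree(ET.fromstring(xml_a), ET.fromstring(xml_b))


# Purely lexical differences disappear once the documents are parsed.
print(equivalent("<someElement/>", "<someElement></someElement>"))
print(equivalent('<e a="1" b="2">x &amp; y</e>',
                 "<e b='2' a='1'><![CDATA[x & y]]></e>"))
# A real content difference is still reported.
print(equivalent("<e>foo</e>", "<e>bar</e>"))
```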

This is just an idea, so I'm open to suggestions.

Given the task of upgrading a set of XSLT 1.0 transforms to XSLT 2.0,
how do you ensure the output remains consistent after the upgrade?
How can you be sure that the "improvements" you've made haven't broken
any existing transforms?  A canonical set of comparisons would flood
you with insignificant results, so you're more concerned with
"meaningfully equivalent" results.  Perhaps I have the wrong approach
here...

I do use XSDs and Selenium tests, but I think there's room for another
tool that allows XSD, XPaths and XSLTs to check the correctness of the
XML.
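
One way such a tool might produce its checks, sketched in Python with the standard xml.etree.ElementTree instead of XSLT. Everything here is an assumption for illustration: the function name, the sample document, and the choice to write the text-node step as text()[1]. Namespaces, mixed content and quote escaping in values are left out.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def xpath_checks(root):
    """Return positional XPath equality checks for text and attributes."""
    checks = []

    def visit(elem, path):
        if elem.text and elem.text.strip():
            checks.append(f"{path}/text()[1] = '{elem.text}'")
        for name, value in elem.attrib.items():
            checks.append(f"{path}/@{name} = '{value}'")
        seen = Counter()  # position of each child among same-named siblings
        for child in elem:
            seen[child.tag] += 1
            visit(child, f"{path}/{child.tag}[{seen[child.tag]}]")

    visit(root, f"/{root.tag}[1]")
    return checks


# Hypothetical reference output from the known-good transformation.
doc = ET.fromstring(
    '<root><foo fooatt="att">foo</foo><bar>bar</bar><bar>baz</bar></root>'
)
for check in xpath_checks(doc):
    print(check)
```

The checks would be generated once from the output of the existing transforms and then evaluated against the output of the upgraded ones.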

> 2. What is even more important, even if all issues described in 1.
> above have been solved/agreed-upon, the fact that the result of an
> XSLT 2.0 transformation is the same as the result of an XSLT 1.0
> transformation of a given document *does not guarantee* that the two
> transformations will have the same result when applied on another xml
> document.
>
> To put it in other words, the proposed tool will be effective in
> showing that two transformations do not produce the same results, but
> it cannot be used in ascertaining that two transformations will always
> produce the same result.

I agree, but I'm talking about infoset equivalence rather than lexical
equivalence.

cheers
andrew
