Subject: Re: [xsl] Comparing documents: what of P is a subset of D?
From: Michael Kay <mike@xxxxxxxxxxxx>
Date: Thu, 27 Feb 2014 09:55:43 +0000

It would be easier to understand the problem with some example data.

Michael Kay
Saxonica

On 27 Feb 2014, at 08:05, Wolfgang Laun <wolfgang.laun@xxxxxxxxx> wrote:

> The data model for a set of similarly (but not identically) built XML
> documents is: a collection of arrays of records, which may contain
> (recursively) arrays, records and scalars. (The terms "array" and
> "record" are used in their "classic" meaning as, e.g., in Pascal.)
> Document structures are fairly stable, but they do change over time.
> Array elements are identified (indexed) by @_ix, not by position.
> Record fields can be elements or attributes (when they are scalar).
> Order is undefined, since XPaths plus @_ix values pinpoint each node.
> 
> One XML document D contains a full population for such a data set
> (O(1MB)). A second XML document P contains "patches", i.e., each node
> appearing in P is expected to be in D as well.
> 
> If S(P) is the sequence of nodes (annotated with their XPaths) in P
> and S(D) the one with nodes from D, how can I determine S(P) intersect
> S(D) (except all @_ix, whose values are bound to be identical)? Of
> course, I don't want the common set of *data items* - I want the XML
> paths of those common data items.
> 
> A solution (in XSLT 2.0) should not need individual adaptation for each
> kind of data set.
> 
> I'm confident that I can create text files for D and P containing one
> line <path> <value> for each node and run diff (after sort).
> 
> Any better ideas?
> 
> Cheers
> Wolfgang
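
The "one line <path> <value> per node, then compare" idea from the quoted
message can be sketched without per-data-set adaptation. What follows is a
minimal illustration in Python rather than XSLT 2.0, under stated
assumptions: the sample documents, the names flatten and common_paths, and
the reading of "common" as "same path and same scalar value" are all
hypothetical and do not come from the thread. Array elements are
distinguished by @_ix rather than by position, and @_ix itself is excluded
from the comparison.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; the thread itself gives no example documents.
D_XML = ('<data><items>'
         '<item _ix="1" color="red"><name>a</name></item>'
         '<item _ix="2" color="blue"><name>b</name></item>'
         '</items></data>')
P_XML = ('<data><items>'
         '<item _ix="2" color="blue"><name>x</name></item>'
         '<item _ix="3"><name>c</name></item>'
         '</items></data>')


def flatten(root):
    """Map the path of every scalar node (leaf element or attribute) to its value."""
    out = {}

    def step(elem):
        # Array elements are identified by @_ix, never by position.
        ix = elem.get("_ix")
        return elem.tag if ix is None else f'{elem.tag}[@_ix="{ix}"]'

    def walk(elem, path):
        for name, value in elem.attrib.items():
            if name != "_ix":  # @_ix is part of the path, not a data item
                out[f"{path}/@{name}"] = value
        children = list(elem)
        if children:
            for child in children:
                walk(child, f"{path}/{step(child)}")
        else:
            out[path] = (elem.text or "").strip()

    walk(root, "/" + step(root))
    return out


def common_paths(d_xml, p_xml, compare_values=True):
    """Return the sorted paths of P's scalar nodes that also occur in D."""
    d = flatten(ET.fromstring(d_xml))
    p = flatten(ET.fromstring(p_xml))
    if compare_values:
        return sorted(path for path, value in p.items() if d.get(path) == value)
    return sorted(path for path in p if path in d)


if __name__ == "__main__":
    for path in common_paths(D_XML, P_XML):
        print(path)
```

With the sample data above, only the color attribute of the item with
_ix="2" is reported. Passing compare_values=False also reports that item's
name element, whose path exists in both documents but whose value differs.
Which of the two readings is wanted is exactly what example data would
settle.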