RE: [xsl] how to sort a list of xpaths

Subject: RE: [xsl] how to sort a list of xpaths
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Sun, 20 Jan 2008 09:18:00 -0000
You don't actually say what sort order you want, but the implication is that
you want an order such that if one rule subsumes another (in the sense that
A subsumes B if the set of things matched by B is a subset of those matched
by A) then B should precede A in the sort order.

In the example you have given, it doesn't require a schema to determine that
//class subsumes /section/class or that /section/class subsumes
/section/class[@type='BB']. Of course, given a schema you can do more
sophisticated analysis, but I would start with the basics.

Although you've expressed your problem in terms of xpaths, your examples are
all XSLT patterns, and you describe the semantics in terms of matching; so
another useful simplification would be to restrict yourself to patterns.

I think the problem then becomes tractable provided you don't try to be too
clever, for example trying to detect that A[contains(., 'abc')] subsumes
A[starts-with(., 'abc')]. However, it's not going to be easy. Your first
step is to get an XML representation of the expression tree (for example,
use XQueryX, or use the output of Saxon "explain"); then find a sort routine
that uses a callback to compare two items in the sequence; and implement
this callback to compare the two expressions. This would use a number of
rules, for example that EXP subsumes EXP[P] and that //EXP subsumes PATH/EXP.
There might be quite a few of these rules.
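
As a rough illustration of the two rules just mentioned, here is a minimal
Python sketch. It is not anything that exists in Saxon, and it works on the
pattern text directly rather than on an XML expression tree. The names
parse_pattern, subsumes and sort_rules are hypothetical. It assumes patterns
of the form /a/b[p]/c or //a/b[p] only (child steps, simple predicates
compared textually). Because subsumption is only a partial order, the sketch
does not use a pairwise comparison callback; it sorts each rule by how many
other rules it strictly subsumes, which puts every subsumed rule ahead of
the rule that subsumes it.

```python
import re
from typing import FrozenSet, List, NamedTuple, Tuple

Step = Tuple[str, FrozenSet[str]]  # (element name, set of predicate texts)


class Pattern(NamedTuple):
    anchored: bool            # True for /a/b, False for //a/b
    steps: Tuple[Step, ...]


_STEP = re.compile(r"^([\w.-]+)((?:\[[^\]]*\])*)$")


def parse_pattern(text: str) -> Pattern:
    """Parse the restricted pattern syntax assumed by this sketch."""
    if text.startswith("//"):
        anchored, rest = False, text[2:]
    elif text.startswith("/"):
        anchored, rest = True, text[1:]
    else:
        raise ValueError("unsupported pattern: " + text)
    steps = []
    for part in rest.split("/"):
        m = _STEP.match(part)
        if m is None:
            raise ValueError("unsupported step: " + part)
        # Normalise quote style so [@t='x'] and [@t="x"] compare equal.
        preds = frozenset(
            p.replace("'", '"') for p in re.findall(r"\[([^\]]*)\]", m.group(2))
        )
        steps.append((m.group(1), preds))
    return Pattern(anchored, tuple(steps))


def _step_subsumes(a: Step, b: Step) -> bool:
    # Rule: EXP subsumes EXP[P] (same name, a's predicates are a subset of b's).
    return a[0] == b[0] and a[1] <= b[1]


def subsumes(a: str, b: str) -> bool:
    """True if everything matched by b is also matched by a (within the rules)."""
    pa, pb = parse_pattern(a), parse_pattern(b)
    n = len(pa.steps)
    if pa.anchored:
        # An absolute path can only subsume an absolute path of equal length.
        if not pb.anchored or len(pb.steps) != n:
            return False
        tail = pb.steps
    else:
        # Rule: //EXP subsumes PATH/EXP (compare against b's trailing steps).
        if len(pb.steps) < n:
            return False
        tail = pb.steps[-n:]
    return all(_step_subsumes(x, y) for x, y in zip(pa.steps, tail))


def sort_rules(rules: List[str]) -> List[str]:
    """Order rules so that a subsumed rule precedes any rule subsuming it."""
    def breadth(rule: str) -> int:
        return sum(
            1 for other in rules
            if subsumes(rule, other) and not subsumes(other, rule)
        )
    return sorted(rules, key=breadth)  # stable: ties keep their input order
```

With the five rules from the original question, sort_rules moves //entry to
the end, after //entry[@type="italic"], and leaves the other rules in their
original relative order. Anything outside the assumed syntax raises
ValueError rather than being guessed at.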

I've actually been thinking of doing this kind of analysis for XSLT template
matching for a while, but there's nothing in Saxon that currently does it,
beyond the basic logic to calculate the default priority of the match
pattern.

Michael Kay
http://www.saxonica.com/  

> -----Original Message-----
> From: Mark Hutchinson [mailto:mark@xxxxxxxxxxx] 
> Sent: 20 January 2008 07:11
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: [xsl] how to sort a list of xpaths
> 
> 
> I have an application that uses xpaths to identify which rule 
> should be executed for a given tag in a source file.
> 
> eg given a source file of:
> <section>
>     <class type="AA">
>        <entry>entry text</entry>
>     </class>
>     <class type="BB">
>        <entry type="italic">text text</entry>
>        <entry>text text</entry>
>        <class>
>           <entry>subclass entry text</entry>
>        </class>
>     </class>
> </section>
> 
> My application might have the following rules :
> 
>    1. /section/class[@type="AA"]/entry
>    2. //class
>    3. /section/class[@type="BB"]/entry
>    4. //entry
>    5. //entry[@type="italic"]
> 
> My application searches the rules from the top down, 
> stopping once a match has been found. i.e. the first class 
> (AA) matches with rule 1 and this is correct. The second 
> class (BB) matches with rule 2 which is, obviously, not correct.
> 
> Here's my question (finally!) - does anyone have a utility or 
> xslt, perl etc that can sort these rules based on a schema?
> 
> Regards
> Mark H.
