Subject: Re: [xsl] Function for determining one XPath as subset of another From: "Liam R. E. Quin liam@xxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> Date: Tue, 26 Jan 2016 21:35:25 -0000 |
On Tue, 2016-01-26 at 16:15 +0000, Adam Retter adam.retter@xxxxxxxxxxxxxx wrote: > Given two simple XPaths, say: > > 1. //w > > 2. /x/y/z/w[@a = 'v'] >B [...] > I wondered if there might be an algorithm or library that someone > already had or has written which might be able to give me the answer? There's quite a bit of literature on optimizing XPath in the context of databases (e.g. in proc. VLDB, in "XML-Based Data Management and Multimedia Engineering" and elsewhere) but I don't know of a single coherent summary. This may be at least in part because what's useful and practical depends on the implementation and the situation. For example, a system with an element occurrence index might solve //w in O(n_w) time, where n_w is the number of occurrences of w elements in the corpus being searched; one using a tree structure might do better with the explicit path. A path like //a//b//c//d is much harder using tree navigation, but rewritten as d[ancestor::c[ancestor::b[ancestor::a]]] against an element index will likely run in time close to O(n_d log D) where D is the average depth of the tree. A system smart enough to notice that there are only three "c" elements in the corpus could do better than O(n_d) though (my text retrieval system did/does that to speed up multi-word phrase searches, for example). We encountered each of these implementation strategies during the development of XQuery. B The cost of building an element index for a non-database-backed XPath implementation is in some systems amortized by doing it during parsing, while the input XML is being read; systems that run XPath against a stored or in-memory tree might not have that luxury, or might have the even greater luxury of a precomputed index. I don't know of anything off the shelf, and although open source XQuery implementations might be a place to look, it's likely that optimizations are done on an internal representation of the XPath expression. > > I realise that I can only probably cover a subset of XPath itself, > but > it is only the path steps with predicates which I am interested in. > > Ideally I am looking for something in Java.
Current Thread |
---|
|
<- Previous | Index | Next -> |
---|---|---|
Re: [xsl] Function for determining , Christopher R. Maden | Thread | [xsl] xsl:for-each-group question, Markus Ohlenroth mar |
Re: [xsl] Function for determining , Christopher R. Maden | Date | Re: [xsl] Function for determining , Michael Kay mike@xxx |
Month |