[xsl] Improving efficiency of an XPath expression?

Subject: [xsl] Improving efficiency of an XPath expression?
From: "Michael Beddow" <mbnospam@xxxxxxxxxxx>
Date: Fri, 7 Jun 2002 13:13:14 +0100
I have a document that consists of several thousand <lg> elements each with
varying numbers of <l> children. A simplified extract looks like this:

<example>

<lg id="G0065">
<l id="L0258">Putois li aynel afraie</l>
<l id="L0259">Gopil cleye, thesson traie</l>
<l id="L0260">Quant li venour li quer praie.</l>
</lg>

<lg id="G0066">
<l id="L0261">Ouwe jaungle, jars agroile</l>
<l id="L0262" >Ane en mareis jaroile</l>
<l id="L0263" >Mes il i ad jaroil e garoile</l>
<l id="L0264">La difference dire vous voile:</l>
</lg>

<lg id="G0067">
<l id="L0265">Li ane jaroile en rivere</l>
<l id="L0266">Si hom de falcoun la quere,</l>
<l id="L0267">Mes devant un vile en guere</l>
<l id="L0268">Afichom le garoil en tere</l>
</lg>

</example>

Now the challenge: given the id attribute value of any one <l>, find an
XPath expression that will efficiently select a node-set containing only:
the <lg> that immediately precedes the one where the <l> concerned is found,
plus the <lg> that is the parent of that <l> and the <lg> that immediately
follows the one that is the parent of the <l> specified. (e.g. "L0263" as
input id would return precisely the three <lg> elements in the example.)

It's the "efficiently" bit that's got me stuck. (NB because the number of
<l>s per <lg> is variable, there's no algorithm to derive the ids of
the parent, preceding and following <lg>s from a given <l>'s id.)

I can do it OK with

select = '(//lg[following-sibling::lg/l/@id="$targetid"][last()] |
//lg[l/@id="$targetid"] |
//lg[preceding-sibling::lg/l/@id="$targetid"][1])'

(hope I pasted that all right) but understandably enough even the zippy
libxslt/libxml2 takes its time over that one.

In practice (since I'm doing this in a sequence of server-side pipes) it
is much faster to select simply on '//lg[l/@id="$targetid"]' to get the
parent node, then parse out the id of that node, derive the ids
of its preceding and following siblings, go back in again to pull them
out separately, then concatenate the results for further processing.
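
To make that multi-step route concrete, here is a minimal sketch in Python
using only the standard-library ElementTree. It is an illustration, not the
actual pipeline. It assumes that <lg> ids are a letter prefix followed by
consecutive zero-padded numbers (as in the extract), and the names SAMPLE and
neighbouring_groups are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample data: the extract from this message, trimmed to the ids.
SAMPLE = """<example>
<lg id="G0065"><l id="L0258"/><l id="L0259"/><l id="L0260"/></lg>
<lg id="G0066"><l id="L0261"/><l id="L0262"/><l id="L0263"/><l id="L0264"/></lg>
<lg id="G0067"><l id="L0265"/><l id="L0266"/><l id="L0267"/><l id="L0268"/></lg>
</example>"""


def neighbouring_groups(root, target_id):
    """Return the <lg> before, the <lg> containing, and the <lg> after the <l>."""
    # Step 1: find the <lg> whose child <l> carries the target id.
    parent = next(
        (lg for lg in root.iter("lg")
         if lg.find(f"l[@id='{target_id}']") is not None),
        None,
    )
    if parent is None:
        return []

    # Step 2: derive the neighbours' ids from the parent's id.
    # Assumption: ids look like "G0066" and are numbered consecutively.
    match = re.fullmatch(r"([A-Za-z]+)(\d+)", parent.get("id", ""))
    if match is None:
        return [parent]
    prefix, digits = match.groups()
    number, width = int(digits), len(digits)
    wanted = [f"{prefix}{k:0{width}d}" for k in (number - 1, number, number + 1)]

    # Step 3: pull each group out by id and concatenate in document order.
    by_id = {lg.get("id"): lg for lg in root.iter("lg")}
    return [by_id[i] for i in wanted if i in by_id]


root = ET.fromstring(SAMPLE)
print([lg.get("id") for lg in neighbouring_groups(root, "L0263")])
# prints ['G0065', 'G0066', 'G0067']
```

At the edges of the document the sketch simply returns fewer groups (for
"L0258" only G0065 and G0066), since the derived id of the missing neighbour
is not found.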

But can any resident wizard give me a better XPath expression
that will still grab my target set in one go, only faster than my
offering above?

Michael
---------------------------------------------------------
Michael Beddow   http://www.mbeddow.net/
XML and the Humanities page:  http://xml.lexilog.org.uk/
The Anglo-Norman Dictionary http://anglo-norman.net/
---------------------------------------------------------


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

