[xsl] keys and collections

Subject: [xsl] keys and collections
From: "Birnbaum, David J" <djbpitt@xxxxxxxx>
Date: Fri, 22 Jun 2012 10:56:50 -0400
Dear XSLT-List,

Thanks to Michael and Wendell for the quick and helpful responses, which
revealed a detail I hadn't thought of (or, rather, understood). I'll
paraphrase that here to double-check whether I now have it right, trusting
that someone will jump in and correct any continuing misunderstanding.

I had asked whether it was possible to define a key on a collection of
elements drawn from different documents (short answer: it isn't possible to do
this directly, but one can apply the same key to a sequence of all of the
documents and collect the results). It turns out, though, that my question
wasn't really (or wasn't just) about using a key across *documents*, as I had
originally thought; it was about using a key to refer into an arbitrary
*sequence of elements* scattered among different documents. The structure is
that I have a set of documents with a root element called <text>, the
subelements of which are <line> and <title>. I defined a variable to refer to
those <title> and <line> elements and then wanted to be able to find the
<line> elements by their @folio attributes:

<xsl:variable name="allInput"
select="collection('file:///path/?select=*.xml')/text/*"/>
<xsl:key name="linebyfolio" match="line" use="@folio"/>

The variable $allInput creates a sequence of <title> and <line> elements and
the key should return <line> elements upon matching their @folio attributes,
but the two pieces turn out not to work together, or, at least, not very
elegantly or very directly. Michael replied:

> A key can only be used to search within a single document. You can, of
> course, construct the union of the results of searching each individual
> document using the key, like this:
>
> $allInput/key('linebyfolio',$folio)
>
> I'm not sure what you were trying to do with the /text/* so I've left that
> out of the equation.
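
As a minimal sketch, Michael's suggestion might look like this in a complete
XSLT 2.0 stylesheet. The variable name $allDocs, the "main" entry template,
the <results> wrapper, and the default folio value '1r' are assumptions made
for illustration; the collection URI is copied unchanged from the variable
shown earlier.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- No /text/* step: the variable holds one document node per file. -->
  <xsl:variable name="allDocs"
      select="collection('file:///path/?select=*.xml')"/>

  <!-- Same key as in the original post. -->
  <xsl:key name="linebyfolio" match="line" use="@folio"/>

  <!-- Hypothetical parameter: the folio to look up. -->
  <xsl:param name="folio" select="'1r'"/>

  <!-- Hypothetical named entry point. -->
  <xsl:template name="main">
    <results>
      <!-- key() is evaluated with each document node as context item;
           the path expression returns the union of the matches. -->
      <xsl:copy-of select="$allDocs/key('linebyfolio', $folio)"/>
    </results>
  </xsl:template>

</xsl:stylesheet>
```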

That last comment sheds some light, I think, on why (or how) I misunderstood
the issue initially. Because I was thinking about creating a key for a
sequence of <title> and <line> elements, I had used the "/text/*" step to
collect only those. I (thought I) had no need for their parent, the root
<text> element, or for the document node above <text>. Since the variable as
defined builds a sequence of element nodes of type <line> and <title>, and not
of documents, it turns out that it isn't suitable for use with keys, at least
as I had originally thought. I can remove the "/text/*" piece, as Michael did,
or keep the variable as currently defined and add a step that retrieves the
roots of the documents in question, as Wendell proposed:

> Assuming that $allInput were bound to an arbitrary set of elements scattered
> across a set of documents, would there be any advantage to
>
> $allInput/root()/key('linebyfolio',$folio)

Both this solution and Michael's would seem to call key() once for each
document in the collection and then create a union of the results.
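
To make the "once for each document" reading concrete, here is a hedged
sketch that keeps $allInput as originally defined and spells out the root()
variant as an explicit loop. The "main" template, the <results> wrapper, and
the default folio value '1r' are assumptions for illustration only.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Variable and key exactly as in the original post. -->
  <xsl:variable name="allInput"
      select="collection('file:///path/?select=*.xml')/text/*"/>
  <xsl:key name="linebyfolio" match="line" use="@folio"/>

  <!-- Hypothetical parameter: the folio to look up. -->
  <xsl:param name="folio" select="'1r'"/>

  <!-- Hypothetical named entry point. -->
  <xsl:template name="main">
    <results>
      <!-- $allInput/root() is a path expression, so duplicates are
           removed: each document node appears once, however many
           <line> and <title> elements it contributed. -->
      <xsl:for-each select="$allInput/root()">
        <!-- The context item is now one document node, so this
             key() call searches that document only. -->
        <xsl:copy-of select="key('linebyfolio', $folio)"/>
      </xsl:for-each>
    </results>
  </xsl:template>

</xsl:stylesheet>
```

The loop should select the same <line> elements as the one-line path
expression $allInput/root()/key('linebyfolio',$folio); it only makes the
per-document evaluation visible.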

I suppose in theory I could also retain the "/text/*" step, with

$allInput/key('linebyfolio',$folio)

That would call key() once for each <line> or <title>, which is inefficient
(and rather substantially so), so Michael's suggestion of removing that path
step is a clear improvement. Wendell's approach would avoid that inefficiency
in a different way, by essentially undoing the "/text/*" step.
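
A rough way to see the difference on a real collection is to count the
context items that each expression evaluates key() against, and to confirm
that both return the same number of lines. This is only a diagnostic sketch:
the "main" template and the default folio value '1r' are hypothetical, and
the counts say nothing about optimizations a particular processor may apply.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Variable and key exactly as in the original post. -->
  <xsl:variable name="allInput"
      select="collection('file:///path/?select=*.xml')/text/*"/>
  <xsl:key name="linebyfolio" match="line" use="@folio"/>

  <!-- Hypothetical parameter: the folio to look up. -->
  <xsl:param name="folio" select="'1r'"/>

  <!-- Hypothetical named entry point. -->
  <xsl:template name="main">
    <xsl:message>
      <!-- One context item per <line> or <title> element. -->
      <xsl:text>Context items with /text/* kept: </xsl:text>
      <xsl:value-of select="count($allInput)"/>
    </xsl:message>
    <xsl:message>
      <!-- One context item per document. -->
      <xsl:text>Context items after root(): </xsl:text>
      <xsl:value-of select="count($allInput/root())"/>
    </xsl:message>
    <xsl:message>
      <!-- Path expressions remove duplicate nodes, so both forms
           should report the same number of matching lines. -->
      <xsl:text>Lines found (per element / per document): </xsl:text>
      <xsl:value-of select="count($allInput/key('linebyfolio', $folio))"/>
      <xsl:text> / </xsl:text>
      <xsl:value-of
          select="count($allInput/root()/key('linebyfolio', $folio))"/>
    </xsl:message>
  </xsl:template>

</xsl:stylesheet>
```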

Thanks again,

David
