[xsl] Trying to understand root-less or document-node-less nodes

Subject: [xsl] Trying to understand root-less or document-node-less nodes
From: Abel Braaksma <abel.online@xxxxxxxxx>
Date: Mon, 17 Sep 2007 12:15:47 +0200
Hi list people!

I stumbled across this ('this' is explained later) when I realized that

  <xsl:variable name="root">
       <test1 />
       <test2 />
   </xsl:variable>

can be queried with the simple XPath expression $root/*, in other words: the children of $root. This felt non-analogous to the following:

   <xsl:function name="my:rootless">
       <test1 />
       <test2 />
   </xsl:function>

where you can, and should, query the result with the equally simple my:rootless() instead of the more intuitive and analogous my:rootless()/*.
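To make the contrast concrete, here is a minimal self-contained sketch, assuming an XSLT 2.0 processor and any input document. The namespace URI urn:example:my is a placeholder made up for the my prefix, and the two "instance of" checks are an addition that shows what kind of item each expression starts from.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:my="urn:example:my"
    exclude-result-prefixes="my">

  <xsl:output method="text"/>

  <!-- No "as" attribute: the content becomes a temporary tree,
       so $root is a document node with two element children. -->
  <xsl:variable name="root">
    <test1/>
    <test2/>
  </xsl:variable>

  <!-- The function result is the sequence itself:
       two elements that have no parent. -->
  <xsl:function name="my:rootless">
    <test1/>
    <test2/>
  </xsl:function>

  <xsl:template match="/">
    <!-- Both expressions select the two elements. -->
    <xsl:value-of select="count($root/*), count(my:rootless())"/>
    <xsl:text>&#10;</xsl:text>
    <!-- What each expression starts from. -->
    <xsl:value-of select="$root instance of document-node(),
                          my:rootless()[1] instance of element()"/>
  </xsl:template>
</xsl:stylesheet>
```

The variable wraps its content in a document node, which is why a child step is needed there, while the function hands back the elements directly.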

So much for the theory. I'm sure I am telling you nothing new here. I use the term rootless because it sounds like ruthless, but I know a better word is probably "documentnodeless", because my hunch says this function does not return a document node (???) and no node exists without a root. For readability and RSI prevention I'll switch to calling it docless now (and I rename the function in the following examples). Some observations about this that struck me as odd:

 a)  count(my:docless()) returns 2
 b)  count(my:docless()/*) returns 0
 c)  count(my:docless()/root()) returns 2????
 d)  count(my:docless()/..) returns 0
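The four counts can be checked with a sketch like the following, assuming an XSLT 2.0 processor and any input document. The namespace URI urn:example:my is a made-up placeholder; the values in the comment are the ones reported above, not independently guaranteed output.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:my="urn:example:my"
    exclude-result-prefixes="my">

  <xsl:output method="text"/>

  <!-- Returns a sequence of two parentless elements. -->
  <xsl:function name="my:docless">
    <test1/>
    <test2/>
  </xsl:function>

  <xsl:template match="/">
    <!-- Cases a, b, c and d in that order; reported above as 2, 0, 2, 0. -->
    <xsl:value-of select="count(my:docless()),
                          count(my:docless()/*),
                          count(my:docless()/root()),
                          count(my:docless()/..)"/>
  </xsl:template>
</xsl:stylesheet>
```

Case b is 0 because test1 and test2 have no child elements, and case d is 0 because neither of them has a parent.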

Suppose the above is correct, which is understandable if you consider the function my:docless() to return two root nodes (and thus, surprisingly, calling root() can return more than one), then why does the following not work?

 e) my:docless()/root()[1]  returns both nodes test1 and test2
 f)  my:docless()/root()[2] returns the empty sequence

Is it a matter of binding and precedence? My guess is that root() is called here for each node in my:docless(), with the predicate applied to each call separately, so that [1] always matches and [2] never does. Using parentheses around the node selection yields the correct result:

 g) (my:docless()/root())[1] returns node test1
 h) (my:docless()/root())[2] returns node test2
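A sketch that puts the unparenthesized and parenthesized forms side by side, assuming an XSLT 2.0 processor and any input document. The namespace URI urn:example:my is a made-up placeholder, and the results named in the comments are the ones reported above.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:my="urn:example:my"
    exclude-result-prefixes="my">

  <xsl:output method="text"/>

  <!-- Returns a sequence of two parentless elements. -->
  <xsl:function name="my:docless">
    <test1/>
    <test2/>
  </xsl:function>

  <xsl:template match="/">
    <!-- Case e: the predicate belongs to the root() step, so it is
         applied once per context node. Reported above: both nodes. -->
    <xsl:text>e: </xsl:text>
    <xsl:value-of select="my:docless()/root()[1]/name()" separator=", "/>
    <xsl:text>&#10;</xsl:text>

    <!-- Case g: the predicate filters the combined result of the
         whole path. Reported above: test1 only. -->
    <xsl:text>g: </xsl:text>
    <xsl:value-of select="(my:docless()/root())[1]/name()" separator=", "/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

Changing [1] to [2] in both expressions gives cases f and h.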

So far so good. In a way I can still follow my own analysis and it seems to fit the theory I know from the XSLT 2.0 specification. The last point, about g/h, should be in my standard toolkit of knowledge, yet I still often make mistakes with it (leaving out the parentheses when they are needed).

If we have two root nodes, as in the example above, I would expect them to be siblings of one another. But this does not appear to be correct (or does it? Maybe root nodes cannot have siblings by definition?). This is what Saxon's penultimate (as of today) version 8.9.3 yields:

i) my:docless()/following-sibling::* returns the empty sequence
j) my:docless()[1]/following-sibling::* returns the empty sequence
k) (my:docless()/root())[1]/following-sibling::* returns the empty sequence


So, to use the following-sibling or the preceding-sibling axis (plural axes, axii?) you have to create a single root node around the root-siblings. This feels redundant and far from intuitive. But what I am wondering about more: is this correct behavior, or am I just missing the obvious?
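One way to create that single root, sketched under the assumption of an XSLT 2.0 processor and any input document: copy the function result into a variable that has no "as" attribute, so that the elements become children of one document node. The namespace URI urn:example:my and the variable name wrapped are made up for illustration.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:my="urn:example:my"
    exclude-result-prefixes="my">

  <xsl:output method="text"/>

  <!-- Returns a sequence of two parentless elements. -->
  <xsl:function name="my:docless">
    <test1/>
    <test2/>
  </xsl:function>

  <xsl:template match="/">
    <!-- The elements are copied into a temporary tree, which gives
         them a common parent (a document node). -->
    <xsl:variable name="wrapped">
      <xsl:sequence select="my:docless()"/>
    </xsl:variable>

    <!-- Parentless: reported above as the empty sequence. -->
    <xsl:text>unwrapped: </xsl:text>
    <xsl:value-of select="my:docless()[1]/following-sibling::*/name()"/>
    <xsl:text>&#10;</xsl:text>

    <!-- Wrapped: the copies share a parent, so the sibling axis
         can reach the copy of test2. -->
    <xsl:text>wrapped: </xsl:text>
    <xsl:value-of select="$wrapped/*[1]/following-sibling::*/name()"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

Note that the wrapped elements are copies, so they are not identical to the nodes the function returned.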

Thanks for any insightful insights ;)

Cheers,
-- Abel Braaksma

PS: sorry for the verbosity of this post. I sometimes lack the clarity and conciseness that characterize the posts of some others on this list.
