Subject: [xsl] [answered] collecting multiple tokenize() results into one sequence
From: Lars Huttar <huttarl@xxxxxxxxx>
Date: Thu, 24 Jul 2008 10:45:14 -0500
Hello, I spent a good while writing this post, then found the answer before I posted it! But I think I'll go ahead and post, in case it helps somebody (e.g. me) find the answer in the future when facing a similar problem.
I'm trying to take advantage of XSLT 2.0 features to create an index of keywords.
<items>
  <item>
    <rec>001</rec>
    <name>7-Zip</name>
    <meta>zip,compress,uncompress,rar,archive</meta>
    ...
  </item>
  <item>...</item>
</items>
There are many <item> elements, each with a <meta> child. I want to tokenize the contents of the many //item/meta elements into one long sequence of strings. Then I can loop over the distinct values of the resulting sequence (in alphabetical order) to output an index section for each keyword.
My current attempt is:

  <!-- gather all tokens into one sequence of strings, then group by identical strings -->
  <xsl:for-each-group select="for $tags in //item/meta return tokenize($tags, ',')" group-by=".">
    <xsl:sort select="." /> <!-- alphabetical order -->
    <h2><xsl:value-of select="."/>:</h2>
    <ul>
      <xsl:for-each select="//item[contains(meta, .)]">
        <xsl:sort select="name"/>
        <li>...

But this gives me (in Saxon 9B) the compile-time error:

  Error on line 208 of file:....xsl:
    XPTY0020: Cannot select a node here: the context item is an atomic value
Line 208 is the xsl:for-each (not the for-each-group). I don't understand why it is a problem that the context item (which should be a string, the first thing in current-group(), right?) is an atomic value. The select of the for-each on line 208 does not depend on the context item, does it? I tried replacing "." with "current-grouping-key()" on that line, but it made no difference; same error.
==============================

OK, after going around several iterations, I have gotten to the root of the matter: why does Saxon say "Cannot select a node here: the context item is an atomic value" for the xsl:for-each select="//item[contains(meta, .)]"? No doubt some of you already know this. I found the answer at
http://www.oxygenxml.com/archives/xsl-list/200510/msg00444.html
> Because "absolute paths" are not absolute at all: they select relative to the root of the tree containing the context node. You've got to know which document to look in.
Sure enough. You can't select "//item..." because that path is relative to the current document, which is determined from the current node, and is undetermined when the context item is not a node (e.g. just a string).
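To make that concrete, here is a minimal sketch of the situation, assuming input shaped like the <items> sample above with exactly one <meta> per <item>. The variable name $root and the two hard-coded keywords are only illustrative. The document node is captured while the context item is still a node, and that variable is then used inside a loop whose context item is a string.

```xslt
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Here the context item is the document node, so "/" is well defined. -->
    <xsl:variable name="root" select="/"/>

    <!-- Iterating over strings: inside this loop the context item is atomic. -->
    <xsl:for-each select="('zip', 'rar')">
      <!-- Writing count(//item[...]) here would raise XPTY0020, because a
           leading "//" starts from the root of the tree containing the
           context node, and there is no context node.
           $root names the document explicitly, so the path works. -->
      <xsl:value-of separator=""
          select="., ': ',
                  count($root//item[tokenize(meta, ',') = current()]),
                  '&#10;'"/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Note that current() is used in the predicate rather than ".", since "." inside the predicate refers to the <item> being tested, not to the string being iterated over.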
I guess the lesson here for me is to take Saxon's error messages more seriously; when they don't make sense, google them and find out what they mean.
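A suggestion for improvement for Saxon: the error would be clearer if it changed the part that says "cannot select a node here". That part seems misleading, even once you've figured out what it means. From what I understand now, you CAN select nodes "here" (if "here" means "with the context item being what it is", rather than some syntactic consideration); you just have to be more absolute, by specifying what document you're talking about through a variable that was bound while the context item was still a node:

  <xsl:for-each select="$root//item[contains(meta, current-grouping-key())]">

or

  <xsl:for-each select="$all-items[contains(meta, current-grouping-key())]">

(Here $root would hold the source document node and $all-items the sequence of item elements. Note that document('') will not do for this, since it returns the stylesheet itself rather than the source document. Also, inside the predicate "." is the <item> being tested, so the keyword has to come from current-grouping-key() or a variable.)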
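For the archives, here is a sketch of how the whole keyword index could look once the path is anchored on a variable. It assumes the <items> input shown earlier, with exactly one <meta> per <item> and no whitespace around the commas. The wrapping <div>, the $keyword variable, and the <li> content (just the item name) are placeholders of mine, since the original <li> body is elided. It also swaps contains() for an exact token comparison, which is a change from my attempt: contains() would file an item tagged "uncompress" under "compress" as well.

```xslt
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <xsl:template match="/">
    <!-- Bound while the context item is the document node. -->
    <xsl:variable name="all-items" select="//item"/>

    <div>
      <!-- Gather all tokens into one sequence of strings,
           then group identical strings together. -->
      <xsl:for-each-group
          select="for $tags in $all-items/meta return tokenize($tags, ',')"
          group-by=".">
        <xsl:sort select="."/> <!-- alphabetical order -->
        <xsl:variable name="keyword" select="current-grouping-key()"/>

        <h2><xsl:value-of select="$keyword"/>:</h2>
        <ul>
          <!-- No leading "/" or "//" here: the context item is a string. -->
          <xsl:for-each select="$all-items[tokenize(meta, ',') = $keyword]">
            <xsl:sort select="name"/>
            <li><xsl:value-of select="name"/></li>
          </xsl:for-each>
        </ul>
      </xsl:for-each-group>
    </div>
  </xsl:template>
</xsl:stylesheet>
```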
It may be obvious to Michael Kay that "cannot select a node here" means you have to specify what document you're talking about, but I think most of the time, most of us don't even remember that "/..." isn't really absolute.
In fact, the XPath 1.0 spec specifically says "An absolute location path consists of / optionally followed by a relative location path." Then it explains (contradictorily, in light of what Michael Kay said above), "A / by itself selects the root node of the document containing the context node."
I see that "absolute" is not there in the XPath 2.0 spec. Also, the descriptions of "/" and "//" make the error conditions crystal clear:
A "/" at the beginning of a path expression is an abbreviation for the initial step fn:root(self::node()) treat as document-node()/ (however, if the "/" is the entire path expression, the trailing "/" is omitted from the expansion.) The effect of this initial step is to begin the path at the root node of the tree that contains the context node. If the context item is not a node, a type error is raised [err:XPTY0020]. At evaluation time, if the root node above the context node is not a document node, a dynamic error is raised [err:XPDY0050]. (Similarly for "//".)
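Spelled out as a stylesheet, the expansion looks roughly like the sketch below. The variable names are mine, and the parentheses around the "treat as" expression are added so that the expansion parses as a standalone XPath 2.0 expression. The self::node() step is what makes the context item matter: an axis step needs a context node, so with a string as context item the same XPTY0020 is raised.

```xslt
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- The abbreviated form. -->
    <xsl:variable name="short" select="//item"/>

    <!-- The same path with the leading "//" written out in full. -->
    <xsl:variable name="long"
        select="(root(self::node()) treat as document-node())
                /descendant-or-self::node()/item"/>

    <!-- Both variables should hold the same nodes when the context item
         is a node in a tree rooted at a document node. -->
    <xsl:value-of select="count($short) = count($long)
                          and empty($short except $long)"/>
  </xsl:template>
</xsl:stylesheet>
```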
But who reads the new description of "/" if they already know XPath from 1.0?? :-)
How to clarify the error message? What about (borrowing language from the above paragraph): "Invalid initial / or // in path step: Cannot select the root node of the tree that contains the context node, because the context item is not a node."
It's a little long, but given that I'm not the first one who has been unhelped by the existing error message, wouldn't it be worth it to make this somewhat obscure problem clearer?