XPath's role (Was: Re: [xsl] Re: . in for)

Subject: XPath's role (Was: Re: [xsl] Re: . in for)
From: Jeni Tennison <jeni@xxxxxxxxxxxxxxxx>
Date: Sun, 6 Jan 2002 11:58:17 +0000
Mike Kay wrote:
>I wrote:
>>  - I think that having cut-down FLWR expressions in XPath
>>    complicates XPath unnecessarily, when a simple mapping operator
>>    would fulfil the common requirements, and xsl:for-each or recursive
>>    user-defined functions or templates can handle the rest.
>
> We did think about this very carefully, and recognized that there is
> a step-increase in complexity, which one would rather avoid, at the
> point where you introduce range variables. You need range variables
> as soon as you want to do joins; and I think the need for joins will
> increase significantly once you allow manipulation of general
> sequences. My view is that XPath should be relationally complete,
> that you should never have to drop into XSLT to combine two
> sequences to produce a third sequence, and for that, range variables
> are definitely needed. This is part of ensuring that XPath can
> handle data-oriented XML (which often includes non-hierarchic
> relationships) as well as it currently handles document structures,
> which are predominantly hierarchical.

I agree that you need range variables to join two sequences in some
cases, and that if you adhere to the principle that you should never
have to drop into XSLT to combine two sequences to produce a third
sequence then you need range variables.
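
To make the kind of join under discussion concrete, here is a rough
sketch. The input vocabulary (data, order, customer, cust, id, name)
is invented for illustration, and the for syntax is that of the final
XPath 2.0 Recommendation, which may differ in detail from the working
draft discussed here.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Assumed input: /data holds order elements with a cust attribute
       and customer elements with an id attribute and a name child. -->
  <xsl:template match="/">
    <names>
      <!-- $o and $c are the range variables; the predicate on customer
           needs $o because the context item there is the customer. -->
      <xsl:copy-of select="for $o in /data/order,
                               $c in /data/customer[@id = $o/@cust]
                           return $c/name"/>
    </names>
  </xsl:template>
</xsl:stylesheet>
```

The join itself lives entirely in the select attribute; XSLT is only
used here to give the expression somewhere to run.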

However, I am not yet convinced of that principle.

First, I do not think that XPath needs to be able to do everything all
by itself:

  "XPath is designed to be embedded in a host language such as [XSLT
   2.0] or [XQuery]."
                                      (From Section 1 of XPath 2.0 WD)

In my opinion, XPath should be kept simple - an expression language.
It should be able to do the same kinds of things as you can do on the
right side of a variable assignment in other programming languages,
though oriented towards traversing a node tree in order to access
information from XML.

XPath does not need to be able to do more than an expression language,
because these facilities should be available from the host language if
necessary. More, it *should not* do more, because to do so blurs the
line between it and the host language (causing confusion for those who
use it within the host language), and adds to the burden of
implementers and authors who use it in host languages where a full
programming language is not required (for example XPointer and
XForms).

Second, I think that the principle is ultimately unworkable, because
XPath cannot (as designed) create any sequence that you might want it
to create since you cannot use it to generate new elements (or other
nodes). I cannot use XPath to generate a sequence of token elements
from a sequence of strings, for example. I am not arguing that this
*should* be possible in XPath - merely pointing out that the host
language is an essential supplement to XPath in an environment where
it's used to create new documents.
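
As a sketch of that example, this is roughly how the host language
would do the job. It assumes an XSLT 2.0 processor and the tokenize()
function as it appears in the final Recommendations; the tokens and
token element names and the sample string are made up.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <tokens>
      <!-- XPath supplies the sequence of strings... -->
      <xsl:for-each select="tokenize('a b c', '\s+')">
        <!-- ...but only the host language can wrap each string in a
             newly created element. -->
        <token><xsl:value-of select="."/></token>
      </xsl:for-each>
    </tokens>
  </xsl:template>
</xsl:stylesheet>
```

The sequence comes from XPath, but the token elements have to be
constructed by XSLT instructions.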

Also, in many cases it will be much more practical to use something
more powerful than the for expression anyway (i.e. xsl:for-each or
FLWR expressions), because XPath does not include let clauses in for
expressions, so you can't assign variables between setting the range
variables and the return expression, making complex operations
impractical.
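
For comparison, a sketch of the same sort of join written with
xsl:for-each, where variables can be bound between the iteration and
the output. The element and attribute names (data, order, customer,
item, cust, id, price, line) are again hypothetical.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <lines>
      <xsl:for-each select="/data/order">
        <!-- Intermediate bindings, playing the role of let clauses;
             current() is the order being processed. -->
        <xsl:variable name="customer"
                      select="/data/customer[@id = current()/@cust]"/>
        <xsl:variable name="total" select="sum(item/@price)"/>
        <line customer="{$customer/name}" total="{$total}"/>
      </xsl:for-each>
    </lines>
  </xsl:template>
</xsl:stylesheet>
```

Each order is matched to its customer and summarised in one pass, with
the intermediate values named along the way.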

[Yes, these things can be achieved within an XPath expression through
an extension function, but then so can a join. You can't have it both
ways. Either calling an extension function is part of XPath (in which
case joins are feasible without for expressions including range
variables) or it is dropping into XSLT (in which case XPath isn't able
to generate all the sequences you might want to create).]

Third, even if XPath being relationally complete is a worthwhile goal,
I question whether its worth is high enough to justify the cost in
terms of the additional complexity. Definitely joins are needed when
you are *generating* something based on some information from
somewhere else. However, the role of XPath is primarily not to
generate but to query.

  "The primary purpose of XPath is to address parts of an [XML]
  document."
                         (First sentence of Section 1 of XPath 2.0 WD)

You do not combine two sequences together for the sake of it. You join
them together so that you can produce some content from the join. To
produce that content, you must use XSLT or XQuery, and they have much
more advanced features to support joins, so why not use them in the
first place?

So, in practical terms, what is the benefit of doing a join in XPath?
What do you do with the resulting sequence? I think it's very rare
that you need to use the result of a join other than to create nodes (or
portions of text nodes); in the vast majority of cases there is no
practical reason why the creation of the joined sequence has to be
done in XPath rather than in XSLT (or XQuery).

Sorry if that's a bit of a rant. It just frustrates me that XPath
seems to be developing into a cut-down XQuery rather than XQuery being
built on top of XPath (the XQuery WD even shares parts of the XPath WD
rather than referring to it). Let XQuery develop into the
strongly-typed optimiser's-dream of a query language for those that
need it, but let XPath remain an elegant, embeddable, usable, sweet
little expression language for the rest of us.

Cheers,

Jeni

---
Jeni Tennison
http://www.jenitennison.com/


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

