Michael Kay wrote:
That's all detail to be worked out.
It turns out that using comments has the same disadvantages as they had with
the original XQuery pragma syntax: they're handled in the tokenizer which
has no knowledge of the syntactic context. So I may look at using a
different delimiter after all. But I was thinking of recognizing this syntax
only if it occurs right at the start of the expression.
I just read this thread and I find the proposal very valuable, partly
because I have an application where people are indeed allowed to add
their own XPaths and find themselves writing ugly things like
*[local-name() = 'xyz'] to bypass the namespace bindings, which is of
course bad practice.
In general, I saw two approaches come by: xmlns and special comments.
But why introduce a new syntax, and why use comments? I know it is a
popular way of writing extension mechanisms for existing standards (IE,
for one, uses it in conditional comments), but parsers, highlighters,
syntax checkers etc. are built around stripping comments.
These are all well-known drawbacks. But since this discussion seems
geared towards using comments, I feel I have to add a third option. It
has other drawbacks, of course, but at least it can be used outside
comments and it won't interfere with existing functionality or
standards: namely, using a no-op with special content. The basic idea is
simple:
("your special syntax") and rest/of/expr
("your special syntax")[2] | rest/of/expr
For example, with namespace binding syntax, this could become:
("xmlns(xmlns=http://mynamespace xmlns:you=http://yournamespace)")[2] |
path/with/you:your-namespace
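To give an idea of how little work it is to read such a string, here is a
rough sketch in Python. Nothing here is formalized: the function name
parse_xmlns_string and the exact grammar (whitespace-separated
xmlns=URI and xmlns:prefix=URI bindings inside xmlns(...)) are my own
assumptions, taken only from the example above.

```python
import re

# Assumed grammar (not formalized anywhere): each binding is either
# xmlns=URI (default namespace) or xmlns:prefix=URI, separated by whitespace.
_BINDING = re.compile(r"xmlns(?::([A-Za-z_][\w.-]*))?=(\S+)")


def parse_xmlns_string(special):
    """Return {prefix: uri}; the default namespace is stored under ''."""
    special = special.strip()
    if not (special.startswith("xmlns(") and special.endswith(")")):
        raise ValueError("not an xmlns(...) string")
    body = special[len("xmlns("):-1]
    bindings = {}
    for token in body.split():
        m = _BINDING.fullmatch(token)
        if m is None:
            raise ValueError("bad binding: " + token)
        bindings[m.group(1) or ""] = m.group(2)
    return bindings
```

A processor that does not know the syntax never calls anything like this;
it just evaluates the string as an ordinary literal.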
The most obvious drawbacks are:
1. The need for a no-op, which costs some processing power on
processors that do not understand the special syntax
2. Processors that are currently optimized to statically remove these
no-ops will have to make an exception for these cases
3. Using a string containing the syntax is not quite pretty (but that
argument can be used with comments as well)
4. Processors have to understand certain no-ops and recognize the
syntax in them (but that should not be that hard if you formalize the syntax)
The most obvious positive points I can think of are:
1. No (major) change in parser engines, as it is part of the tokenized
expression already
2. Use of existing syntax instead of inventing new syntax
3. No violation of current XPath specification
4. Forward compatible. I.e., the "special comment" syntax might someday
be used for a new construct, or some other vendor might make its own
extensions; a sequence of one string, however, will remain future-spec proof.
5. Using normal XPath escaping of strings, you can easily provide any
value for the content (or syntax) of the special-syntax string.
6. Since it is part of the XPath, the user is free to choose to create
the string dynamically.
The hardest challenge in this approach, I believe, is clearly defining
the rules (and positions in the XPath) where this "featured no-op" may
be used. I understood that one of the thoughts was to only use it at the
start of the expression, which may greatly simplify this approach.
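If the rule is indeed "only at the very start", recognition could be as
simple as the sketch below (Python again). The name split_featured_noop
is hypothetical, and I assume the ("...")[2] | form with a double-quoted
literal only; a real processor would of course do this on its own token
stream rather than with a regular expression.

```python
import re

# Assumed shape of the "featured no-op": ("...")[2] followed by the union
# operator, accepted only at the very start of the expression.
_NOOP = re.compile(
    r'''\s*\(\s*"((?:[^"]|"")*)"\s*\)\s*\[\s*2\s*\]\s*\|\s*'''
)


def split_featured_noop(xpath):
    """Return (special_string_or_None, remaining_expression)."""
    m = _NOOP.match(xpath)  # match() anchors at position 0
    if m is None:
        return None, xpath
    # XPath 2.0 escapes a quote inside a string literal by doubling it.
    special = m.group(1).replace('""', '"')
    return special, xpath[m.end():]
```

An expression that carries the same construct anywhere else is returned
untouched, so it keeps its ordinary XPath meaning.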
Cheers,
-- Abel Braaksma
http://abel.metacarpus.com