Re: [xsl] Proposed syntax for namespace binding in XPath

Subject: Re: [xsl] Proposed syntax for namespace binding in XPath
From: Abel Braaksma <abel.online@xxxxxxxxx>
Date: Wed, 04 Apr 2007 20:22:23 +0200
Michael Kay wrote:
> That's all detail to be worked out.
>
> It turns out that using comments has the same disadvantages as they had with
> the original XQuery pragma syntax: they're handled in the tokenizer which
> has no knowledge of the syntactic context. So I may look at using a
> different delimiter after all. But I was thinking of recognizing this syntax
> only if it occurs right at the start of the expression.

I just read this thread and I find the proposal very valuable, partly because I have an application where people are allowed to add their own XPaths and find themselves writing ugly things like *[local-name() = 'xyz'] to bypass the namespace bindings, which is of course bad practice.


In general, I saw two approaches come by: xmlns and special comments. But why introduce a new syntax, and why use comments? I know it is a popular way of writing extension mechanisms for existing standards (IE, for one, uses it in conditional comments), but parsers, highlighters, syntax checkers etc. are built around stripping comments.

These are all well-known drawbacks. But since this discussion seems geared towards using comments, I feel I have to add a third option. It has other drawbacks, of course, but at least it can be used outside comments and it won't interfere with existing functionality or standards: namely, using a no-op with special content. The basic idea is simple:

("your special syntax") and rest/of/expr
("your special syntax")[2] | rest/of/expr

For example, with namespace binding syntax, this could become:

("xmlns(xmlns=http://mynamespace xmlns:you=http://yournamespace)")[2] | path/with/you:your-namespace
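To illustrate how a processor (or a preprocessing layer in front of one) might recognize such a prefix, here is a minimal Python sketch. It assumes the no-op appears only at the very start of the expression, in the `(...)[2] |` form, and that the bindings are whitespace-separated `xmlns=uri` or `xmlns:prefix=uri` tokens. The function name `split_namespace_prefix` and both regular expressions are hypothetical; nothing here is a formalized grammar from the proposal.

```python
import re

# Hypothetical grammar: the expression must begin with
#   ("xmlns(<bindings>)")[2] |
# where <bindings> contains no double quote.
PREFIX = re.compile(r'^\s*\(\s*"xmlns\(([^"]*)\)"\s*\)\s*\[\s*2\s*\]\s*\|\s*')

# One binding: xmlns=URI (default namespace) or xmlns:prefix=URI.
BINDING = re.compile(r'^xmlns(?::([A-Za-z_][\w.\-]*))?=(\S+)$')


def split_namespace_prefix(expr):
    """Return (bindings, remaining_expr).

    bindings maps a prefix to its namespace URI; the default
    namespace is stored under the empty string. If the expression
    does not start with the special no-op, it is returned unchanged
    with an empty bindings dict.
    """
    match = PREFIX.match(expr)
    if match is None:
        return {}, expr

    bindings = {}
    for token in match.group(1).split():
        binding = BINDING.match(token)
        if binding is None:
            raise ValueError("malformed namespace binding: " + token)
        bindings[binding.group(1) or ""] = binding.group(2)

    return bindings, expr[match.end():]


EXAMPLE = ('("xmlns(xmlns=http://mynamespace xmlns:you=http://yournamespace)")[2]'
           ' | path/with/you:your-namespace')
bindings, rest = split_namespace_prefix(EXAMPLE)
```

Because the pattern is anchored at the start of the expression, a processor that does not know the convention still sees a valid expression and simply evaluates the no-op.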

The most obvious drawbacks are:

1. The need for a no-op that costs some processing power on processors that do not understand the special syntax
2. Processors that are currently optimized to statically remove these no-ops will have to make an exception for these cases
3. Using a string to hold the syntax is not very pretty (but the same argument applies to comments)
4. Processors have to understand certain no-ops and recognize the syntax in it (but that should not be that hard if you formalize the syntax)


The most obvious positive points I can think of are:

1. No (major) change in parser engines, as it is part of the tokenized expression already
2. Use of existing syntax instead of inventing new syntax
3. No violation of current XPath specification
4. Forward compatible. I.e., the "special comment" syntax might one day be claimed by a new construct, or some other vendor might define its own extensions in comments; a sequence of one string, however, will remain valid under future versions of the spec.
5. Using normal XPath escaping of strings, you can easily provide any value for the content (or syntax) of the special-syntax string.
6. Since it is part of the XPath, the user is free to choose to create the string dynamically.


The hardest challenge in this approach, I believe, is clearly defining the rules (and positions in the XPath) where this "featured no-op" may be used. I understood that one of the thoughts was to allow it only at the start of the expression, which may greatly simplify this approach.

Cheers,
-- Abel Braaksma
  http://abel.metacarpus.com
