[xsl] XPath grammar questions

Subject: [xsl] XPath grammar questions
From: Sean Russell <ser@xxxxxxxxxxxxxxxxxxxx>
Date: Sun, 17 Mar 2002 09:03:35 -0800

Hello everyone,

Ya cain't do XSL w/o XPath, so I suppose this is as good a forum as any to ask 
these questions.  I'm subscribed to this list in digest form, so if you reply 
only to the list, allow for lag in my responses.

I'm writing an XPath parser (and evaluator) in Ruby, for an XML parser called 
REXML.  Actually, I've written the XPath parser three times already; this 
fourth time, I broke down and just implemented a lexer (more or less) 
conforming to the XPath grammar.  It works more or less properly, but I have 
a couple of places where it breaks down, and if there are any XPath gurus who 
can tell me how I'm misunderstanding the XPath spec, I'd appreciate the 
feedback.

The first case is in a path submitted by Tobias Reif, which originated, as I 
recall, with someone on this list:

 *[* and not(*/node()) and not(*[not(@style)]) and not(*/@style != */@style)]

Specifically, it's the 'not(*/node())' that I'm having trouble with.  The 
XPath spec states that:

  not( boolean ) -> boolean

This would imply that '*/node()' evaluates to a boolean.  However, it also 
states that paths such as:

  ancestor::node()

evaluate to a set of matching nodes.  Further, I had assumed that the path:

  */node()

by itself would also result in a set of nodes.

I have a group of theories about this, but I'm not quite grokking the intent 
of XPath.  I don't see how the same path can evaluate to a node-set in one 
context and to a boolean in another.  In any case, there have been a number of 
successful implementations of XPath, so I know I'm missing something.
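
One reading that would reconcile the two: XPath 1.0 (section 3.2) says a 
function argument is converted to the declared type as if by calling 
boolean(), and section 4.3 defines that conversion.  Under that reading 
'*/node()' always yields a node-set, and only the argument handed to not() is 
converted.  A minimal Ruby sketch, assuming node-sets are modelled as Arrays; 
xpath_boolean and xpath_not are hypothetical names, not REXML's API:

```ruby
# Hypothetical helpers (not part of REXML). A node-set is assumed to be a
# plain Array of nodes; the rules follow boolean() in XPath 1.0 section 4.3.
def xpath_boolean(value)
  case value
  when true, false then value            # already a boolean
  when Array       then !value.empty?    # node-set: true iff non-empty
  when Numeric
    f = value.to_f
    !(f.zero? || f.nan?)                 # number: false for +0, -0 and NaN
  when String      then !value.empty?    # string: true iff length is non-zero
  else
    raise ArgumentError, "not an XPath value: #{value.inspect}"
  end
end

# not() converts its argument to a boolean first, then negates it.
def xpath_not(value)
  !xpath_boolean(value)
end
```

With this, not(*/node()) is true exactly when the node-set selected by 
'*/node()' is empty, while the path on its own still returns the node-set.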

The second (and at this point, more critical) problem I'm having is with 
function names.  Take:

  [normalize-space(@name)='x']

If you follow the grammar, the evaluation is:

   Predicate->Expr->OrExpr->AndExpr->EqualityExpr->RelationalExpr->
   AdditiveExpr

at which point it matches the rule:

  AdditiveExpr ::= AdditiveExpr '-' MultiplicativeExpr

where you effectively have "normalize" "-" "space(@name)='x'".  What my code 
does at this point is hang; 'normalize' gets caught in an endless recursive 
evaluation loop.  The only fix I can think of is to check for endless 
recursion explicitly.  I don't want to do this because it doesn't seem like I 
should have to... the grammar should be unambiguous.  Again, I suspect that my 
code is at fault, but when I run through the grammar by hand, I get the same 
result.  I suspect that I can avoid this particular loop by changing the order 
of the rule evaluation, but then I get worse recursive loops in other paths.  
There doesn't seem to be an elegant solution.
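
Two points from the spec may bear on this.  Section 3.7 says the longest 
possible token is always returned, and '-' is a legal name character, so 
'normalize-space' would lex as a single name.  Separately, a rule of the form 
A ::= A '-' M is left-recursive, and a recursive-descent parser that calls 
itself before consuming a token will loop; the usual workaround is to rewrite 
the rule as M ('-' M)*.  A rough Ruby sketch of both ideas follows.  Every 
name in it is hypothetical, the token set is heavily simplified (ASCII names 
only, no axes, no handling of '*' as an operator), and operands are reduced 
to bare names and numbers:

```ruby
require 'strscan'

# ASCII approximation of NCName: '-' and '.' are allowed after the first char.
NAME   = /[A-Za-z_][A-Za-z0-9_.\-]*/
NUMBER = /\d+(?:\.\d*)?|\.\d+/
SYMBOL = /[-+()\[\]@=,*\/]/

# Longest-match tokenizer: a name swallows any '-' that directly follows it.
def tokenize(src)
  tokens  = []
  scanner = StringScanner.new(src)
  until scanner.eos?
    next if scanner.skip(/\s+/)
    if    (t = scanner.scan(NAME))      then tokens << [:name, t]
    elsif (t = scanner.scan(NUMBER))    then tokens << [:number, t]
    elsif (t = scanner.scan(/'[^']*'/)) then tokens << [:literal, t]
    elsif (t = scanner.scan(SYMBOL))    then tokens << [:symbol, t]
    else raise ArgumentError, "unexpected input at offset #{scanner.pos}"
    end
  end
  tokens
end

# AdditiveExpr ::= Operand (('+' | '-') Operand)*
# The left-recursive rule rewritten as a loop: a token is always consumed
# before the next iteration, so the parser cannot recurse forever.
def parse_additive(tokens)
  left = parse_operand(tokens)
  while [[:symbol, '+'], [:symbol, '-']].include?(tokens.first)
    op   = tokens.shift[1]
    left = [op, left, parse_operand(tokens)]   # builds a left-associative tree
  end
  left
end

# Stand-in for MultiplicativeExpr: accepts a single name or number.
def parse_operand(tokens)
  kind, text = tokens.shift
  raise ArgumentError, "operand expected" unless [:name, :number].include?(kind)
  [kind, text]
end
```

In this sketch tokenize("normalize-space(@name)='x'") starts with the single 
token [:name, "normalize-space"], so the subtraction rule never sees a '-'.  
A real subtraction needs separating whitespace, as in "a - b - c", which 
parse_additive turns into a tree nested to the left.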

Any help would be appreciated.  If this isn't an appropriate topic for this 
list, feel free to email me directly at:

  ser@xxxxxxxxxxxxxxxxxxxx

Thanks!

-- SER

 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

