Re: [xsl] Are XPath expressions parsed using compiler parsingtechniques?

Subject: Re: [xsl] Are XPath expressions parsed using compiler parsingtechniques?
From: "Dimitre Novatchev dnovatchev@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Tue, 10 May 2022 15:10:01 -0000
Hi Dr. Kay and Roger,

> XPath (and even more so XQuery) has a lot of ad-hoc rules for resolving
ambiguities, for example the rule that in the expression
>  (/ or /*)
>  , "or" parses as an element name, not as a binary operator
> (I don't think this ambiguity was even discovered for many years after
XPath 1.0 was published).

Just so that people be informed, this is not really a problem and my
generic LR-1 parser cited earlier correctly parses this expression without
having to resort to any tricks.

If there were any ambiguity, this would be caught and reported even by the
parse-table generator (yacc in this case). But the grammar published in the
XPath 2.0 specification is obviously well written and there are no
ambiguities.

Below is a fragment of the syntax tree produced parsing this expression. To
help you quickly see the important fragments:

                                                    <rule left="QName">
                                                        <right>or</right>
                                                    </rule>

:)

Here is the complete fragment of the produced syntax tree:

<rule left="PathExpr">
    <right>/</right>
    <right>
    <rule left="RelativePathExpr">
        <right>
            <rule left="RelativePathExpr">
                <right>
                <rule left="StepExpr">
                    <right>
                        <rule left="AxisStep">
                            <right>
                            <rule left="ForwardStep">
                                <right>
                                    <rule left="AbbrevForwardStep">
                                        <right>
                                        <rule left="NodeTest">
                                            <right>
                                                <rule left="NameTest">
                                                    <right>
                                                    <rule left="QName">
                                                        <right>or</right>
                                                    </rule>
                                                    </right>
                                                </rule>
                                            </right>
                                        </rule>
                                        </right>
                                    </rule>
                                </right>
                            </rule>
                            </right>
                            <right>
                            <rule left="PredicateList"/>
                            </right>
                        </rule>
                    </right>
                </rule>
                </right>
            </rule>
        </right>
        <right>/</right>
        <right>
            <rule left="StepExpr">
                <right>
                <rule left="AxisStep">
                    <right>
                        <rule left="ForwardStep">
                            <right>
                            <rule left="AbbrevForwardStep">
                                <right>
                                    <rule left="NodeTest">
                                        <right>
                                        <rule left="NameTest">
                                            <right>
                                                <rule left="WildCard">
                                                    <right>*</right>
                                                </rule>
                                            </right>
                                        </rule>
                                        </right>
                                    </rule>
                                </right>
                            </rule>
                            </right>
                        </rule>
                    </right>
                    <right>
                        <rule left="PredicateList"/>
                    </right>
                </rule>
                </right>
            </rule>
        </right>
    </rule>
    </right>
</rule>

Thanks,
Dimitre

On Mon, May 9, 2022 at 3:42 PM Michael Kay mike@xxxxxxxxxxxx <
xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:

> The XPath parser used in Saxon is a hand-written hybrid of a
> recursive-descent parser and a precedence parser.
>
> The very first version was actually derived from James Clark's xt parser,
> and over the years it evolved out of all recognition, but without ever
> being redesigned from scratch. I have no idea if throwing it away and
> starting again with a generated parser would give any benefits. I suspec
> that having a hand-written parser gives us more control over diagnostics
> and error recovery, and it also enables us to support multiple grammars
> (different versions of XPath and XQuery, plus XSLT patterns) within a
> single parsing framework.
>
> XPath (and even more so XQuery) has a lot of ad-hoc rules for resolving
> ambiguities, for example the rule that in the expression (/ or /*), "or"
> parses as an element name, not as a binary operator (I don't think this
> ambiguity was even discovered for many years after XPath 1.0 was
> published). Again, I think it's probably easier to implement such ad-hoc
> rules with a hand-written parser. But someone who understands a particular
> parser generator well could probably find a way to do it.
>
> Michael Kay
> Saxonica
>
> > On 9 May 2022, at 21:35, Roger L Costello costello@xxxxxxxxx <
> xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > Roger wrote:
> >
> >>> Are XPath expressions parsed using
> >>> compiler parsing algorithms?
> >
> > Michael Kay responded:
> >
> >> Yes, of course
> >
> > Hi Michael, does that mean each time Saxon encounters a place in an XSLT
> program where an XPath expression is expected, Saxon sends the expression
> into an XPath parser which tokenizes the expression, parses it into a
> syntax tree, and then traverses the tree to evaluate the expression? Did
> you use a parser generator to auto-generate the parser? If yes, which
> parser generator did you use? If you didn't use a parser generator, why not?
> >
> > /Roger
> >
> >
> 
>
>

-- 
Cheers,
Dimitre Novatchev
---------------------------------------
Truly great madness cannot be achieved without significant intelligence.
---------------------------------------
To invent, you need a good imagination and a pile of junk
-------------------------------------
Never fight an inanimate object
-------------------------------------
To avoid situations in which you might make mistakes may be the
biggest mistake of all
------------------------------------
Quality means doing it right when no one is looking.
-------------------------------------
You've achieved success in your field when you don't know whether what
you're doing is work or play
-------------------------------------
To achieve the impossible dream, try going to sleep.
-------------------------------------
Facts do not cease to exist because they are ignored.
-------------------------------------
Typing monkeys will write all Shakespeare's works in 200yrs.Will they write
all patents, too? :)
-------------------------------------
Sanity is madness put to good use.
-------------------------------------
I finally figured out the only reason to be alive is to enjoy it.

Current Thread