Re: [xsl] Grouping in match patterns

Subject: Re: [xsl] Grouping in match patterns
From: "Michael Kay mike@xxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Tue, 14 Jul 2020 20:04:51 -0000
I have added a test case, match-0265, to the XSLT3 test suite. Saxon-JS is
getting it right; Saxon-J is failing.

The pattern syntax was generalized in many ways in XSLT 3.0 and this probably
isn't the only gap in the implementation...

Michael Kay
Saxonica

> On 14 Jul 2020, at 20:26, Wendell Piez wapiez@xxxxxxxxxxxxxxx
> <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
> Hi again,
>
> I am most grateful for the attention given to this so far. :-) What
> Liam saw is what I saw: according to the EBNF, it appears to me that it
> should be okay.
>
> FWIW, match="(b |c) / d" does seem to work at any rate in some
> versions of Saxon.
>
> Also, the error I see in (some version of) oXygen suggests to me that
> this is a known limitation, at least by some sense of "known". Not the
> nasty stuff reported by Norm from Saxon v.recent, but something easier
> on the eyes: "the path in a pattern must contain simple steps".
>
> I have also not gone back to see what the story was under XSLT 2.0.
>
> Thanks for any further illumination!
>
> Cheers, Wendell
>
>
>
>
> On Tue, Jul 14, 2020 at 1:40 PM Liam R. E. Quin liam@xxxxxxxxxxxxxxxx
> <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>>
>> On Tue, 2020-07-14 at 15:34 +0000, Wendell Piez wapiez@xxxxxxxxxxxxxxx
>> wrote:
>>> XSL-List friends,
>>>
>>> Is there anything special I should know about a match pattern such as
>>> "a / (b|c)" -- which gives me an error (in oXygen and running Saxon)?
>>>
>>> <xsl:template match="a / (b | c)"/>
>>>
>>> Wouldn't it be permitted by the grammar given at
>>> https://www.w3.org/TR/xslt-30/#pattern-syntax? Production [11] would
>>> seem to permit a parenthetical expression as a discrete step. Is there
>>> something I am missing here?
>>
>> Nope, it should be allowed. I did a careful check against the grammar,
>> although I am no Michael for grammars and completeness, nor David
>> Birnbaum for carefulness, so I append my analysis.
>>
>> I used
>> https://www.w3.org/TR/xslt-30/#pattern-syntax
>>
>> Analysis, probably flawed -
>>
>> a / ( b | c )
>>
>> (A) It does not start with ".", so by rule [1] it's a UnionExprP.
>> Rule [3] gives us IntersectExceptExprP (| IntersectExceptExprP)*
>> and we don't have a | at top level (nor "union"), so we have a single
>>    IntersectExceptExprP.
>>
>> By rule [4] this is a PathExprP; the optional intersect/except suffix is
>> absent here.
>>
>> (B) Rule [5] says PathExprP is
>>    RootedPath or / RelativePathExprP or // RelativePathExprP
>>    or RelativePathExprP
>>
>>    We don't have a RootedPath (see rule [6]) and there is no leading / or //,
>>    so we have a RelativePathExprP.
>>
>> (C) Rule [11] says RelativePathExprP is
>>    StepExprP followed by zero or more of
>>        / StepExprP, or // StepExprP
>>
>> (D) Each StepExprP is [12] either a PostfixExprP or an AxisStepP
>>
>> A PostfixExprP is a parenthesized expression; our expression
>> does not have parens at the outer level, so we have an AxisStepP.
>>
>> An AxisStepP can contain an AbbrevForwardStep, and we're sent off to XPath
>> to learn that this is a name optionally with @ in front of it.
>>
>> So we have an AbbrevForwardStep as our first StepExprP in (C), and we have
>> consumed the first token, the "a" in "a / ( b | c )".
>>
>> Now we try to see if we have another StepExprP.
>>
>> We have a leading / which looks promising.
>>
>> (E) What's left is (b | c)
>>
>>
>> Recall that StepExprP in [12] can be either a PostfixExprP or an AxisStepP;
>> what we have here is a PostfixExprP, which is defined in [13] to be
>> a ParenthesizedExprP followed by a PredicateListXP30, which is defined in
>> XPath.
>>
>> A quick check of XPath says PredicateList is Predicate*, zero or more,
>> which makes sense: it's the [...] in a/b[...]/c... And it's optional, which
>> is fine as we don't have one.
>>
>> So we have a ParenthesizedExprP, which is defined as a UnionExprP in
>> (parens).
>>
>> OK, we have parens, so now we must see if b | c matches a UnionExprP.
>>
>> (F) UnionExprP is defined in [3] to be
>>    IntersectExceptExprP (("union" | "|") IntersectExceptExprP)*
>>
>>    That is, an IntersectExceptExprP optionally followed by "| stuff".
>>
>> OK, so, IntersectExceptExprP is [4]
>>    PathExprP (("intersect" | "except") PathExprP)*
>> We don't have intersect or except in b | c, so we're matching b | c
>> against PathExprP.
>>
>> (G) PathExprP is [5] RootedPath or / stuff, or RelativePathExprP
>>
>> Neither b nor c starts with a / or $ or is any of the other things
>> allowed in a RootedPath. We don't have a / in "b | c", so we had better
>> have a RelativePathExprP.
>>
>> (H) RelativePathExprP is [11] StepExprP (("/" | "//") StepExprP)*
>>
>> We don't have a / so we're matching b | c against StepExprP
>>
>> This is [12] PostfixExprP | AxisStepP
>>
>> A StepExprP is either a PostfixExprP, i.e. (...), which we don't have, or an
>> AxisStepP, which can be an abbreviated step, so "b" can match it.
>>
>> This leaves us with "| c"
>>
>> When we go back up to (F) we find out UnionExprP can have
>> "| IntersectExceptExprP" after the "b", and "c" matches that the same way
>> "b" did, so the expression matches.
>>
>> So, it's legal.
>>
>> I think.
>>
>> --
>> Liam Quin, https://www.delightfulcomputing.com/
>> Available for XML/Document/Information Architecture/XSLT/
>> XSL/XQuery/Web/Text Processing/A11Y training, work & consulting.
>> Barefoot Web-slave, antique illustrations:  http://www.fromoldbooks.org
>>
>
>
>
> --
> ...Wendell Piez... ...wendell -at- nist -dot- gov...
> ...wendellpiez.com... ...pellucidliterature.org... ...pausepress.org...
> ...github.com/wendellpiez... ...gitlab.coko.foundation/wendell...
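
The grammar walk quoted above can be checked mechanically with a small
recursive-descent recognizer. What follows is only a rough sketch, assuming a
reduced subset of the XSLT 3.0 pattern grammar: productions [3], [4], [5]
(without RootedPath), [11], [12], [13] (without predicates), [14], and
abbreviated steps that are a plain name or @name. The names tokenize,
PatternRecognizer and is_pattern are made up for this illustration; it is not
how Saxon or any other processor parses patterns.

```python
import re

# Tokens for the subset: names, @, /, //, |, and parentheses. The keywords
# union/intersect/except arrive as ordinary name tokens.
TOKEN = re.compile(r"\s*(//|[/|()@]|[A-Za-z_][\w.-]*)")


def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def is_name(tok):
    return tok is not None and (tok[0].isalpha() or tok[0] == "_")


class PatternRecognizer:
    def __init__(self, text):
        self.toks = tokenize(text)
        self.i = 0

    def peek(self):
        return self.toks[self.i] if self.i < len(self.toks) else None

    def take(self, expected=None):
        tok = self.peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"expected {expected or 'a token'}, found {tok!r}")
        self.i += 1
        return tok

    def union_expr(self):             # [3] UnionExprP
        self.intersect_except_expr()
        while self.peek() in ("|", "union"):
            self.take()
            self.intersect_except_expr()

    def intersect_except_expr(self):  # [4] IntersectExceptExprP
        self.path_expr()
        while self.peek() in ("intersect", "except"):
            self.take()
            self.path_expr()

    def path_expr(self):              # [5] PathExprP, RootedPath omitted
        if self.peek() == "//":
            self.take()
            self.relative_path_expr()
        elif self.peek() == "/":
            self.take()
            # Simplification: any name, "(" or "@" after "/" starts a step.
            if self.peek() in ("(", "@") or is_name(self.peek()):
                self.relative_path_expr()
        else:
            self.relative_path_expr()

    def relative_path_expr(self):     # [11] RelativePathExprP
        self.step_expr()
        while self.peek() in ("/", "//"):
            self.take()
            self.step_expr()

    def step_expr(self):              # [12] StepExprP
        if self.peek() == "(":        # [13]/[14] PostfixExprP, no predicates
            self.take("(")
            self.union_expr()
            self.take(")")
        else:                         # AxisStepP reduced to name or @name
            if self.peek() == "@":
                self.take()
            tok = self.take()
            if not is_name(tok):
                raise SyntaxError(f"expected a name, found {tok!r}")


def is_pattern(text):
    """True if text is accepted by the reduced grammar above."""
    try:
        parser = PatternRecognizer(text)
        parser.union_expr()
        return parser.peek() is None
    except SyntaxError:
        return False
```

Under these assumptions, is_pattern("a / (b | c)") and is_pattern("(b | c) / d")
both return True, which agrees with the reading that production [12] allows a
parenthesized union as a step, while an unbalanced input such as "a / (b | c"
is rejected.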
