Re: [xsl] Grouping in match patterns

Subject: Re: [xsl] Grouping in match patterns
From: "Wendell Piez wapiez@xxxxxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Tue, 14 Jul 2020 21:07:15 -0000
Very many thanks Michael Kay and Saxonica!

Best regards, Wendell

On Tue, Jul 14, 2020 at 4:05 PM Michael Kay mike@xxxxxxxxxxxx
<xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
> I have added a test case match-0265 to the XSLT3 test suite. Saxon-JS is getting it right, Saxon-J is failing.
>
> The pattern syntax was generalized in many ways in XSLT 3.0 and this probably isn't the only gap in the implementation...
>
> Michael Kay
> Saxonica
>
> On 14 Jul 2020, at 20:26, Wendell Piez wapiez@xxxxxxxxxxxxxxx <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
> Hi again,
>
> I am most grateful for the attention given to this so far. :-) What
> Liam saw is what I saw - according to the EBNF it appears to me it
> should be okay.
>
> FWIW, match="(b |c) / d" does seem to work at any rate in some
> versions of Saxon.
>
> Also, the error I see in (some version of) oXygen suggests to me that
> this is a known limitation, at least by some sense of "known". Not the
> nasty stuff reported by Norm from Saxon v.recent, but something easier
> on the eyes: "the path in a pattern must contain simple steps".
>
> I have also not returned to see what was the story under XSLT 2.0.
>
> Thanks for any further illumination!
>
> Cheers, Wendell
>
>
>
>
> On Tue, Jul 14, 2020 at 1:40 PM Liam R. E. Quin liam@xxxxxxxxxxxxxxxx
> <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
>
> On Tue, 2020-07-14 at 15:34 +0000, Wendell Piez wapiez@xxxxxxxxxxxxxxx
> wrote:
>
> XSL-List friends,
>
> Is there anything special I should know about a match pattern such as
> "a / (b|c)" -- which gives me an error (in oXygen and running Saxon)?
>
> <xsl:template match="a / (b | c)"/>
>
> Wouldn't it be permitted by the grammar given at
> https://www.w3.org/TR/xslt-30/#pattern-syntax? Production [11] would
> seem to permit a parenthetical expression as a discrete step. Is
> there
> something I am missing here?
>
>
> Nope, it should be allowed. I did a careful check against the grammar
> although i am no Michael for grammars and completeness, nor David
> Birnbaum for carefulness, so i append my analysis.
>
> I used
> https://www.w3.org/TR/xslt-30/#pattern-syntax
>
> Analysis, probably flawed -
>
> a / ( b | c )
>
> (A) not starting with . so it's a UnionExprP [1]
> Rule [3] gives us IntersectExceptExprP (| IntersectExceptExprP)*
> and we don't have an | at top level (nor "union") so we have
>    IntersectExceptExprP
>
> By [4] this is a PathExprP, with the optional suffix lacking here.
>
> (B) Rule [5] says PathExprP is
>    RootedPath or / relativepath or // relativepath or RelativePathExprP
>
>    We don't have a RootedPath (see rule [6]) so we have a
>    RelativePathExprP.
>
> (C) Rule [11] says RelativePathExprP is
>    StepExprP followed by zero or more of
>        / StepExprP, or // StepExprP
>
> (D) Each StepExprP is [12] either a PostfixExprP or an AxisStepP
>
> A PostfixExprP is a parenthesized expression; our expression
> does not have parens at the outer level, so we have an AxisStepP.
>
> An AxisStepP can contain an AbbrevForwardStep, and we're sent off to XPath
> to learn that this is a name optionally with @ in front of it.
>
> So we have an AbbrevForwardStep, as our first StepExprP in (C), and we have
> consumed the first token, the "a" in "a  / ( b | c )"
>
> Now, we try and see if we have another StepExprP.
>
> We have a leading / which looks promising.
>
> (E) What's left is (b | c)
>
>
> Recall that StepExprP in [12] can be either a PostfixExprP or an AxisStepP;
> what we have here is a PostfixExprP, which is defined in [13] to be
> a ParenthesizedExprP followed by a PredicateListXP30, which is defined in XPath.
>
> A quick check of XPath says PredicateList is Predicate*, zero or more,
> which makes sense, it's the [...] in a/b[...]/c... And it's optional, which is
> fine as we don't have one.
>
> So we have a ParenthesizedExprP, which is defined in [14] as a UnionExprP in (parens).
>
> OK, we have parens, so now we must see if b | c matches a UnionExprP.
>
> (F) UnionExprP is defined in [3] to be
>    IntersectExceptExprP (("union" | "|") IntersectExceptExprP)*
>
>    That is, an IntersectExceptExprP optionally followed by "| stuff".
>
> OK, so, IntersectExceptExprP is [4]
>    PathExprP (("intersect" | "except") PathExprP)*
> We don't have intersect or except in b | c, so we're matching b | c
> against PathExprP.
>
> (G) PathExprP is [5] RootedPath or / stuff, or RelativePathExprP
>
> Neither b nor c starts with a / or $ or is any of the other things
> allowed in a RootedPath. We don't have a / in "b | c", so we had better
> have a RelativePathExprP.
>
> (H) RelativePathExprP is [11] StepExprP (("/" | "//") StepExprP)*
>
> We don't have a / so we're matching b | c against StepExprP
>
> This is [12] PostfixExprP | AxisStepP
>
> A PostfixExprP needs (...), which we don't have, so we want an AxisStepP, which can be
> an abbreviated step, so "b" can match it.
>
> This leaves us with "| c"
>
> When we go back up to (F) we find out UnionExprP can have "| IntersectExceptExprP"
> after the "b", and "c" matches that the same way "b" just did, so the expression matches.
>
> So, it's legal.
>
> I think.
>
> --
> Liam Quin, https://www.delightfulcomputing.com/
> Available for XML/Document/Information Architecture/XSLT/
> XSL/XQuery/Web/Text Processing/A11Y training, work & consulting.
> Barefoot Web-slave, antique illustrations:  http://www.fromoldbooks.org
>
>
>
>
> --
> ...Wendell Piez... ...wendell -at- nist -dot- gov...
> ...wendellpiez.com... ...pellucidliterature.org... ...pausepress.org...
> ...github.com/wendellpiez... ...gitlab.coko.foundation/wendell...
>
>



-- 
...Wendell Piez... ...wendell -at- nist -dot- gov...
...wendellpiez.com... ...pellucidliterature.org... ...pausepress.org...
...github.com/wendellpiez... ...gitlab.coko.foundation/wendell...
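
For anyone who hits the same error before a processor fix is available, one possible workaround is to distribute the leading step over the union: "a/b | a/c" matches the same nodes as "a / (b | c)" and uses only simple steps. What follows is a minimal sketch, not the match-0265 test case. The element names doc, d and x in the sample input are made up for illustration and do not come from the thread.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only. Hypothetical sample input:
       <doc><a><b/><c/><d/></a><x><b/></x></doc>
     With that input the stylesheet should print "b" and "c",
     one per line, and nothing for d or for the b inside x. -->
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- The pattern from the thread, which some processors reject:
         match="a / (b | c)"
       The rewritten pattern below spells out each alternative. -->
  <xsl:template match="a/b | a/c">
    <xsl:value-of select="name()"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

  <!-- Suppress stray text so only the matched names are printed. -->
  <xsl:template match="text()"/>

</xsl:stylesheet>
```

Both forms have the same default priority (0.5), so swapping one for the other should not change which template wins in a conflict. The rewrite gets long when the union has many branches or the leading path is deep; in that case a predicate such as "*[self::b or self::c][parent::a]" may be easier to maintain, at the cost of a different default priority.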
