Re: [xsl] Need an XPath expression which returns all xs:pattern elements containing a regex that permits an unbounded number of characters

Subject: Re: [xsl] Need an XPath expression which returns all xs:pattern elements containing a regex that permits an unbounded number of characters
From: "Edward Porter edward.porter@xxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Thu, 4 Apr 2024 12:51:00 -0000
Couldn't you just use a character group, something like this?

[a-zA-Z]+(\*|\+|\{\d+,\})

That would match one or more letters followed by *, +, or an unbounded
range such as {1,}, while not allowing an escape character between the
letters and the quantifier.
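
As a rough sketch of how that might plug into an XPath 2.0 predicate: the
matches() function does an unanchored search, so the regex can go straight
into it. I am assuming the xs prefix is bound to the XML Schema namespace,
and I have written the range branch as \{\d+,\} so that any lower bound
({0,}, {1,}, {2,}, ...) is caught. The second variant is my own
generalization, not something tested against a real schema: it replaces
the letter group with "start of string or any character other than a
backslash", so that patterns such as [0-9]+ or .* are found too.

```xpath
(: Variant 1: one or more letters directly followed by an unbounded quantifier.
   A\* and A\+ are rejected because the backslash sits between the letter
   and the quantifier character. :)
//xs:pattern[
    matches(@value, '[a-zA-Z]+(\*|\+|\{\d+,\})')
]

(: Variant 2 (assumption): any quantifier character not preceded by a backslash.
   XPath 2.0 regexes have no look-behind, so the preceding character is
   consumed by the group (^|[^\\]) instead. :)
//xs:pattern[
    matches(@value, '(^|[^\\])(\*|\+|\{\d+,\})')
]
```

Unlike a blanket not(contains(@value, '\*')) test, both variants still
select a pattern such as A\*B+, where one quantifier is escaped and another
is not. Neither handles every case: variant 2 would wrongly skip \\* (an
escaped backslash followed by a real quantifier) and would wrongly select
[*] (a literal asterisk inside a character class).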

-----Original Message-----
From: Roger L Costello costello@xxxxxxxxx
<xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Sent: Thursday, April 4, 2024 8:29 AM
To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
Subject: [xsl] Need an XPath expression which returns all xs:pattern elements
containing a regex that permits an unbounded number of characters

Hi Folks,

I want to find, in an XML Schema, all xs:pattern elements containing a regex
that permits an unbounded number of characters.

Here are examples of xs:pattern elements that I want to find:

<xs:pattern value="A*"/>
<xs:pattern value="A+"/>
<xs:pattern value="A{0,}"/>
<xs:pattern value="A{1,}"/>

I do not want either of the following xs:pattern elements because -- due to
the escape symbol -- they do not permit an unbounded number of characters:

<xs:pattern value="A\*"/>
<xs:pattern value="A\+"/>

I created an XPath 2.0 expression to find the desired xs:pattern elements:

xs:pattern[
        contains(@value, '*') or
        contains(@value, '+') or
        contains(@value, '{1,}') or
        contains(@value, '{0,}')
    ]

Eek! That is not correct. It incorrectly returns the xs:pattern elements with
escaped asterisk and escaped plus symbols:

<xs:pattern value="A\*"/>
<xs:pattern value="A\+"/>

How do I fix my XPath expression? Is the solution to add a second predicate:

xs:pattern[
        contains(@value, '*') or
        contains(@value, '+') or
        contains(@value, '{1,}') or
        contains(@value, '{0,}')
    ][
        not(contains(@value, '\*')) and
        not(contains(@value, '\+'))
    ]

Is that correct? Is that the best approach? Is there a better approach?

Bonus points if you can answer this question: Is my XPath expression catching
all xs:pattern elements that have a regex that permits an unbounded number of
characters?

Note: For reasons that I will not explain, the XPath expression must be an
XPath 2.0 expression.

/Roger
