Re: [xsl] Need an XPath expression which returns all xs:pattern elements containing a regex that permits an unbounded number of characters

Subject: Re: [xsl] Need an XPath expression which returns all xs:pattern elements containing a regex that permits an unbounded number of characters
From: "David Carlisle d.p.carlisle@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Thu, 4 Apr 2024 16:50:15 -0000
On Thu, 4 Apr 2024 at 17:10, Michael Kay michaelkay90@xxxxxxxxx <
xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:

> I don't think these rules handle the fact that `*` and `+` within square
> brackets are ordinary characters and do not need to be escaped.
>
> Michael Kay
> Saxonica
>
> > On 4 Apr 2024, at 16:46, Roger L Costello costello@xxxxxxxxx <
> > xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > David Carlisle devised a brilliant approach:
> >
> > Do a series of replace operations:
> >
> > remove all whitespace:
> >
> >    replace(@value,'\s','')
> >
> > replace \-quoted characters by x:
> >
> >    replace(@value,'\\.','x')
> >
> > replace {99,} constructs by *
> >
> >    replace(@value,'\{[0-9]+,\}','*')
> >
> > Here are the replaces, inlined:
> >
> > replace(replace(replace(@value,'\{[0-9]+,\}','*'),'\\.','x'),'\s','')
>

You have inlined these in the wrong order.
You need to remove whitespace first, otherwise
{ 5, }
won't get replaced.

But removing whitespace will break \ * by turning it into a quoted \*. It's
all fixable with care...

It is probably safer not to remove whitespace at all, but instead to allow
it inside the quantifier pattern, so that '\{[0-9]+,\}' becomes
'\{\s*[0-9]+\s*,\s*\}':

   replace(@value,'\{\s*[0-9]+\s*,\s*\}','*')
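
A sketch of that safer variant follows. It assumes XPath 2.0 or later and
an xs prefix bound to the XML Schema namespace (neither is stated in the
thread). Escapes are replaced first, then the {n,} quantifiers with optional
whitespace, and nothing strips whitespace:

   replace(replace(@value,'\\.','x'),'\{\s*[0-9]+\s*,\s*\}','*')

Traced on the two problem cases:

   A{ 5, }  --> apply replaces --> A*
   A\ *     --> apply replaces --> Ax*

In the first case the quantifier is still recognised despite the spaces. In
the second, the escaped space is consumed as a quoted character and the
unescaped * survives.

Used as a predicate to select the patterns, it might look like this:

   //xs:pattern[matches(
      replace(replace(@value,'\\.','x'),'\{\s*[0-9]+\s*,\s*\}','*'),
      '[*+]')]

This does not address Michael's point: a * or + inside square brackets is
still counted as a quantifier.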

Are you sure you want to do this at all? :-)



> >
> > Here are the results of applying the replaces to some regexes:
> >
> > A*  --> apply replaces --> A*
> > A+  --> apply replaces --> A+
> > A\*  --> apply replaces --> Ax
> > A\+  --> apply replaces --> Ax
> > A{0,} --> apply replaces --> A*
> > A{1,} --> apply replaces --> A*
> > A{5,} --> apply replaces --> A*
> > \\* --> apply replaces --> x*
> >
> > To implement "Find all xs:pattern elements that permit an unbounded
> > number of characters" do this:
> >
> >       If the string resulting from applying the replaces
> >       contains * or +, then the regex permits an
> >       unbounded number of characters.
> >
> > David (or anyone), is this correct?
> >
> > /Roger
