Re: [xsl] String contains a regex and then junk ... how to remove the junk?

Subject: Re: [xsl] String contains a regex and then junk ... how to remove the junk?
From: "Norm Tovey-Walsh ndw@xxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Mon, 16 Dec 2024 13:43:04 -0000
> I wrote a recursive function to do this. See below. Is there a simpler
> way to do it?

If the regex is always in parens, and if the junk that follows never contains
a ")", then just look for the last ")".
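
For illustration only, here is that first case as a small Python sketch
(Python rather than XSLT, purely to keep it self-contained). The function
name is made up, and it assumes the junk really never contains a ")".

```python
def strip_junk_after_last_paren(s: str) -> str:
    """Keep everything up to and including the last ')' in s."""
    end = s.rfind(")")
    if end == -1:
        # Assumption: a string with no ')' at all is treated as an error.
        raise ValueError("no closing paren found")
    return s[: end + 1]


print(strip_junk_after_last_paren("(a(b|c)d) trailing junk"))  # (a(b|c)d)
```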

If the regex is always in parens, but the junk might include "(" and/or
")", then it's going to be harder.

If the regex isn't always in parens, ... I'm not sure the problem is
tractable. A string of the form "abcd" could be interpreted several ways
depending on whether "bcd", "cd", "d", or "" is considered
junk.
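
To make the ambiguity concrete, a throwaway Python sketch (the helper name
is hypothetical) that lists every regex/junk split of such a string.
Nothing in the string itself says which one is intended.

```python
from typing import List, Tuple


def candidate_splits(s: str) -> List[Tuple[str, str]]:
    """Every way to cut s into a non-empty prefix and a possibly empty suffix."""
    return [(s[:i], s[i:]) for i in range(1, len(s) + 1)]


for regex, junk in candidate_splits("abcd"):
    print(repr(regex), repr(junk))
# 'a' 'bcd'
# 'ab' 'cd'
# 'abc' 'd'
# 'abcd' ''
```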

On a quick skim, I wasn't able to persuade myself that your recursive
solution was handling escaped parens, if that's an issue.

Assuming the regex is always in parens, I cooked up this ixml grammar in a
moment or two, but it doesn't handle escaped parens either.

text = regex, junk? .
regex = '(', inner*, ')' .
-inner = -regex | ~["()"] .
junk = ~[]* .
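
The same idea as the grammar, sketched as a depth counter in Python (not
XSLT, and not anyone's actual implementation). The function name and the
handle_escapes flag are invented for illustration. With the flag off it
behaves like the grammar: the regex is the leading balanced group and the
rest is junk. With the flag on it assumes a backslash escapes the next
character, which is a guess about the escaping convention. It would still
be fooled by a paren inside a character class such as [(].

```python
from typing import Tuple


def split_regex_and_junk(text: str, handle_escapes: bool = False) -> Tuple[str, str]:
    """Split text into (regex, junk): the leading balanced parenthesised
    group, and whatever follows it."""
    if not text.startswith("("):
        raise ValueError("expected text to start with '('")
    depth = 0
    i = 0
    while i < len(text):
        ch = text[i]
        if handle_escapes and ch == "\\":
            i += 2  # skip the backslash and the character it escapes
            continue
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                # The first '(' has just been closed: the regex ends here.
                return text[: i + 1], text[i + 1 :]
        i += 1
    raise ValueError("unbalanced parentheses")


print(split_regex_and_junk("(a(b|c)d) junk (with) parens"))
# ('(a(b|c)d)', ' junk (with) parens')
```

Without escape handling, an input like (a\)b) is cut short at the escaped
paren, which is the same limitation the grammar has.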

                                        Be seeing you,
                                          norm

--
Norm Tovey-Walsh <ndw@xxxxxxxxxx>
https://norm.tovey-walsh.com/

> Weeks of programming can save you hours of planning.

