Re: [xsl] How to escape the normal interpretation of parentheses when utilizing regex-group()?

Subject: Re: [xsl] How to escape the normal interpretation of parentheses when utilizing regex-group()?
From: "Martin Honnen martin.honnen@xxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Wed, 24 May 2023 11:30:51 -0000
On 5/24/2023 1:21 PM, David Carlisle d.p.carlisle@xxxxxxxxx wrote:
>
>
> On Wed, 24 May 2023 at 12:00, Roger L Costello costello@xxxxxxxxx
> <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
>     Hi Folks,
>
>     My input consists of lines of text. Here is a sample input:
>
>     1 Record Type
>
>     The XSLT program is to transform that line into this XML:
>
>     <column>1</column>
>     <field-name>Record Type</field-name>
>
>     Here is another sample input:
>
>     2 thru 4 Customer/Area Code
>
>     The XSLT program is to transform that line into this XML:
>
>     <column>2 thru 4</column>
>     <field-name>Customer/Area Code</field-name>
>
>     I figured that <xsl:analyze-string> and regex-group() would be
>     suitable for breaking apart each line.
>
>     What regex to use for column? The value of column is an integer
>     followed optionally by "thru" and another integer. I figured this
>     regex should do the job:
>
>     ([0-9]+(\s+thru\s+[0-9]+)?)
>
>     But, but, but, ....
>
>     regex-group(1) means that whole regex. So with this input:
>
>     2 thru 4 Customer/Area Code
>
>     regex-group(1) matches:
>
>     2 thru 4
>
>     Perfect.
>
>     Unfortunately, regex-group(2) matches the inner, optional part. So
>     I end up with this:
>
>     <column>2 thru 4</column>
>     <field-name> thru 4</field-name>
>
>     Eek!
>
>     Wrong.
>
>     How to solve this problem? Is there a way to specify that the
>     inner, optional part:
>
>     (\s+thru\s+[0-9]+)?
>
>     belongs only to regex-group(1), not to regex-group(2)?
>
>     Stated another way, is there a way to indicate that the
>     parentheses in the inner, optional part are not to be considered
>     as regex-group() syntax?
>
>     Stated still another way, is there a way to "escape" the normal
>     interpretation of parentheses when utilizing regex-group()?
>
>
> You could use a non-capturing group but no real need here, you haven't
> shown any regex matching the second column, something like
>
> ^\s*([0-9]+(\s+thru\s+[0-9]+)?)\s*(.*)$
>
> then your columns are groups 1 and 3


Are they? I think you would need

<field-name>{(regex-group(3), regex-group(2))[1]}</field-name>

as no regex-group is created if no match occurs.
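
For anyone who wants to try this, here is a minimal sketch of a complete stylesheet, assuming an XSLT 3.0 processor. The $lines parameter and the <rows>/<row> wrapper elements are made up for illustration and are not from the original post; the regex and the field-name expression are the ones quoted above.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    expand-text="yes">

  <!-- Hypothetical parameter holding the sample input lines -->
  <xsl:param name="lines"
      select="('1 Record Type', '2 thru 4 Customer/Area Code')"/>

  <xsl:output indent="yes"/>

  <xsl:template name="xsl:initial-template">
    <!-- <rows> and <row> are illustrative wrappers only -->
    <rows>
      <xsl:for-each select="$lines">
        <row>
          <xsl:analyze-string select="."
              regex="^\s*([0-9]+(\s+thru\s+[0-9]+)?)\s*(.*)$">
            <xsl:matching-substring>
              <!-- Group 1: the integer plus the optional "thru N" part -->
              <column>{regex-group(1)}</column>
              <!-- First item of the sequence (group 3, group 2) -->
              <field-name>{(regex-group(3), regex-group(2))[1]}</field-name>
            </xsl:matching-substring>
          </xsl:analyze-string>
        </row>
      </xsl:for-each>
    </rows>
  </xsl:template>

</xsl:stylesheet>
```

As far as I read the XSLT specification, regex-group() numbers groups by the position of their opening parenthesis and returns a zero-length string for a group that did not take part in the match, so it is worth checking whether plain regex-group(3) already gives the field name for both sample lines. If the processor supports XPath 3.0 regular expressions, writing the inner part as (?:\s+thru\s+[0-9]+)? should make it non-capturing, in which case the field name would be regex-group(2).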
