Re: [xsl] How to escape the normal interpretation of parentheses when utilizing regex-group()?

Subject: Re: [xsl] How to escape the normal interpretation of parentheses when utilizing regex-group()?
From: "Martin Honnen martin.honnen@xxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Wed, 24 May 2023 11:19:42 -0000
On 5/24/2023 1:00 PM, Roger L Costello costello@xxxxxxxxx wrote:
> Hi Folks,
>
> My input consists of lines of text. Here is a sample input:
>
> 1 Record Type
>
> The XSLT program is to transform that line into this XML:
>
> <column>1</column>
> <field-name>Record Type</field-name>
>
> Here is another sample input:
>
> 2 thru 4 Customer/Area Code
>
> The XSLT program is to transform that line into this XML:
>
> <column>2 thru 4</column>
> <field-name>Customer/Area Code</field-name>
>
> I figured that <xsl:analyze-string> and regex-group() would be suitable for
> breaking apart each line.
>
> What regex to use for column? The value of column is an integer followed
> optionally by "thru" and another integer. I figured this regex should do the
> job:
>
> ([0-9]+(\s+thru\s+[0-9]+)?)
>
> But, but, but, ....
>
> regex-group(1) means that whole regex. So with this input:
>
> 2 thru 4 Customer/Area Code
>
> regex-group(1) matches:
>
> 2 thru 4
>
> Perfect.
>
> Unfortunately, regex-group(2) matches the inner, optional part. So I end up
> with this:
>
> <column>2 thru 4</column>
> <field-name> thru 4</field-name>
>
> Eek!
>
> Wrong.
>
> How to solve this problem? Is there a way to specify that the inner,
> optional part:
>
> (\s+thru\s+[0-9]+)?
>
> belongs only to regex-group(1), not to regex-group(2)?
>
> Stated another way, is there a way to indicate that the parentheses in the
> inner, optional part are not to be considered as regex-group() syntax?
>
> Stated still another way, is there a way to "escape" the normal
> interpretation of parentheses when utilizing regex-group()?
>

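Assuming XSLT 3, I would use the analyze-string() function and process its
result with templates, e.g. assuming your data is in a "data" element and
expand-text="yes" is in effect (needed for the {.} text value templates):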


<xsl:template match="data">
  <xsl:apply-templates select="analyze-string(.,
    '([0-9]+(\s+thru\s+[0-9]+)?)\s*(.*$)')" mode="match"/>
</xsl:template>

<!-- group 1: the column number or range, including the optional "thru" part -->
<xsl:template match="*:group[@nr = 1]" mode="match">
  <column>{.}</column>
</xsl:template>

<!-- the last group child of the match (nr 3 here): the field name -->
<xsl:template match="*:group[last()]" mode="match">
  <field-name>{.}</field-name>
</xsl:template>
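
If you would rather stay with xsl:analyze-string and regex-group(), here is a
minimal sketch, assuming an XSLT 3.0 processor. The regular expression syntax
of XPath 3.0 and later accepts non-capturing groups written (?:...), so the
inner optional part no longer gets a group number and the field name becomes
regex-group(2). The "data" element and the single-line input are assumptions
carried over from above, not something your post confirms.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    expand-text="yes">

  <!-- Assumption: each input line is the content of a "data" element. -->
  <xsl:template match="data">
    <!-- (?:...) groups without capturing, so only two groups are numbered:
         1 = column (with the optional "thru" part), 2 = field name. -->
    <xsl:analyze-string select="."
        regex="^([0-9]+(?:\s+thru\s+[0-9]+)?)\s+(.*)">
      <xsl:matching-substring>
        <column>{regex-group(1)}</column>
        <field-name>{regex-group(2)}</field-name>
      </xsl:matching-substring>
    </xsl:analyze-string>
  </xsl:template>

</xsl:stylesheet>
```

If you keep the capturing parentheses instead (for example under XSLT 2.0,
where (?:...) is not available), the field name is regex-group(3): groups are
numbered by the position of their opening parenthesis, so the nested optional
part is group 2 and the trailing (.*) is group 3.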
