Subject: Re: [xsl] How to escape the normal interpretation of parentheses when utilizing regex-group()?
From: "David Carlisle d.p.carlisle@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Wed, 24 May 2023 11:49:08 -0000

On Wed, 24 May 2023 at 12:30, Martin Honnen martin.honnen@xxxxxx
<xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:

> On 5/24/2023 1:21 PM, David Carlisle d.p.carlisle@xxxxxxxxx wrote:
>
> > On Wed, 24 May 2023 at 12:00, Roger L Costello costello@xxxxxxxxx
> > <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
> >
>> Hi Folks,
>>
>> My input consist of lines of text. Here is a sample input:
>>
>> 1 Record Type
>>
>> The XSLT program is to transform that line into this XML:
>>
>> <column>1</column>
>> <field-name>Record Type</field-name>
>>
>> Here is another sample input:
>>
>> 2 thru 4 Customer/Area Code
>>
>> The XSLT program is to transform that line into this XML:
>>
>> <column>2 thru 4</column>
>> <field-name>Customer/Area Code</field-name>
>>
>> I figured that <xsl:analyze-string and regex-group() would be suitable
>> for breaking apart each line.
>>
>> What regex to use for column? The value of column is an integer followed
>> optionally by "thru" and another integer. I figured this regex should do
>> the job:
>>
>> ([0-9]+(\s+thru\s+[0-9]+)?)
>>
>> But, but, but, ....
>>
>> regex-group(1) means that whole regex. So with this input:
>>
>> 2 thru 4 Customer/Area Code
>>
>> regex-group(1) matches:
>>
>> 2 thru 4
>>
>> Perfect.
>>
>> Unfortunately, regex-group(2) matches the inner, optional part. So I end
>> up with this:
>>
>> <column>2 thru 4</column>
>> <field-name> thru 4</field-name>
>>
>> Eek!
>>
>> Wrong.
>>
>> How to solve this problem? Is there a way to specify that the inner,
>> optional part:
>>
>> (\s+thru\s+[0-9]+)?
>>
>> belongs only to regex-group(1), not to regex-group(2)?
>>
>> Stated another way, is there a way to indicate that the parentheses in
>> the inner, optional part are not to be considered as regex-group() syntax?
>>
>> Stated still another way, is there a way to "escape" the normal
>> interpretation of parentheses when utilizing regex-group()?
>
> > You could use non capturing group but no real need here, you haven't
> > shown any regex matching the second column, something like
> >
> > ^\s*([0-9]+(\s+thru\s+[0-9]+)?)\s*(.*)$
> >
> > then your columns are groups 1 and 3
>
> Are they?

yes:-)

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0">
  <xsl:template name="m">
    <xsl:variable name="x">
1 aaa
2 thru 4 bbb
</xsl:variable>
    <xsl:analyze-string select="$x"
        regex="^\s*([0-9]+(\s+thru\s+[0-9]+)?)\s*(.*)$" flags="m">
      <xsl:matching-substring>
        <xsl:text>
</xsl:text>
        <col1><xsl:value-of select="regex-group(1)"/></col1>
        <col2><xsl:value-of select="regex-group(3)"/></col2>
        <xsl:text>
</xsl:text>
      </xsl:matching-substring>
    </xsl:analyze-string>
  </xsl:template>
</xsl:stylesheet>

$ saxon9 -it:m rg.xsl
<?xml version="1.0" encoding="UTF-8"?>
<col1>1</col1><col2>aaa</col2>
<col1>2 thru 4</col1><col2>bbb</col2>

> I think you would need
>
> <field-name>{(regex-group(3), regex-group(2))[1]}</field-name>
>
> as no regex-group is created if no match occurs.
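
For reference, the non-capturing group mentioned in the thread could look roughly like the sketch below. It assumes an XSLT 3.0 processor: the XPath 3.0 regex dialect accepts (?:...), the XSLT 2.0 dialect does not. The template name m and the variable x are carried over from the stylesheet in this message, and the column and field-name elements come from the original question. With the inner group made non-capturing, the field name moves from group 3 to group 2.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0">
  <xsl:template name="m">
    <!-- Sample input lines, same shape as in the example in this thread -->
    <xsl:variable name="x">
1 aaa
2 thru 4 bbb
</xsl:variable>
    <!-- (?:...) groups without capturing, so the optional "thru N" part
         gets no group number of its own; the trailing (.*) is group 2 -->
    <xsl:analyze-string select="$x"
        regex="^\s*([0-9]+(?:\s+thru\s+[0-9]+)?)\s*(.*)$" flags="m">
      <xsl:matching-substring>
        <column><xsl:value-of select="regex-group(1)"/></column>
        <field-name><xsl:value-of select="regex-group(2)"/></field-name>
      </xsl:matching-substring>
    </xsl:analyze-string>
  </xsl:template>
</xsl:stylesheet>
```

It can be run the same way as the stylesheet in this message, with -it:m as the initial template. Either form works here because group numbers follow the position of each opening parenthesis in the regex, and regex-group() returns a zero-length string for a numbered group that did not take part in the match.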