RE: [xsl] regex grouping precedence.

Subject: RE: [xsl] regex grouping precedence.
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Wed, 29 Sep 2004 09:19:41 +0100
The groups are numbered by counting left brackets: the 5th unescaped left
bracket starts group 5, regardless of where the closing brackets are, and
regardless of other operators such as "|".
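
As a rough illustration of that counting rule (not part of the original reply), here is a small sketch in Python, whose re module numbers groups by opening bracket in the same way. The sample strings "abcdefghijk" and "c" are made up for the demonstration. The last pattern is the regex from the question with the x-flag whitespace removed, the doubled braces of the attribute value template written singly, and \p{Zs} swapped for a plain space, because Python's re does not support \p{..}. One difference to keep in mind: for a group that took no part in the match, Python returns None, whereas regex-group() in XSLT 2.0 returns a zero-length string.

```python
import re

# Nested groups: numbers follow the order of the opening brackets.
nested = re.fullmatch(r"(...)((..)(.).)(....)", "abcdefghijk")
print(nested.groups())  # ('abc', 'defg', 'de', 'f', 'hijk')

# Alternation does not renumber anything: "(c)" is still group 3.
alt = re.fullmatch(r"(a)(b)|(c)", "c")
print(alt.groups())  # (None, None, 'c')

# The regex from the question (simplified as described above).
# The top-level "|" splits the whole pattern into two branches:
#   ([0-9]{6})(,,)    or    (,([0-9]{6}) +,(.*))$
line = "500748,500748              ,Set My People Free  "
m = re.search(r"([0-9]{6})(,,)|(,([0-9]{6}) +,(.*))$", line)

unmatched_prefix = line[:m.start()]  # text before the match
print(unmatched_prefix)              # 500748
print(m.group(1))                    # None (branch 1 did not match)
print(m.group(4))                    # 500748
print(m.group(5).strip())            # Set My People Free
```

This appears to account for the output in the question: the leading "500748" is not followed by ",,", so it falls into the non-matching substring, and the rest of the line matches the second branch, where group 1 is empty and groups 4 and 5 carry the data.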

Michael Kay
http://www.saxonica.com/

> -----Original Message-----
> From: Pawson, David [mailto:David.Pawson@xxxxxxxxxxx] 
> Sent: 29 September 2004 08:23
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: [xsl] regex grouping precedence.
> 
> http://www.w3.org/TR/xslt20/#element-matching-substring seems to say
> little about how nested grouping is numbered.
> 
> (...) (....) ( ....)
> gives regex-group (1,2,3) OK.
> 
> (...) (....)| ( ....)
>   Is the third group counted as two due to the alternates?
> or still 3?
> 
> (...) ( (..)(.).) ( ....)
> 1     2 3   4     5
> is how I would expect to number them,
> but I'm totally unsure.
> 
> Using xslt 2 to parse a plain text file;
> 
> Input string
> 500748,500748              ,Set My People Free  
> 
> regex
> 
>  <xsl:for-each select='tokenize($f, "[\r]?\n")'>
>     <r> 
>   
>          <xsl:analyze-string  flags="ix"
>          regex="([0-9]{{6}})
>          (,,)|(,([0-9]{{6}})\p{{Zs}}+,(.*))$" 
>           select=".">
>     <xsl:matching-substring>
>       <bibno><xsl:value-of select="regex-group(1)"/></bibno>
>       <ck><xsl:value-of select="normalize-space(regex-group(4))"/></ck>
>       <ttl><xsl:value-of select="normalize-space(regex-group(5))"/></ttl>
>     </xsl:matching-substring>
>     <xsl:non-matching-substring>
>           <n><xsl:value-of select="."/></n>
>     </xsl:non-matching-substring>
>   </xsl:analyze-string>
>     
>    
> </r>
> </xsl:for-each>
>    
> 
> output is 
> 
>       <n>500748</n>
>       <bibno/>
>       <ck>500748</ck>
>       <ttl>Set My People Free</ttl>
> 
> 
> Regards DaveP.
> 
