Subject: Re: [xsl] String contains a regex and then junk ... how to remove the junk?
From: "Michael Kay mike@xxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Mon, 16 Dec 2024 13:41:25 -0000
A good case for Invisible XML, though sadly we don't have it integrated into Saxon yet.

The first step here is finding a matching closing paren. The second step is dealing with backslash-escaped parens.

For the first step, I would use xsl:iterate, iterating over the characters of the string (in 4.0 use the fn:characters function, in 3.0 use string-to-codepoints). Maintain a variable $depth over the iteration: increment it on a left paren, decrement it on a right paren, and break the iteration when the depth reaches zero.

Then handling backslashes is just an extra bit of logic: in your xsl:iterate, define a second variable that indicates whether the immediately preceding character is a backslash (or rather, an unescaped backslash), and avoid recognizing parens if it is.

Michael Kay
Saxonica

> On 16 Dec 2024, at 13:24, Roger L Costello costello@xxxxxxxxx <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
> Hi Folks,
>
> I want to convert this:
>
> <REG_EXP>(([\W\w]{1,80})?) <INFO></REG_EXP>
>
> to this:
>
> <REG_EXP>(([\W\w]{1,80})?)</REG_EXP>
>
> Convert this:
>
> <REG_EXP>([A-Z]{2}[0-9A-Z ]{0,13}) <ARF ID></REG_EXP>
>
> to this:
>
> <REG_EXP>([A-Z]{2}[0-9A-Z ]{0,13})</REG_EXP>
>
> I want to remove the junk that follows the regex.
>
> I wrote a recursive function to do this. See below. Is there a simpler way to do it?
>
> -------------------------------------
> <xsl:function name="f:get-regex">
>     <xsl:param name="string"/>
>     <xsl:choose>
>         <xsl:when test="substring($string,1,1) ne '('">
>             <xsl:message>Error! Expecting the regex to start with left paren</xsl:message>
>         </xsl:when>
>         <xsl:otherwise>
>             <xsl:value-of select="concat('(',f:get-regex-helper($string,2,1))"/>
>         </xsl:otherwise>
>     </xsl:choose>
> </xsl:function>
>
> <xsl:function name="f:get-regex-helper">
>     <xsl:param name="string"/>
>     <xsl:param name="index"/>
>     <xsl:param name="count-left-parens-to-match"/>
>     <xsl:choose>
>         <xsl:when test="$count-left-parens-to-match eq 0">
>             <xsl:value-of select="substring($string,1,$index - 1)"/>
>         </xsl:when>
>         <xsl:when test="substring($string,$index,1) eq ')'">
>             <xsl:value-of select="f:get-regex-helper($string,$index+1,$count-left-parens-to-match - 1)"/>
>         </xsl:when>
>         <xsl:otherwise>
>             <xsl:value-of select="f:get-regex-helper($string,$index+1,$count-left-parens-to-match)"/>
>         </xsl:otherwise>
>     </xsl:choose>
> </xsl:function>
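
The xsl:iterate approach described in the reply at the top of this message could be sketched roughly as follows in XSLT 3.0. This is only an illustration under stated assumptions, not code from the thread: the namespace URI bound to the f prefix is a placeholder, the parameter names $depth and $escaped are chosen here for readability, and the sketch does not treat parens inside character classes such as [()] specially. If no balanced group is found, it simply returns the input unchanged.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="http://example.com/functions"
    exclude-result-prefixes="xs f">

  <!-- Copy everything that has no more specific rule. -->
  <xsl:mode on-no-match="shallow-copy"/>

  <!-- Returns the input up to and including the paren that closes the
       first opened group, ignoring backslash-escaped parens.
       The f namespace URI above is a placeholder (assumption). -->
  <xsl:function name="f:get-regex" as="xs:string">
    <xsl:param name="string" as="xs:string"/>
    <xsl:iterate select="string-to-codepoints($string)">
      <!-- Number of currently open, unescaped left parens. -->
      <xsl:param name="depth" as="xs:integer" select="0"/>
      <!-- True when the preceding character was an unescaped backslash. -->
      <xsl:param name="escaped" as="xs:boolean" select="false()"/>
      <xsl:on-completion>
        <!-- No balanced group found: return the input unchanged. -->
        <xsl:sequence select="$string"/>
      </xsl:on-completion>
      <xsl:variable name="char" as="xs:string" select="codepoints-to-string(.)"/>
      <xsl:variable name="new-depth" as="xs:integer" select="
          if ($escaped) then $depth
          else if ($char eq '(') then $depth + 1
          else if ($char eq ')') then $depth - 1
          else $depth"/>
      <xsl:choose>
        <!-- An unescaped right paren brought the depth back to zero: stop here. -->
        <xsl:when test="$char eq ')' and not($escaped) and $new-depth eq 0">
          <xsl:break select="substring($string, 1, position())"/>
        </xsl:when>
        <xsl:otherwise>
          <xsl:next-iteration>
            <xsl:with-param name="depth" select="$new-depth"/>
            <!-- A backslash escapes the next character only if it is not itself escaped. -->
            <xsl:with-param name="escaped" select="not($escaped) and $char eq '\'"/>
          </xsl:next-iteration>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:iterate>
  </xsl:function>

  <!-- Replace the content of each REG_EXP element with the trimmed regex. -->
  <xsl:template match="REG_EXP">
    <xsl:copy>
      <xsl:value-of select="f:get-regex(string(.))"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

With this sketch, f:get-regex('([A-Z]{2}[0-9A-Z ]{0,13}) <ARF ID>') should return '([A-Z]{2}[0-9A-Z ]{0,13})'. Because position() inside xsl:iterate is the index of the current codepoint, it can be passed directly to substring() as the length to keep.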