Subject: Re: [xsl] String contains a regex and then junk ... how to remove the junk?
From: "Michael Kay mike@xxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Mon, 16 Dec 2024 13:41:25 -0000

A good case for Invisible XML, though sadly we don't have it integrated into
Saxon yet.

The first step here is finding a matching closing paren. The second step is
dealing with backslash-escaped parens.

For the first step, I would use xsl:iterate iterating over the characters of
the string (in 4.0 use the fn:characters function, in 3.0 use
string-to-codepoints). Maintain a variable $depth over the iteration,
increment it on a left paren, decrement it on a right paren, break the
iteration when the depth reaches zero.
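
A minimal sketch of this first step, assuming XSLT 3.0 and string-to-codepoints. The namespace URI bound to f: and the choice to return an empty string when the parens never balance are assumptions made for illustration; they are not from the original post. Escaped parens are not handled here, and parens inside a character class such as [()] are counted like any others.

```xslt
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:functions"
    exclude-result-prefixes="xs f">

  <xsl:output method="text"/>

  <!-- Returns the leading parenthesized group of $string, or '' if the
       string does not start with '(' or the parens never balance.
       (Returning '' is an assumption of this sketch.) -->
  <xsl:function name="f:get-regex" as="xs:string">
    <xsl:param name="string" as="xs:string"/>
    <xsl:iterate select="string-to-codepoints($string)">
      <xsl:param name="depth" as="xs:integer" select="0"/>
      <!-- Reached only if the loop ends without a break: no balanced group. -->
      <xsl:on-completion select="''"/>
      <!-- 40 = '(' and 41 = ')' -->
      <xsl:variable name="new-depth" as="xs:integer"
          select="if (. eq 40) then $depth + 1
                  else if (. eq 41) then $depth - 1
                  else $depth"/>
      <xsl:choose>
        <xsl:when test="$new-depth le 0">
          <!-- Depth is back to zero on a right paren: keep everything up to
               and including the current character. Anything else at depth
               zero means the string did not start with '('. -->
          <xsl:break select="if (. eq 41 and $new-depth eq 0)
                             then substring($string, 1, position())
                             else ''"/>
        </xsl:when>
        <xsl:otherwise>
          <xsl:next-iteration>
            <xsl:with-param name="depth" select="$new-depth"/>
          </xsl:next-iteration>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:iterate>
  </xsl:function>

  <!-- Tries the function on the two sample strings from the question. -->
  <xsl:template name="xsl:initial-template">
    <xsl:value-of select="f:get-regex('(([\W\w]{1,80})?) &lt;INFO>')"/>
    <xsl:text>&#10;</xsl:text>
    <xsl:value-of select="f:get-regex('([A-Z]{2}[0-9A-Z ]{0,13}) &lt;ARF ID>')"/>
  </xsl:template>

</xsl:stylesheet>
```

Run with the initial template, this should print (([\W\w]{1,80})?) and ([A-Z]{2}[0-9A-Z ]{0,13}), each on its own line. In 4.0 the same loop could iterate over the characters themselves rather than their codepoints.
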
Then handling backslashes is just an extra bit of logic: in your xsl:iterate,
define a second variable that indicates whether the immediately preceding
character is a backslash (or rather, an unescaped backslash) and avoid
recognizing parens if it is.
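
A sketch of the same loop with the extra backslash logic, again assuming XSLT 3.0. The function name f:get-regex-esc, the f: namespace URI, and the empty-string result for unbalanced input are hypothetical choices for illustration. The $escaped parameter is true only when the preceding character was an unescaped backslash, so in \\( the paren is still counted.

```xslt
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:functions"
    exclude-result-prefixes="xs f">

  <xsl:output method="text"/>

  <!-- Like a plain depth counter, but a paren that directly follows an
       unescaped backslash does not change the depth. -->
  <xsl:function name="f:get-regex-esc" as="xs:string">
    <xsl:param name="string" as="xs:string"/>
    <xsl:iterate select="string-to-codepoints($string)">
      <xsl:param name="depth" as="xs:integer" select="0"/>
      <xsl:param name="escaped" as="xs:boolean" select="false()"/>
      <!-- No balanced group found (an assumption: return ''). -->
      <xsl:on-completion select="''"/>
      <!-- 40 = '(', 41 = ')', 92 = backslash -->
      <xsl:variable name="new-depth" as="xs:integer"
          select="if ($escaped) then $depth
                  else if (. eq 40) then $depth + 1
                  else if (. eq 41) then $depth - 1
                  else $depth"/>
      <xsl:choose>
        <xsl:when test="$new-depth le 0">
          <xsl:break select="if (. eq 41 and $new-depth eq 0)
                             then substring($string, 1, position())
                             else ''"/>
        </xsl:when>
        <xsl:otherwise>
          <xsl:next-iteration>
            <xsl:with-param name="depth" select="$new-depth"/>
            <!-- A backslash escapes the next character only if it is not
                 itself escaped. -->
            <xsl:with-param name="escaped" select="not($escaped) and . eq 92"/>
          </xsl:next-iteration>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:iterate>
  </xsl:function>

  <!-- Hypothetical input containing an escaped left paren. -->
  <xsl:template name="xsl:initial-template">
    <xsl:value-of select="f:get-regex-esc('(a\(b) junk')"/>
  </xsl:template>

</xsl:stylesheet>
```

For the made-up input (a\(b) junk this should return (a\(b), because the escaped left paren leaves the depth at 1 and the following right paren closes the group.
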

Michael Kay
Saxonica

> On 16 Dec 2024, at 13:24, Roger L Costello costello@xxxxxxxxx <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
> Hi Folks,
>
> I want to convert this:
>
> <REG_EXP>(([\W\w]{1,80})?) <INFO></REG_EXP>
>
> to this:
>
> <REG_EXP>(([\W\w]{1,80})?)</REG_EXP>
>
> Convert this:
>
> <REG_EXP>([A-Z]{2}[0-9A-Z ]{0,13}) <ARF ID></REG_EXP>
>
> to this:
>
> <REG_EXP>([A-Z]{2}[0-9A-Z ]{0,13})</REG_EXP>
>
> I want to remove the junk that follows the regex.
>
> I wrote a recursive function to do this. See below. Is there a simpler way to do it?
>
> -------------------------------------
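> <xsl:function name="f:get-regex">
>   <xsl:param name="string"/>
>   <xsl:choose>
>     <xsl:when test="substring($string,1,1) ne '('">
>       <xsl:message>Error! Expecting the regex to start with left paren</xsl:message>
>     </xsl:when>
>     <xsl:otherwise>
>       <!-- Scan from character 2 with one left paren still open. The helper
>            returns the text from position 1, so no '(' is prepended. -->
>       <xsl:value-of select="f:get-regex-helper($string,2,1)"/>
>     </xsl:otherwise>
>   </xsl:choose>
> </xsl:function>
>
> <xsl:function name="f:get-regex-helper">
>   <xsl:param name="string"/>
>   <xsl:param name="index"/>
>   <xsl:param name="count-left-parens-to-match"/>
>   <xsl:choose>
>     <!-- Every left paren is matched: the regex ends just before $index. -->
>     <xsl:when test="$count-left-parens-to-match eq 0">
>       <xsl:value-of select="substring($string,1,$index - 1)"/>
>     </xsl:when>
>     <!-- End of string reached without balancing: stop the recursion. -->
>     <xsl:when test="$index gt string-length($string)">
>       <xsl:message>Error! Unbalanced parens in regex</xsl:message>
>     </xsl:when>
>     <!-- A nested left paren needs its own right paren. -->
>     <xsl:when test="substring($string,$index,1) eq '('">
>       <xsl:value-of select="f:get-regex-helper($string,$index+1,$count-left-parens-to-match + 1)"/>
>     </xsl:when>
>     <xsl:when test="substring($string,$index,1) eq ')'">
>       <xsl:value-of select="f:get-regex-helper($string,$index+1,$count-left-parens-to-match - 1)"/>
>     </xsl:when>
>     <xsl:otherwise>
>       <xsl:value-of select="f:get-regex-helper($string,$index+1,$count-left-parens-to-match)"/>
>     </xsl:otherwise>
>   </xsl:choose>
> </xsl:function>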