Re: [xsl] xslt 2.0 regex

Subject: Re: [xsl] xslt 2.0 regex
From: Wolfgang Laun <wolfgang.laun@xxxxxxxxx>
Date: Sat, 17 Mar 2012 18:56:18 +0100

Using '|' to separate alternatives that each match a single character
is unnecessary; one character class does the same job. Omitting a few
ranges and the leading '$', you can write, for instance:
<xsl:variable name="NameStartChar.re" as="xs:string">
[A-Z:_a-z&#xC0;-&#xD6;&#xD8;-&#xF6;&#xF8;-&#x2FF;&#x370;-&#x37D;.....]
</xsl:variable>

This reduces the risk of the regex failing because of operator
precedence. The final RE then becomes

    ^[...][...]*$
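
As a rough illustration of the precedence point, here is a small sketch
in Python's re module. It is not the XPath regex dialect, so only the
behaviour of '|' and of character classes carries over. The names
start_alts, name_alts, matches and the rest are made up for this
example, and all but one of the Unicode ranges are left out.

```python
import re

# Cut-down stand-ins for the thread's NameStartChar.re and NameChar.re;
# most Unicode ranges are omitted for brevity (an assumption of this sketch).
start_alts = r"[A-Z]|_|[a-z]|[\u00C0-\u00D6]"
name_alts = start_alts + r"|-|\.|[0-9]|\u00B7"


def matches(pattern, text):
    """Return every substring of text that the pattern matches, left to right."""
    return [m.group(0) for m in re.finditer(pattern, text)]


# Plain concatenation: '|' binds loosest, so "(...)*" attaches only to the
# last alternative, and [a-z] on its own matches one letter at a time.
broken = start_alts + "(" + name_alts + ")*"
broken_hits = matches(broken, "fred")

# Parenthesizing the first group restores "start char, then name chars".
grouped = "(" + start_alts + ")(" + name_alts + ")*"
grouped_hits = matches(grouped, "fred")

# One character class per variable leaves no top-level '|' to go wrong.
start_class = r"[A-Z_a-z\u00C0-\u00D6]"
name_class = r"[-.0-9\u00B7A-Z_a-z\u00C0-\u00D6]"
anchored = "^" + start_class + name_class + "*$"

print(broken_hits)   # ['f', 'r', 'e', 'd']
print(grouped_hits)  # ['fred']
print(bool(re.search(anchored, "fred")), bool(re.search(anchored, "1fred")))
```

The first result reproduces the one-letter-at-a-time symptom reported
further down the thread; the other two forms match the whole name.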

-W

On 17 March 2012 18:38, Brandon Ibach <brandon.ibach@xxxxxxxxxxxxxxxxxxx>
wrote:
>
> Both XSLT 1.0 and 2.0 define variable names as QNames, so they *can*
> have a ":" in them.
>
> Ignoring that for the moment, though, Tony pointed out one consequence
> of this, but the bigger issue is that the "|" operator in regex is
> fairly low precedence, so you often need some parentheses around the
> list of alternatives to get things right.  Your NameStartChar.re has a
> hex-char-ref-encoded "$" at the beginning, so that regex is actually
> "$[A-Z] | _ | [a-z] | ...", which means it will match "(a dollar sign
> followed by an upper-case English letter) or (an underscore) or (a
> lower-case English letter) or ...".
>
> Actually, it might not even match that, since the "$" is a special
> character in regex, so you should escape it with a backslash to match
> it literally (though I think there are some rules which allow it to
> match literally even without the backslash, depending on what follows
> it, but best to be explicit).
>
> All that said, I got this to work by dropping the "$" from the start
> of NameStartChar.re and changing Name.re to:
>
> concat("\$(", $NameStartChar.re, ")(", $NameChar.re,")*")
>
> -Brandon :)
>
>
> On Sat, Mar 17, 2012 at 12:43 PM, davep <davep@xxxxxxxxxxxxx> wrote:
> > On 17/03/12 16:29, Tony Graham wrote:
> >>
> >> On Sat, March 17, 2012 4:14 pm, davep wrote:
> >> ...
> >>>
> >>> It's still not working
> >>>
> >>> <xsl:variable name="NameStartChar.re" as="xs:string">
> >>>   &#x024;[A-Z]|_|[a-z] |
> >>>   [&#xC0;-&#xD6;] | [&#xD8;-&#xF6;] |
> >>>   [&#xF8;-&#x2FF;] | [&#x370;-&#x37D;] |
> >>>   [&#x37F;-&#x1FFF;] | [&#x200C;-&#x200D;] |
> >>>   [&#x2070;-&#x218F;] | [&#x2C00;-&#x2FEF;] |
> >>>   [&#x3001;-&#xD7FF;] | [&#xF900;-&#xFDCF;] |
> >>>   [&#xFDF0;-&#xFFFD;] | [&#x10000;-&#xEFFFF;]
> >>> </xsl:variable>
> >>>
> >>> <xsl:variable name="NameChar.re"  as="xs:string"
> >>>              select="concat($NameStartChar.re,' |
> >>> - | \. | [0-9] |&#xB7; | [&#x0300;-&#x036F;] |
> >>> [&#x203F;-&#x2040;]')"/>
> >>>
> >>>
> >>> <xsl:variable name='Name.re'
> >>>              select='concat($NameStartChar.re,
> >>> "(", $NameChar.re,")*")'/>
> >>
> >>
> >> Why not use '\i' and '\c' from
> >> http://www.w3.org/TR/xmlschema-2/#charcter-classes?
> >
> >
> > For which range please Tony? Err....
> >
> > \i includes : which is wrong?
> > \c looks good though! Ah no. Again it's NameChar from
> > http://www.w3.org/TR/2000/WD-xml-2e-20000814#NT-NameChar
> > which is more than allowed for xsl:variable @name?
> >
> >
> >>
> >> Otherwise, you may want '(' and ')' around $NameStartChar.re in
> >> $Name.re,
> >> otherwise (to mix variable expansions) it looks like
> >> '...|[&#x203F;-&#x2040;]($NameChar.re)*" and you'll only match
> >> multi-character names when they begin with a character in the range
> >> [&#x203F;-&#x2040;].
> >
> >
> > As I read it (or more accurately fail to read it correctly)
> > It's NameChar less :
> > followed by (Name less :)+
> >
> > A simpler version is [A-Za-z0-9]+ plus the i18n additions,
> > but I can't get the simpler one working.
> >
> >
> >
> >
> >
> > It's only matching on the first letter of a variable currently....
> >
> > <xsl:variable select="$fred"/>
> >
> > Produces
> > "xsl:variable"
> >            [f]
> > "xsl:variable"
> >            [r]
> > "xsl:variable"
> >            [e]
> > "xsl:variable"
> >            [d]
> >
> > so something is seriously wrong.
> >
> >
> >
> >>
> >> Regards,
> >>
> >>
> >> Tony Graham                                   tgraham@xxxxxxxxxx
> >> Consultant                                 http://www.mentea.net
> >> Mentea       13 Kelly's Bay Beach, Skerries, Co. Dublin, Ireland
> >>  --  --  --  --  --  --  --  --  --  --  --  --  --  --  --  --
> >>     XML, XSL-FO and XSLT consulting, training and programming
> >>
> >>
> >
> >
> >
> > regards
> >
> > --
> > Dave Pawson
> > XSLT XSL-FO FAQ.
> > http://www.dpawson.co.uk
