Re: [xsl] xslt 2.0 regex

Subject: Re: [xsl] xslt 2.0 regex
From: Brandon Ibach <brandon.ibach@xxxxxxxxxxxxxxxxxxx>
Date: Sat, 17 Mar 2012 13:38:25 -0400

Both XSLT 1.0 and 2.0 define variable names as QNames, so they *can*
have a ":" in them.

Setting that aside for the moment: Tony pointed out one consequence,
but the underlying issue is that the "|" operator has the lowest
precedence in regex, so you usually need parentheses around a list of
alternatives to get things right.  Your NameStartChar.re has a "$"
(written as the hex character reference &#x024;) at the beginning, so
that regex is actually "$[A-Z] | _ | [a-z] | ...", which means it will
match "(a dollar sign followed by an upper-case English letter) or (an
underscore) or (a lower-case English letter) or ...".
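
To make the precedence point concrete, here is a small sketch, assuming
an XSLT 2.0 processor. The template name "main" and the test strings
are made up for illustration, and the regex is cut down to its ASCII
part with the whitespace left out. The outer parentheses in the first
test only make the "^" and "$" anchors apply to the whole pattern; they
do not change how the alternatives bind to the leading "\$".

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical entry point; run it as the initial template. -->
  <xsl:template name="main">
    <!-- Ungrouped alternatives: "\$" belongs to the first branch only,
         so a bare "x" matches through the [a-z] branch. Gives true. -->
    <xsl:value-of select="matches('x', '^(\$[A-Z]|_|[a-z])$')"/>
    <xsl:text>&#10;</xsl:text>

    <!-- Grouped alternatives: "\$" is now required in front of every
         branch, so a bare "x" no longer matches. Gives false. -->
    <xsl:value-of select="matches('x', '^\$([A-Z]|_|[a-z])$')"/>
    <xsl:text>&#10;</xsl:text>

    <!-- With the dollar sign present, the grouped form matches.
         Gives true. -->
    <xsl:value-of select="matches('$x', '^\$([A-Z]|_|[a-z])$')"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```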

Actually, it might not even match that: "$" is a special character in
regex (it anchors the match at the end of the string), so you should
escape it with a backslash to match it literally.  (I think some regex
flavors have rules that let it match literally even without the
backslash, depending on what follows it, but it is best to be
explicit.)

All that said, I got this to work by dropping the "$" from the start
of NameStartChar.re and changing Name.re to:

concat("\$(", $NameStartChar.re, ")(", $NameChar.re,")*")
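
Put together, the change looks roughly like the sketch below. It is
only a sketch under a few assumptions: an XSLT 2.0 processor, the
character ranges trimmed to the ASCII ones for brevity, and the
whitespace removed from the variables, because whitespace in the regex
is matched literally unless the "x" flag is given. The template name
"main" and the sample string are invented for the example.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">
  <xsl:output method="text"/>

  <!-- ASCII subset only; no leading "$" and no whitespace. -->
  <xsl:variable name="NameStartChar.re" as="xs:string"
                select="'[A-Z]|_|[a-z]'"/>

  <xsl:variable name="NameChar.re" as="xs:string"
                select="concat($NameStartChar.re, '|-|\.|[0-9]')"/>

  <!-- Escaped "$", then each list of alternatives in its own group. -->
  <xsl:variable name="Name.re" as="xs:string"
                select="concat('\$(', $NameStartChar.re, ')(',
                               $NameChar.re, ')*')"/>

  <!-- Hypothetical entry point; run it as the initial template. -->
  <xsl:template name="main">
    <!-- Should pick out $fred and $my-var.1, one per line. -->
    <xsl:analyze-string select="'concat($fred, $my-var.1)'"
                        regex="{$Name.re}">
      <xsl:matching-substring>
        <xsl:value-of select="."/>
        <xsl:text>&#10;</xsl:text>
      </xsl:matching-substring>
    </xsl:analyze-string>
  </xsl:template>
</xsl:stylesheet>
```

The regex attribute of xsl:analyze-string is an attribute value
template, so the variable goes inside curly braces there; any literal
curly braces in a regex written directly in that attribute would need
to be doubled.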

-Brandon :)


On Sat, Mar 17, 2012 at 12:43 PM, davep <davep@xxxxxxxxxxxxx> wrote:
> On 17/03/12 16:29, Tony Graham wrote:
>>
>> On Sat, March 17, 2012 4:14 pm, davep wrote:
>> ...
>>>
>>> It's still not working
>>>
>>> <xsl:variable name="NameStartChar.re" as="xs:string">
>>>   &#x024;[A-Z]|_|[a-z] |
>>>   [&#xC0;-&#xD6;] | [&#xD8;-&#xF6;] |
>>>   [&#xF8;-&#x2FF;] | [&#x370;-&#x37D;] |
>>>   [&#x37F;-&#x1FFF;] | [&#x200C;-&#x200D;] |
>>>   [&#x2070;-&#x218F;] | [&#x2C00;-&#x2FEF;] |
>>>   [&#x3001;-&#xD7FF;] | [&#xF900;-&#xFDCF;] |
>>>   [&#xFDF0;-&#xFFFD;] | [&#x10000;-&#xEFFFF;]
>>> </xsl:variable>
>>>
>>> <xsl:variable name="NameChar.re"  as="xs:string"
>>>              select="concat($NameStartChar.re,' |
>>> - | \. | [0-9] |&#xB7; | [&#x0300;-&#x036F;] |
>>> [&#x203F;-&#x2040;]')"/>
>>>
>>>
>>> <xsl:variable name='Name.re'
>>>              select='concat($NameStartChar.re,
>>> "(", $NameChar.re,")*")'/>
>>
>>
>> Why not use '\i' and '\c' from
>> http://www.w3.org/TR/xmlschema-2/#charcter-classes?
>
>
> For which range please Tony? Err....
>
> \i includes : which is wrong?
> \c looks good though! Ah no. Again it's NameChar from
> http://www.w3.org/TR/2000/WD-xml-2e-20000814#NT-NameChar
> which is more than allowed for xsl:variable @name?
>
>
>>
>> Otherwise, you may want '(' and ')' around $NameStartChar.re in $Name.re,
>> otherwise (to mix variable expansions) it looks like
>> '...|[&#x203F;-&#x2040;]($NameChar.re)*" and you'll only match
>> multi-character names when they begin with a character in the range
>> [&#x203F;-&#x2040;].
>
>
> As I read it (or more accurately fail to read it correctly)
> It's NameChar less :
> followed by (Name less :)+
>
> Simpler version [A-Za-z0-9]+ and the i18N additions,
> but I can't get the simpler one working.
>
>
>
>
>
> It's only matching on the first letter of a variable currently....
>
> <xsl:variable select="$fred"/>
>
> Produces
> "xsl:variable"
>            [f]
> "xsl:variable"
>            [r]
>  "xsl:variable"
>            [e]
>  "xsl:variable"
>            [d]
>
> so something is seriously wrong.
>
>
>
>>
>> Regards,
>>
>>
>> Tony Graham                                   tgraham@xxxxxxxxxx
>> Consultant                                 http://www.mentea.net
>> Mentea       13 Kelly's Bay Beach, Skerries, Co. Dublin, Ireland
>>  --  --  --  --  --  --  --  --  --  --  --  --  --  --  --  --
>>     XML, XSL-FO and XSLT consulting, training and programming
>>
>>
>
>
>
> regards
>
> --
> Dave Pawson
> XSLT XSL-FO FAQ.
> http://www.dpawson.co.uk
