Re: Regular expression functions (Was: Re: [xsl] comments on December F&O draft)

Subject: Re: Regular expression functions (Was: Re: [xsl] comments on December F&O draft)
From: Jeni Tennison <jeni@xxxxxxxxxxxxxxxx>
Date: Fri, 11 Jan 2002 10:44:09 +0000

Hi Marc,

> you mean: the *[index] is throwing all named subregexes into one array
> and getting the second regardless of its name, right?


> getting an actual parenthesis group out of a named subregex would be
> different, no?

I don't think it has to be, if you use elements with some standard
name to represent them...

Say you had:

<regex name="fancy-number">[0-9]+(\.[0-9]+)?([Ee][+-][0-9]+)?</regex>
<matcher name="two-numbers" regexp=":fancy-number:\s:fancy-number:"/>

And you were matching the string:

  "12.5 3.4E-2"

I was imagining that this would build a tree that looked like
(formatted for clarity - the only whitespace would actually be a
single space between the two fancy-number elements):


Where rxp is associated with some namespace like (for XPath anyway):

So the values of the nodes selected by the following paths would be:

  /                        =>  ("12.5 3.4E-2")
  /fancy-number            =>  ("12.5", "3.4E-2")
  /fancy-number[1]         =>  ("12.5")
  /fancy-number[1]/node()  =>  ("12", ".5")
  /fancy-number[1]/text()  =>  ("12")
  /fancy-number[1]/*[1]    =>  (".5")
  /fancy-number[1]/*[2]    =>  ()
  /fancy-number[2]         =>  ("3.4E-2")
  /fancy-number[2]/*       =>  (".4", "E-2")
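
To make the mapping from match to tree concrete, here is a minimal sketch in Python. It is only an illustration under several assumptions: the element name "group" for unnamed parenthesised groups is hypothetical (the proposal only says "some standard name"), the "root" element stands in for the root node, namespaces are left out, and the numbers are found with a plain scan for fancy-number rather than through a matcher definition. It also assumes that the groups in the regex are not nested, which holds for fancy-number.

```python
import re
import xml.etree.ElementTree as ET

# The fancy-number regex from the message, unchanged.
FANCY_NUMBER = re.compile(r"[0-9]+(\.[0-9]+)?([Ee][+-][0-9]+)?")

# Hypothetical tag for an unnamed parenthesised group; the message only
# says "some standard name".
GROUP_TAG = "group"


def attach(parent, last_child, text):
    """Store text where ElementTree keeps it: parent.text or sibling.tail."""
    if last_child is None:
        parent.text = (parent.text or "") + text
    else:
        last_child.tail = (last_child.tail or "") + text


def number_element(match):
    """One element per match, with one child per group that took part.

    Assumes the groups are not nested inside each other.
    """
    elem = ET.Element("fancy-number")
    cursor, last = match.start(), None
    for i in range(1, match.re.groups + 1):
        if match.group(i) is None:  # optional group did not take part
            continue
        attach(elem, last, match.string[cursor:match.start(i)])
        last = ET.SubElement(elem, GROUP_TAG)
        last.text = match.group(i)
        cursor = match.end(i)
    attach(elem, last, match.string[cursor:match.end()])
    return elem


def build_tree(text):
    """Wrap every fancy-number in text; other characters stay as text."""
    root = ET.Element("root")  # stands in for the root node "/"
    cursor, last = 0, None
    for match in FANCY_NUMBER.finditer(text):
        attach(root, last, text[cursor:match.start()])
        last = number_element(match)
        root.append(last)
        cursor = match.end()
    attach(root, last, text[cursor:])
    return root


def value(node):
    """String value of a node: all descendant text, in document order."""
    return "".join(node.itertext())


root = build_tree("12.5 3.4E-2")
first, second = root.findall("fancy-number")
print(value(root))                  # /
print(value(first), value(second))  # /fancy-number
print(first.text)                   # /fancy-number[1]/text()
print([value(c) for c in first])    # /fancy-number[1]/*
print([value(c) for c in second])   # /fancy-number[2]/*
```

A group that does not take part in the match simply produces no element, which is why the first number has only one child and a path asking for its second child selects nothing.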

If you have named subexpressions within a named subexpression, that
just changes the name of the element created for that subexpression.
So if you had:

<regex name="mantissa">[0-9]+(\.[0-9]+)?</regex>
<regex name="exponent">[Ee][+-][0-9]+</regex>
<regex name="fancy-number">:mantissa::exponent:?</regex>
<matcher name="two-numbers" regexp=":fancy-number:\s:fancy-number:"/>
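
As a rough sketch of how definitions like these could be composed, the Python below expands each :name: reference recursively into the text of its definition. The reference syntax (a name between two colons), the DEFINITIONS dictionary and the expand function are all assumptions made for illustration, not part of any draft. The separator is written as \s because the sample string has a space between the two numbers. Circular references are not handled.

```python
import re

# Definitions copied from the message; ":name:" refers to another definition.
DEFINITIONS = {
    "mantissa": r"[0-9]+(\.[0-9]+)?",
    "exponent": r"[Ee][+-][0-9]+",
    "fancy-number": r":mantissa::exponent:?",
}

# Assumed reference syntax: a name between two colons.
REFERENCE = re.compile(r":([A-Za-z][A-Za-z0-9-]*):")


def expand(pattern, definitions):
    """Recursively replace each :name: with its definition.

    Each expansion is wrapped in a non-capturing group so that a
    quantifier following the reference applies to the whole definition.
    """
    def substitute(ref):
        return "(?:" + expand(definitions[ref.group(1)], definitions) + ")"

    return REFERENCE.sub(substitute, pattern)


two_numbers = re.compile(expand(r":fancy-number:\s:fancy-number:", DEFINITIONS))
print(two_numbers.pattern)
print(bool(two_numbers.fullmatch("12.5 3.4E-2")))
```

This only shows the composition step. Turning each expanded reference into a named element would also need a record of which span came from which definition.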

Matching the same string would give you a tree like:


I should note that nothing existing in XPath or XSLT automatically
creates a tree in this way. However, several EXSLT functions do (as a
means of returning 'sequences', in fact!). I suspect that the
introduction of user-defined functions in XSLT will lead to more
functions that do this, but don't know whether people would feel it
was acceptable for a built-in function.


Jeni Tennison
