Subject: RE: Regular expression functions (Was: Re: [xsl] comments on December F&O draft)
From: "Marc Portier" <mpo@xxxxxxxxxxxxxxxx>
Date: Fri, 11 Jan 2002 00:49:00 +0100

Hi Jeni,

> -----Original Message-----
> From: Jeni Tennison [mailto:jeni@xxxxxxxxxxxxxxxx]
> Sent: Thursday, 10 January 2002 14:05
> To: Marc Portier
> Cc: Steven Noels; xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: Re: Regular expression functions (Was: Re: [xsl] comments on
> December F&O draft)
>
> Hi Marc,
>
> > some
> > <regex name="fancy-number">[0-9]+(\.[0-9]+)?([Ee][+-][0-9]+)?</regex>
> >
> > could then later be used inside
> > <matcher name="" regex="(other groups):fancy-number:(other groups)">
> > ... while nested matchers or output-selecting elements could then
> > use group selections like
> > 1. <... select-group="1"> ... or 2, referring to counting the
> > parentheses in the scoped regex of this matcher
> > 2. <... select-group=":fancy-number:2" >
> > </matcher>
> >
> > could be challenging to implement (spontaneous idea of using the
> > indexes as offsets in counting parentheses)
>
> I like this method better than the Omnimark method of assigning the
> names within the regular expression itself, because it doesn't clutter
> the regular expression (if anything it makes it more readable) and it
> allows regular expressions to be reused.

yep

> There are a couple of issues that would need to be worked out with it,
> though. What happens if you have a regular expression that involves
> two instances of the named subexpression at the same level:
>
> <matcher name="two-numbers" regexp=":fancy-number:\w:fancy-number:">
> ...
> </matcher>
>
> You need to have separate indexes to indicate which one you're talking
> about, plus some kind of syntax to pull out submatches within the
> named subexpression. Borrowing from XPath syntax (which might be a bad
> idea), you might have:
>
> fancy-number[2]/*[2]

Yep. I was short on internet time just before I left, when sending my
reply. It crossed my mind later that double reuse of one regex inside
another could indeed occur; nice to see there is already a syntax in the
world of the XSLT-aware that would help out.

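The "indexes as offsets" idea and the fancy-number[2] occurrence index discussed above can be sketched roughly as follows. This is only an illustration in Python, not anything proposed in the thread: the NAMED registry, expand() and select() are hypothetical names, :name: references are assumed to appear only between well-formed pattern fragments, and nested named regexes are not handled.

```python
import re

# Hypothetical registry standing in for <regex name="..."> declarations.
NAMED = {"fancy-number": r"[0-9]+(\.[0-9]+)?([Ee][+-][0-9]+)?"}

# A reference looks like :name: (assumed syntax, taken from the thread).
REFERENCE = re.compile(r":([A-Za-z][A-Za-z0-9-]*):")


def expand(pattern):
    """Expand :name: references into wrapped groups.

    Returns the plain regex source and, per name, the group index of the
    wrapper group for each occurrence, in order of appearance.
    """
    parts, starts = [], {}
    groups_so_far = 0
    pos = 0
    for ref in REFERENCE.finditer(pattern):
        literal = pattern[pos:ref.start()]
        body = NAMED[ref.group(1)]
        parts.append(literal)
        # Assumes the literal fragment compiles on its own.
        groups_so_far += re.compile(literal).groups
        groups_so_far += 1  # wrapper group for this occurrence
        starts.setdefault(ref.group(1), []).append(groups_so_far)
        parts.append("(" + body + ")")
        groups_so_far += re.compile(body).groups
        pos = ref.end()
    parts.append(pattern[pos:])
    return "".join(parts), starts


def select(match, starts, name, occurrence, subgroup=0):
    """Return parenthesis group `subgroup` of the 1-based `occurrence`
    of a named regex; subgroup 0 is the whole occurrence."""
    return match.group(starts[name][occurrence - 1] + subgroup)


source, starts = expand(r":fancy-number:\w:fancy-number:")
m = re.fullmatch(source, "1.5e+3x42E-7")
exponent_of_second = select(m, starts, "fancy-number", 2, 2)
```

Here select(m, starts, "fancy-number", 2, 2) plays the role of select-group="fancy-number[2]/2": the stored wrapper index is the offset, and the local parenthesis number is added to it.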
> to indicate the second subexpression of the second fancy-number
> subexpression in the matched string.

Trying to catch it completely, though: you mean that *[index] throws all
named subregexes into one array and gets the second one regardless of
its name, right? Getting an actual parenthesis group out of a named
subregex would be different, no?

Example of the nuance I'm seeing: how would I select the exponent group
out of the second matched fancy-number in the following setting?

No sub-subregexes, only parenthesis groups:

<regex name="fancy-number">[0-9]+(\.[0-9]+)?([Ee][+-][0-9]+)?</regex>
<matcher name="two-numbers" regexp=":fancy-number:\w:fancy-number:">
  ... select-group="fancy-number[2]/2" ...
</matcher>

compared to:

<regex name="exponent">[Ee][+-][0-9]+</regex>
<regex name="fractalpart">\.[0-9]+</regex>
<regex name="fancy-number">[0-9]+:fractalpart:?:exponent:?</regex>
<matcher name="two-numbers" regexp=":fancy-number:\w:fancy-number:">
  ... select-group="fancy-number[2]/*[2]"
  or select-group="fancy-number[2]/exponent" ...
</matcher>

> Actually, that syntax isn't all that bad - you can imagine the matcher
> actually builds up a tree structure based on the subexpression

yep, need some more imagination before actually building it though :-)

> matches (you need 'anonymous' elements for unnamed subexpressions, but
> you should be able to get away with that using elements in some
> restricted namespace or something)...

Mmm... I don't understand how we could get unnamed subexpressions. As
far as I can see now, we'd need :name: to slice them in, no?

> > this also makes me think about your earlier mentioning of dynamic
> > regexes you probably expect anything that qualifies as a
> > text-representing xsl parameter to be possibly carrying part of the
> > regex to be executed...
>
> I think that if you could build the named regular expressions
> dynamically, then this idea would work fine.
> Going back to the keyword
> example that I used on an earlier mail, you could do:
>
> <xsl:regexp name="keyword-as-word"
>             select="concat('\W', $keyword, '\W')" />
>
> If named regular expressions were like variables, you could assign
> them values at the global or local level...

thx

> Cheers,
>
> Jeni
>
> ---
> Jeni Tennison
> http://www.jenitennison.com/

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
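
As a rough illustration of the dynamically built named regex quoted above, here is a minimal Python sketch of what concat('\W', $keyword, '\W') would produce. The function name, the `named` dictionary and the use of re.escape are assumptions made for the example; the thread does not say whether the keyword should be treated as literal text or as part of the regex.

```python
import re


def keyword_as_word(keyword):
    # Equivalent of concat('\W', $keyword, '\W') from the quoted example.
    # re.escape is an added assumption: the keyword is taken literally.
    return r"\W" + re.escape(keyword) + r"\W"


# Hypothetical registry: named regexes assigned like variables.
named = {"keyword-as-word": keyword_as_word("regex")}

hit = re.search(named["keyword-as-word"], "a regex here")
miss = re.search(named["keyword-as-word"], "regexes here")
```

With this pattern the keyword only matches when a non-word character sits on both sides of it, so `hit` is a match and `miss` is None.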