RE: Regular expression functions (Was: Re: [xsl] comments on December F&O draft)

Subject: RE: Regular expression functions (Was: Re: [xsl] comments on December F&O draft)
From: "Hunsberger, Peter" <Peter.Hunsberger@xxxxxxxxxx>
Date: Fri, 11 Jan 2002 09:08:57 -0600

Hi Marc,

>> If a "matcher" explicitly returns a tree structure you could view it as
>> sending the results to the output document.  Thus, wrapping it in
>> a variable
>> would allow you to manipulate the results in a natural  (XSL at
>> least) way:
>>
>> 	<xsl:variable name="gunk">
>> 	  	<xsl:apply-templates select-regexp=":fancy-number:\w:fancy-number:" />
>> 	</xsl:variable>
>> 	<xsl:value-of select="$gunk[2]/*[2]"/>
>>
>> (The template might actually be doing a "match-regexp", but to keep things
>> concise let's pretend the example is complete :-).
>>
>> This seems to minimize the need to invent new stuff? Adding a similar form
>> of regexp qualifier/attribute to copy and copy-of would seem to handle all
>> general cases, no?
>
> it does indeed, some questions/2nd thoughts
> in which context would this be useful?
> on what input string would the regex be matched? the value of . (current
> node/list/string)? its serialized version?

I guess one could slice that three ways: 1) match against the value of the
current node; 2) more usefully, use it as a selector over the entire subtree
below the current context, picking nodes by their values; 3) probably a little
strange to figure out: apply the regexp to some form of XPath representation
of the nodes and select on that.
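
To make option 2 a bit more concrete, here is a minimal sketch in Python
rather than XSLT, since no regexp attribute exists in XSLT today.  The helper
name select_by_value and the sample document are made up for illustration,
and "value" is assumed to mean an element's own leading text, which is
narrower than the XPath string-value:

```python
import re
import xml.etree.ElementTree as ET


def select_by_value(context, pattern):
    """Return descendants of `context` whose own text fully matches `pattern`.

    Hypothetical helper: "value" here is just the element's leading text,
    not the full XPath string-value of the node.
    """
    rx = re.compile(pattern)
    return [
        el
        for el in context.iter()  # document order, starts with context itself
        if el is not context and el.text and rx.fullmatch(el.text.strip())
    ]


# Made-up sample input.
doc = ET.fromstring(
    "<order><qty>12</qty><note>rush</note><price>3.50</price></order>"
)

# Select every descendant whose value looks like a number.
matched = select_by_value(doc, r"\d+(\.\d+)?")
names = [el.tag for el in matched]
```

The result is a list of the original document nodes, so they keep their own
names and no artificial naming of the matches is needed.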

Complete serialization would seem to make little sense since one wants to be
able to select nodes out of the mess?  But that's an assumption on my part,
it seems you want to be able to select substrings....

>
> as for the syntax, from a unique identifying view it should in the given
> example at least be naming the subregex used, so more like
> $gunk/fancy-number[2]/ or
> $gunk/*[2]/

Well, if it's selecting nodes by values, then one would still have the
original document nodes to sort through and no artificial naming is needed
(except you need to realize that one might have multiple copies of the
original nodes).  From what I understand this doesn't fit the current data
model: one would want sequences to contain the equivalent of result tree
fragments (RTFs)?

>as for the nature of how matchers work, there was an earlier remark on
>regex's not suitably returning trees, and indeed, my finding is that they
>are rather returning tables in which the rows are counted by every match,
>and the columns are counted by every group within that match...

I guess that's partly the data model problem: tables are forests of skinny
trees...?  Forests aren't allowed but then again, nor are tables.
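
A rough illustration of that table view, again as a Python sketch and not a
proposal for syntax: one row per match, one column per group, then the same
data rewrapped as a forest of skinny trees.  The function names, the sample
pattern, and the element names "match" and "group" are all assumptions made
up for the example:

```python
import re
import xml.etree.ElementTree as ET


def match_table(pattern, text):
    """One row per match, one column per capturing group."""
    return [m.groups() for m in re.finditer(pattern, text)]


def to_forest(table):
    """Rewrap each row as a small <match> tree with one <group> child per column."""
    forest = []
    for row in table:
        match = ET.Element("match")
        for cell in row:
            ET.SubElement(match, "group").text = cell
        forest.append(match)
    return forest


# Made-up pattern and input: a number, a dash, then a word.
table = match_table(r"(\d+)-(\w+)", "12-ab 7-cd 300-xyz")
forest = to_forest(table)

# Second match, second group.  Python counts from 0, so this corresponds
# to something like $gunk[2]/*[2] in the quoted example.
cell = table[1][1]
same_cell = forest[1][1].text
```

Either shape carries the same information; the forest form just gives each
row a root element so that positional paths can address the cells.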

>dunnow if all of this makes sense,
>finding something that gives a natural feel to both regex and xslt savvy
>will not be easy.

I'm sort of barely able to keep up with it.  I'm comfortable with both regexp
and XSLT but I've never tried to marry the two.  I guess my vision of regexp
in XSLT is as a way of selecting nodes and not as a way of selecting
strings.  As such, I don't expect a way of tracking or naming substrings.
If I want to manipulate substrings of the output document, then I'd expect
to do that after the XSLT has done its work, using the serialized output?

Peter Hunsberger

 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

