Subject: RE: Regular expression functions (Was: Re: [xsl] comments on December F&O draft)
From: "Marc Portier" <mpo@xxxxxxxxxxxxxxxx>
Date: Sat, 12 Jan 2002 12:38:39 +0100

Hi Peter,

> >> If a "matcher" explicitly returns a tree structure you could view it as
> >> sending the results to the output document.  Thus, wrapping it in a
> >> variable would allow you to manipulate the results in a natural (XSL at
> >> least) way:
> >>
> >> 	<xsl:variable name="gunk">
> >> 	  	<xsl:apply-templates select-regexp=":fancy-number:\w:fancy-number:" />
> >> 	</xsl:variable>
> >> 	<xsl:value-of select="$gunk[2]/*[2]"/>
> >>
> >> (The template might actually be doing a "match-regexp", but to keep things
> >> concise let's pretend the example is complete :-).
> >>
> >> This seems to minimize the need to invent new stuff? Adding a similar form
> >> of regexp qualifier/attribute to copy and copy-of would seem to handle all
> >> general cases, no?
> >
> > it does indeed, some questions/2nd thoughts
> > in which context would this be usefull?
> > on what inputstring would the regex be matched? the value of . (current
> > node/list/string)? it's serialized version?
>
> I guess one could slice that three ways: 1) match against the value of the
> current node; 2) more usefully: as a selector on the entire sub tree of
> nodes from the current context, by their values;  3) probably a little
> strange to try and figure out: use the regexp to in some way select the
> nodes by applying the regexp to some form of xpath representation.
>
> Complete serialization would seem to make little sense since one wants to be
> able to select nodes out of the mess?  But that's an assumption on my part,
> it seems you want to be able to select substrings....
>

the process was called 'uptransforming' (a term from my OmniMark days):
adding markup to areas where it's missing or badly placed (like loads of
HTML, where the markup clutters what the content is really about)... this is
often easier when regexing across node boundaries (using your lingo:
considering the whole mess (including the tags) at once, rather than only the
submess inside a node :-))
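
to make the 'regexing across node boundaries' idea concrete, here is a minimal
sketch in Python rather than XSLT. The input fragment, the pattern and the
<phone> element are all made up for illustration; the only point is that the
pattern runs over the serialized text, tags included, instead of over the
string value of a single node:

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical input: presentational tags split what is really one phone number.
serialized = "<p>Call <b>555</b>-<b>1234</b> now</p>"

# The pattern deliberately spans tag boundaries, so it sees "the whole mess"
# (text plus tags) rather than the content of one node.
PHONE = re.compile(r"<b>(\d{3})</b>-<b>(\d{4})</b>")


def uptransform(text: str) -> str:
    """Replace the cluttered run with one semantic element (hypothetical <phone>)."""
    return PHONE.sub(r"<phone>\1-\2</phone>", text)


result = uptransform(serialized)

# Parsing the result checks that the rewritten string is still well-formed;
# ET.fromstring raises ParseError otherwise.
root = ET.fromstring(result)
print(result)
print(root.find("phone").text)
```

a per-node match could not find this number, since no single text node holds
all of it.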

on the other hand, I understand this kinda breaks with the more common xslt
feeling (where the incoming stream can't be messier than xml allows :-))
I have to admit the cross-node-boundary thing would more often be seen as a
fancy anything(-that-can-be-called-text)-to-xml _parser_ than as a needed
feature of a _transformer_ ...

still... allowing users to assemble (never mind how) a markup-containing
string that they want to parse into a nodeset to handle in the xslt engine
probably _is_ considered a sensible feature, in which case having a
kind of regex-sheet-driven _parser_ that the xslt process can ask
to return its nodes wouldn't be so different.
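
a rough sketch of what such a regex-sheet-driven parser might look like, again
in Python and purely as an assumption on my part: the rule list, the element
names and the parse_to_nodes helper are hypothetical, not an existing API. The
'sheet' maps element names to patterns, and the parser hands back a small tree
of nodes that a transformer could then process:

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical "regex sheet": (element name, pattern) pairs, tried in order.
# Patterns must not contain capturing groups of their own.
RULES = [
    ("number", r"\d+"),
    ("word", r"[A-Za-z]+"),
]

# Combine the rules into one alternation of named groups.
SHEET = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in RULES))


def parse_to_nodes(text: str) -> ET.Element:
    """Return a tree with one child element per match, named after its rule."""
    root = ET.Element("parsed")
    for match in SHEET.finditer(text):
        # lastgroup is the name of the rule that matched.
        child = ET.SubElement(root, match.lastgroup)
        child.text = match.group()
    # Text that no rule matches (spaces, punctuation) is dropped in this sketch.
    return root


nodes = parse_to_nodes("12 apples, 7 pears")
print(ET.tostring(nodes, encoding="unicode"))
```

the caller gets nodes back instead of strings, which is the part that would
make it feel like any other nodeset inside the xslt process.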

-marc=


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list