Re: Regular expression functions (Was: Re: [xsl] comments on December F&O draft)

From: Jeni Tennison <jeni@xxxxxxxxxxxxxxxx>
Date: Fri, 11 Jan 2002 14:59:44 +0000

Hi Mike,

>> If the latter, then perhaps documentless nodes are a blessing ;)
>
> Not if it means they have to be copied by physical cloning!

Sure :) I was thinking of documentless nodes that were never destined
to belong to a document - the kind of intermediate 'trees' that you
create for convenience en route to the final result tree. I guess it
doesn't make much of a difference if you have quite a complex tree
anyway, but if you're just generating 100 element nodes each with a
single text node, then it'd probably be nicer to keep them as a
sequence rather than adding them to a document node?

>> If the former, then it's a good argument for nested sequences, so
>> you don't have to create nodes to provide structure.
>
> Yes, there are some good arguments for nested sequences. But let's
> not go there, we want to get this thing finished.

While I empathise with the sentiment, I think it's a pretty dangerous
one. It will be very hard to upgrade to an XPath in the future that
*does* have nested sequences if XPath as it's currently designed goes
ahead. So there ought to be some pretty compelling reasons why it's
designed like that... and so far we haven't heard any.

> In the case of the regular expression functionality you are trying
> to define, I've been trying to follow the arguments but haven't
> reached any particular views on what the right answer is. I don't
> have much personal experience of languages that use regexps heavily,
> which doesn't help. It might be that a solution similar to
> xsl:for-each-group is needed. This was constrained by the fact that
> we couldn't model a set of groups directly in the data model, so
> instead we defined an instruction to iterate over the set of groups,
> presenting one group at a time to the application, and making that
> group available through the magic function current-group(). I sort
> of feel an xsl:for-each-string-match might work similarly, but I
> can't articulate the details yet. Keep working at it, guys.

Yes, that's what I was kinda thinking - current-match() (or a bunch of
similar functions like the ones from emacs that David posted
descriptions of), working in a similar way to current-group() to
return the results of the match. I suspect that it won't have any of
this fanciful stuff about nested subexpressions, for similarity with
other languages (but I still say break the mold!).
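
To make the current-match() idea a bit more concrete, here is a rough
sketch in Python rather than XSLT, since no XSLT syntax has been
agreed yet. The names for_each_string_match and mark_up_dates are
purely hypothetical, and the date pattern is only an example. The
iterator presents one piece of the string at a time, the way
xsl:for-each-group presents one group at a time, and the loop
variable plays the role of current-match(). It returns flat matches
only, with nothing about nested subexpressions.

```python
import re


def for_each_string_match(pattern, text):
    """Yield (is_match, current) pairs covering the whole of text, in order.

    Hypothetical analogue of an xsl:for-each-string-match instruction:
    'current' stands in for what current-match() would return.
    """
    pos = 0
    for m in re.finditer(pattern, text):
        if m.start() > pos:
            # Text between two matches is presented as a non-matching piece.
            yield False, text[pos:m.start()]
        yield True, m.group(0)
        pos = m.end()
    if pos < len(text):
        # Trailing text after the last match.
        yield False, text[pos:]


def mark_up_dates(text):
    """Wrap every ISO-style date in a <date> element, copy the rest as-is."""
    out = []
    for is_match, current in for_each_string_match(r"\d{4}-\d{2}-\d{2}", text):
        out.append(("<date>" + current + "</date>") if is_match else current)
    return "".join(out)
```

Calling mark_up_dates("Sent 2002-01-11.") wraps the date in a <date>
element and copies the surrounding text through unchanged. Nothing is
ever collected into a set of matches: each one is handed to the body
of the loop in turn, which is the same trick that lets
xsl:for-each-group avoid modelling a set of groups in the data model.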

What I don't think we've thoroughly discussed yet is the idea of
regexp matching templates (as David first suggested) vs. regexp
matching instructions (which you need, I think, to cover the whole
spectrum of requirements). Hopefully David's coming up with some kind
of proposal that summarises it all ;)

Cheers,

Jeni

---
Jeni Tennison
http://www.jenitennison.com/