RE: Regular expression functions (Was: Re: [xsl] comments on December F&O draft)

Subject: RE: Regular expression functions (Was: Re: [xsl] comments on December F&O draft)
From: "Marc Portier" <mpo@xxxxxxxxxxxxxxxx>
Date: Thu, 10 Jan 2002 02:43:45 +0100

> -----Original Message-----
> From: owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> [mailto:owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx]On Behalf Of
> naha@xxxxxxxxxx
> Sent: maandag 7 januari 2002 22:05
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx; David Carlisle
> Cc: www-xml-query-comments@xxxxxx; Jeni Tennison
> Subject: Re: Regular expression functions (Was: Re: [xsl] comments on
> December F&O draft)
>
>
> Quoting David Carlisle <davidc@xxxxxxxxx>:
> [...]
> > Looking at why one needs regexp in an XML query language, it is usually
> > to infer structure into otherwise unstructured (by XML) input.
>
> In general regular expressions are not sufficiently powerful for this,
> being one level too low in the Chomsky hierarchy.
>
> I'm not saying that regular expressions aren't useful in many cases,
> merely that they are not sufficiently powerful to describe how to
> transform (i.e. parse) a piece of text into a tree.
>

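A small aside to make the quoted point concrete. This is only a sketch in
Python, and the names TWO_LEVELS and parse are made up for the illustration:
a regex can be written for any fixed nesting depth, but arbitrary nesting
needs something that can recurse (or count).

```python
import re

# Hypothetical example: this regex accepts brackets nested at most two deep.
# Every extra level of nesting would need the pattern to be written out again.
TWO_LEVELS = re.compile(r"\((?:[^()]|\([^()]*\))*\)")


def parse(text):
    """Recursive-descent parse of nested brackets into nested lists."""
    if not text.startswith("("):
        raise ValueError("input must start with '('")

    def group(pos):
        # text[pos] is '('; returns (children, index just past the matching ')')
        children = []
        pos += 1
        while pos < len(text) and text[pos] != ")":
            if text[pos] == "(":
                child, pos = group(pos)  # recursion handles any depth
                children.append(child)
            else:
                children.append(text[pos])
                pos += 1
        if pos >= len(text):
            raise ValueError("unbalanced input")
        return children, pos + 1

    tree, end = group(0)
    if end != len(text):
        raise ValueError("trailing input after the closing bracket")
    return tree
```

With this, TWO_LEVELS.fullmatch("(a(b)c)") matches, TWO_LEVELS.fullmatch on
"(a(b(c)))" does not, and parse("(a(b(c)))") gives ['a', ['b', ['c']]].
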
(While backtracking through this thread... it's, uh, not easy :-) )
I guess our regexslt (see other postings in this thread) succeeds (well, it
works :-)) by cheating, then:

We place our regexes in a tree first. By placing the regexes in a tree, I
have the feeling we avoid the problem of letting the regex construct the
tree: the nesting comes from the tree of regexes, not from any single regex.
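
To show what I mean by that, here is a minimal sketch in Python. It is not
the actual regexslt code; Rule and apply_rule are hypothetical names, and I
am assuming the simplest possible scheme, where each rule wraps its matches
in an element and hands the matched text down to its child rules.

```python
import re
import xml.etree.ElementTree as ET


class Rule:
    """One node in the tree of regexes (hypothetical, for illustration)."""

    def __init__(self, name, pattern, children=()):
        self.name = name                  # element name to emit
        self.regex = re.compile(pattern)  # what this node recognises
        self.children = list(children)    # rules applied inside each match


def apply_rule(rule, text, parent):
    """Add one <rule.name> child to parent for every match found in text.

    Text that no rule matches is simply dropped in this sketch.
    """
    for match in rule.regex.finditer(text):
        elem = ET.SubElement(parent, rule.name)
        if rule.children:
            for child in rule.children:
                apply_rule(child, match.group(0), elem)
        else:
            elem.text = match.group(0)


# The rule tree supplies the nesting; each regex stays flat.
DATE = Rule("date", r"\d{4}-\d{2}-\d{2}", [
    Rule("year", r"^\d{4}"),
    Rule("month", r"(?<=-)\d{2}(?=-)"),
    Rule("day", r"\d{2}$"),
])

root = ET.Element("text")
apply_rule(DATE, "Sent 2002-01-07, answered 2002-01-10.", root)
result = ET.tostring(root, encoding="unicode")
```

Here result holds two <date> elements, each with <year>, <month> and <day>
children, although none of the four regexes knows anything about trees.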


One more remark...
My first idea for attacking the
augment-text-with-markup-in-some-useful-way challenge was to look into
parser generators. (The down-to-earth thought I had, in fact, was to write
some extension or pre-processor to javacc, like jtb; the generated parser
would then e.g. talk to a SAX ContentHandler.)
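
Roughly the shape I have in mind, sketched in Python rather than Java and
with a hand-written parser standing in for whatever javacc would generate.
The function name parse_to_sax, the toy bracket syntax and the element names
"text" and "group" are all assumptions made up for the example; the
ContentHandler calls are the standard SAX ones.

```python
import io
from xml.sax.saxutils import XMLGenerator
from xml.sax.xmlreader import AttributesImpl


def parse_to_sax(text, handler):
    """Parse bracketed text and report it to a SAX ContentHandler.

    Every "(...)" becomes a <group> element; everything else is reported
    as character data. Unbalanced brackets raise ValueError.
    """
    no_attrs = AttributesImpl({})
    depth = 0
    handler.startDocument()
    handler.startElement("text", no_attrs)
    for ch in text:
        if ch == "(":
            handler.startElement("group", no_attrs)
            depth += 1
        elif ch == ")":
            if depth == 0:
                raise ValueError("unexpected ')'")
            handler.endElement("group")
            depth -= 1
        else:
            handler.characters(ch)
    if depth != 0:
        raise ValueError("missing ')'")
    handler.endElement("text")
    handler.endDocument()


# XMLGenerator is a ContentHandler that serialises the events back to XML.
out = io.StringIO()
parse_to_sax("a(b(c))d", XMLGenerator(out, encoding="utf-8"))
xml_text = out.getvalue()
```

Any other ContentHandler (a tree builder, a transformer input) could be
passed in place of XMLGenerator, which is what makes the idea attractive.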

Apart from the fact that it looked like so much work that I got tired
before even starting on it (due to lack of experience with javacc and the
like), I guess the end result would also have been more of a burden on
end users, no?  Is it just me, or is it safe to guess that in practice
regexes are more common and widespread than writing formal grammars?

In any case, the javacc idea is not dead. Maybe someone with more
experience in that area could pick it up and nail down why it would never
work? My feeling, however, is that it would be just the right thing to build up
trees from


> [...]
>
>  XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
>


