Subject: Re: [xsl] Re: A question about the expressive power and limitations of XPath 2.0
From: David Carlisle <davidc@xxxxxxxxx>
Date: Sun, 13 Jan 2002 12:54:29 GMT
> so you can't check that the name in the end tag is the same as the
> name in the start tag)

Examples:

Jeni,

If you mean here that you can't tell which start tag corresponds to which end tag, that is not a deficiency in the currently specified regular expression syntax; it's a statement that the language you are trying to accept is not regular.

I think that there are three separate problems that might be addressed:

1) Deficiencies in the regular expression syntax/semantics. This may or may not include the lack of ^ and $ to match the start and end of the expression, or Perl-style {2} repeat clauses. (Mainly it's hard to know what's there now, as the text is a bit underspecified, hence my "overlapping regexp" question.)

2) Possibilities for doing tree generation as well as string generation once the match is found. (Note this is purely an XSLT construction issue: it doesn't affect the languages you accept, only what you can do with them.) This is where I came in with the regexp matching template mechanism, and you've extended it in various ways with named subexpression possibilities.

3) Possibilities for accepting non-regular languages in input strings. Three examples have been given so far in this thread: nested {} pairs, HTML nested element tag syntax, and the classic non-regular example of a string consisting of a and b with as many a as b. Here you've suggested moving away from regular expressions to languages specified explicitly by giving grammars, in the style of lex. I'd hoped (but haven't been able to cleanly spec so far) to stay with just adding regexp functionality even in this case, as that is (often) enough to tokenise the input string, and to control the extra state information required to parse the tokens using existing XSLT control constructs.
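To make point 3 concrete, here is a minimal sketch of the "regexp to tokenise, ordinary control constructs for the state" idea. It is written in Python rather than XSLT purely for illustration, and it is not a proposal for any actual XSLT or XPath syntax. The names `TAG` and `tags_balanced` are made up for this example, and the tag pattern is a deliberate simplification that assumes no empty-element tags, comments, or CDATA sections.

```python
import re

# A single regular expression is enough to tokenise start and end tags.
# Group 1 is "/" for an end tag (empty for a start tag); group 2 is the name.
# Simplifying assumption: no empty-element tags, comments or CDATA sections.
TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)[^<>]*>")


def tags_balanced(text):
    """Return True if every end tag names the most recently opened start tag."""
    stack = []  # the extra state that the regular expression cannot carry
    for match in TAG.finditer(text):
        closing, name = match.group(1), match.group(2)
        if not closing:
            stack.append(name)  # start tag: remember it
        elif not stack or stack.pop() != name:
            return False  # end tag with no matching start tag
    return not stack  # any start tag left open means the input is unbalanced
```

The regular expression only recognises individual tags; matching each end tag to its start tag at arbitrary nesting depth is done by the stack, which is the part that no regular expression can supply.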
Like Dimitre, it's around a decade ago that I last thought about this stuff for real, and the precise definitions of all the various classes of language can get very technical (and I can't remember them :-). But the differences, especially between regular and non-regular languages, can be important, as they make precise which kinds of language can be accepted by each system, and which languages cannot be accepted just by tinkering with the control syntax and require a different parsing technique altogether.

> Creating a regular expression that matches start and end tags in
> content (

So here, for example, you can create regexps that match start tags and end tags separately. You can even create a regexp that will match a start tag to its matching end tag so long as there are no more than 50 nested subelements, but you can't do the general case with _any_ syntax for regular expressions ('cause if you extend the syntax enough to do this it isn't regular any more).

David

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list