Subject: Re: [xsl] Support for lookaround regexp in XSLT -- any time soon?
From: James Fuller <james.fuller.2007@xxxxxxxxx>
Date: Fri, 1 Mar 2013 18:24:46 +0100
thx for the background info, it's useful and interesting to hear about.

btw http://en.wikipedia.org/wiki/Regular_expression does a good job at identifying regex specs/docs, but I would argue that perl6

https://github.com/perl6/specs/blob/master/S05-regex.pod

does the best job at unambiguously defining ... though this regex is not your grandad's regex.

back to regex in XML land ... to add a datapoint: I think the only oft-repeated shortcoming of regex in XML is the lack of lookahead/lookbehind.

J

On Fri, Mar 1, 2013 at 11:40 AM, Michael Kay <mike@xxxxxxxxxxxx> wrote:
>
>> unsure about the original reason to restrict regex, as it seems to
>> just confuse people when a regex they lovingly crafted elsewhere
>> doesn't work (not that the various java, Perl, etc schisms help).
>
> I don't know the history in full, but I think there were several reasons XSD
> adopted a "minimal" regex subset:
>
> (a) they wanted to be sure it could be widely implemented using existing
> regex engines (i.e. a highest common factor approach)
>
> (b) they wanted to exclude anything that didn't make sense in an
> international Unicode context (so things like word boundaries were
> immediately suspect)
>
> (c) they wanted to make sure that what they included was well specified.
> Finding solid specifications of regex constructs is remarkably difficult;
> there's a culture of very informal specification. Many times when adding
> constructs to the XPath spec, we've had to do empirical tests on existing
> regex engines such as PCRE to see how they actually handle edge cases, and
> very often we find differences between different engines that couldn't be
> guessed from the documentation. For example, there's a sorry history of
> patches to the spec regarding the handling of a newline character appearing
> as the last thing in the input. It's a shame when a feature gets left out
> because we can't decide what it should do in edge cases, but the standards
> process tends to lead to people asking such questions and expecting answers.
>
> Michael Kay
> Saxonica
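
For anyone following the thread who hasn't run into the two things mentioned above (lookahead/lookbehind, and the trailing-newline edge case), here is a minimal sketch. It uses Python's `re` module purely as an illustration of how one engine behaves; it is not the XSD/XPath regex dialect, and the names `lookahead`, `lookbehind` and `trailing_newline_demo` are made up for this example.

```python
import re

# Lookahead: match "foo" only when it is immediately followed by "bar".
# The "bar" itself is not part of the match.
lookahead = re.compile(r"foo(?=bar)")

# Lookbehind: match digits only when they are immediately preceded by "$".
lookbehind = re.compile(r"(?<=\$)\d+")


def trailing_newline_demo(text):
    """Show how two end-of-input anchors in Python's `re` treat `text`.

    In this engine (without re.MULTILINE) `$` matches at the very end of
    the string and also just before a newline that is the last character,
    while `\\Z` matches only at the very end of the string.
    """
    return {
        "dollar": re.search(r"abc$", text) is not None,
        "backslash_Z": re.search(r"abc\Z", text) is not None,
    }


if __name__ == "__main__":
    print(lookahead.search("foobar"))
    print(lookbehind.search("cost: $42"))
    print(trailing_newline_demo("abc"))
    print(trailing_newline_demo("abc\n"))
```

In this one engine, `abc$` accepts the input `"abc\n"` but `abc\Z` does not. Other engines make different choices here, which is the kind of difference that only shows up when you test.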