Subject: RE: [xsl] lookaheads in XSLT2 regexes
From: Liam R E Quin <liam@xxxxxx>
Date: Wed, 03 Mar 2010 16:09:38 -0500

On Tue, 2010-03-02 at 09:21 +0000, Michael Kay wrote:
> I would imagine there would also be raised eyebrows about including "_" in
> the set of "word" characters. That's something that only happens in geekdom.
> But in the past the principle has been "if Perl defines it well, do what
> Perl does, otherwise leave it out completely." In my view we've already
> copied too many of Perl's mistakes, like the strange rules on recognizing
> whether \12 is a back-reference to group 12 or a back-reference to group 1
> followed by a digit 2.

I don't remember what first introduced back-references beyond 9;
it might have been sed.

More recently Perl provides named capture buffers, instead of having
to use numbers, and also \g to get the back references --
    \g{12}
    \g{-1} # the last buffer
and with
    (?<sock> ....pattern.... ) ..... \g{sock}

.net and perl regexps are incompatible in what happens if you mix
the (...) and \1 with named buffers -- Perl counts both named and
unnamed buffers, and .net only counts unnamed ones.

On the subject of \b I'll note we do have \W and \w -- Perl at least
defines \b as a boundary between \W and \w. It _is_ crazy that \b in
a character class represents backspace.

Perl also has \B to match at a non-word boundary -- between \w and \w
or between \W and \W.

Historically, the Unix vi editor used (uses) \< for matching \W\w
(i.e. the start of a "word") and \> for the end, \w\W, which always
seemed a little clearer to me, but for use with XML we need to stay
away from assigning meaning to < and > I think :-)

Liam

-- 
Liam Quin - XML Activity Lead, W3C, http://www.w3.org/People/Quin/
Pictures from old books: http://fromoldbooks.org/
Ankh: irc.sorcery.net irc.gnome.org www.advogato.org
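
For readers who want to try the \b, \B and group-numbering points from the message above, here is a minimal sketch using Python's re module. This is an assumption of convenience, not something the message uses: Python is neither Perl nor the XSLT 2.0 regex dialect, and it spells named groups (?P<sock>...) and (?P=sock) rather than (?<sock>...) and \g{sock}. The variable names and test strings are made up for illustration.

```python
import re

# \b sits between a \w and a \W character (or at a string edge).
# "_" is a word character, so "foo_bar" is a single word.
words = re.findall(r"\b\w+\b", "foo_bar baz-qux")

# \B is the opposite: a position between \w and \w, or \W and \W.
# Only the "cat" inside "concat" qualifies.
inner = re.search(r"\Bcat", "cat concat")

# Inside a character class, \b means the backspace character (U+0008).
backspaces = re.findall(r"[\b]", "a\x08b")

# Python numbers named and unnamed groups together, so the named
# group "sock" is also group 1 and the unnamed group is group 2.
mixed = re.fullmatch(r"(?P<sock>a)(b)\2\1", "abba")

# A named back-reference avoids counting groups at all.
named = re.fullmatch(r"(?P<sock>\w+) (?P=sock)", "red red")

# "Back-reference to group 1 followed by a digit 2", written so that
# it cannot be read as a reference to group 12.
digit = re.fullmatch(r"(a)(?:\1)2", "aa2")

print(words)
print(inner.start())
print(backspaces)
print(mixed.group("sock"), mixed.group(1), mixed.group(2))
print(named.group("sock"))
print(digit.group(0))
```

Wrapping the back-reference in a non-capturing group, as in the last pattern, is one way to sidestep the \12 ambiguity in engines that lack a \g{1}-style form.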