Subject: Re: [xsl] Special characters in regex expression
From: "Wolfgang Laun wolfgang.laun@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Thu, 24 Jul 2014 04:46:53 -0000

On 23/07/2014, Michael Dykman mdykman@xxxxxxxxx
<xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
> It is my understanding that Java's regular expression builtin emulates
> 'pcre' pretty closely.

Perl 5 has, over time, added some rather unique features that aren't
available in Java. The XPath regex dialect is, for the most part, a
subset of Java's.

>
> To escape special characters that have special meaning in a regular
> expression, defining it as a character class (using the square bracket
> notation) generally works
>
> i.e. if you want to match a question mark at the beginning of a line,
> use: "^[?].*$"

Thus,

   regex="(\.|\!|\?)(?!\)|\.|\d|\w)"

(ignoring the lack of look-ahead) would be better rewritten as

   regex="[.!?](?![).\d\w])"   <!-- not valid -->

It is possible to select groups within the matching substring:

   regex="([.!?])([^).\d\w])"

Thus, in this simple case it is possible to use regex-group(1) and
regex-group(2) to get the two characters individually, and insert
nodes as required.

I am not sure what Gabor expects to happen with, e.g., "...??..." or
"...!!...", which are matched by this regex.

-W

>
> On Wed, Jul 23, 2014 at 3:55 PM, mike@xxxxxxxxxxxx
> <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>> Exclamation mark is not a special character in XPath regular expressions,
>> and therefore does not need to be (and must not be) escaped.
>>
>> Negative lookaheads are not supported in the XPath regular expression
>> dialect.
>>
>> You can't assume that all regular expression dialects are the same.
>>
>> Michael Kay
>>
>> Saxonica
>>
>>> Dear All,
>>>
>>> I am using xsl:analyze-string to retrieve and replace punctuation,
>>> however, I got the following error:
>>>
>>> Error in regular expression: net.sf.saxon.trans.XPathException: Syntax
>>> error at char 6 in regular expression: Escape character '!' not allowed.
>>>
>>> How should I escape and match '?' and '!' ? I am also using a negative
>>> look-ahead, why isn't that working?
>>>
>>> Here is a sample from my code, thanks,
>>>
>>> Gabor
>>>
>>>
>>> <xsl:template match="//TEI:p//text()[ not
>>> ((parent::TEI:note)|(parent::TEI:hi)|(parent::TEI:date))]">
>>> <xsl:analyze-string select="." regex="(\.|\!|\?)(?!\)|\.|\d|\w)">
>>>
>>> <xsl:matching-substring>
>>>
>>> <xsl:element name="seg"
>>> namespace="http://www.tei-c.org/ns/1.0"><xsl:value-of
>>> select="."/></xsl:element>
>>> </xsl:matching-substring>
>>> <xsl:non-matching-substring>
>>> <xsl:value-of select="."/>
>>> </xsl:non-matching-substring>
>>> </xsl:analyze-string>
>>>
>
>
> --
> - michael dykman
> - mdykman@xxxxxxxxx
>
> May the Source be with you.
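
For illustration, here is a minimal sketch of how the group-based regex
suggested in the reply might be dropped into Gabor's template. It assumes
XSLT 2.0, that the TEI prefix is bound to http://www.tei-c.org/ns/1.0, and
that the intent is to wrap only the punctuation mark in <seg> while passing
the following character through unchanged. The identity template is an
addition for completeness and is not part of the original code.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:TEI="http://www.tei-c.org/ns/1.0">

  <!-- Assumption: everything not handled below is copied unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="TEI:p//text()[not(parent::TEI:note
                                         | parent::TEI:hi
                                         | parent::TEI:date)]">
    <!-- Group 1: the punctuation mark. Group 2: one following character
         that is not ')', '.', a digit, or a word character. No '!' or
         '?' escapes and no look-ahead are used. -->
    <xsl:analyze-string select="." regex="([.!?])([^).\d\w])">
      <xsl:matching-substring>
        <!-- Wrap only the punctuation mark in seg. -->
        <xsl:element name="seg" namespace="http://www.tei-c.org/ns/1.0">
          <xsl:value-of select="regex-group(1)"/>
        </xsl:element>
        <!-- Emit the consumed following character as plain text. -->
        <xsl:value-of select="regex-group(2)"/>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>

</xsl:stylesheet>
```

Unlike a look-ahead, the second group consumes the character it tests, so
a punctuation mark that is the very last character of a text node has
nothing to match against and is left unwrapped. In a run such as "??" the
second mark is consumed as group 2 of the first match and is also not
wrapped, which is the ambiguity raised in the reply.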