Re: [xsl] two regexp related questions

From: Julian Reschke <julian.reschke@xxxxxx>
Date: Thu, 19 May 2011 23:47:24 +0200
On 2011-05-19 23:36, Michael Kay wrote:
On 19/05/2011 18:45, Julian Reschke wrote:
Hi there,

I've got two regexp-related questions.

1) Is it correct that XSLT/XPath2's regular expressions do not support
non-capturing groups (as shown in
<http://www.exampledepot.com/egs/java.util.regex/NoGroup.html>)?

Yes. This is being added in 3.0.
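
The practical consequence can be sketched as follows: in 2.0, a pair of parentheses that is needed only for alternation or repetition still takes up a group number. This is a minimal illustration, not code from the thread; the input string 'a:b', the regex, and the template name "main" are all made up for the example.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical entry point; with Saxon, for example, run with -it:main -->
  <xsl:template name="main">
    <!-- The middle group exists only for the alternation (= or :),
         but in 2.0 it still counts, so the value ends up in group 3. -->
    <xsl:analyze-string select="'a:b'" regex="([a-z]+)(=|:)([a-z]+)">
      <xsl:matching-substring>
        <xsl:value-of select="regex-group(1), regex-group(3)"
                      separator=" -> "/>
      </xsl:matching-substring>
    </xsl:analyze-string>
  </xsl:template>
</xsl:stylesheet>
```

With the 3.0 syntax the middle group could be written as (?:=|:), in which case the value would be regex-group(2).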

2) With respect to analyze-string, and the captured regex-groups:


I'm using a regex like

([A-Z]+) = ([A-Z]+) ( ; ([A-Z]+) = ([A-Z]+) )*

for matching things like

a=b;c=d;e=f


Why not use the regex


([A-Z]+) = ([A-Z]+)

and rely on xsl:matching-substring being called multiple times?
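
A minimal sketch of this suggestion, applied to the simplified example only. The parameter name "input", the template name "main", the output element names, and the flags are assumptions added for illustration: "x" makes the processor ignore the whitespace in the regex as written above, and "i" lets [A-Z] match the lower-case sample.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical input: the sample string from the question -->
  <xsl:param name="input" select="'a=b;c=d;e=f'"/>

  <!-- Hypothetical entry point; with Saxon, for example, run with -it:main -->
  <xsl:template name="main">
    <pairs>
      <!-- The regex describes a single name=value pair.
           flags="ix": "i" ignores case, "x" ignores whitespace in the regex. -->
      <xsl:analyze-string select="$input"
                          regex="([A-Z]+) = ([A-Z]+)" flags="ix">
        <!-- Evaluated once per pair found in the input -->
        <xsl:matching-substring>
          <pair name="{regex-group(1)}" value="{regex-group(2)}"/>
        </xsl:matching-substring>
        <!-- No xsl:non-matching-substring: the semicolons are dropped -->
      </xsl:analyze-string>
    </pairs>
  </xsl:template>
</xsl:stylesheet>
```

For the sample input this should produce one pair element per name=value pair, so the repeated groups of the long regex are never needed.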

My bad. I simplified the example too much; the first segment has a (slightly) different syntax.


Alternatively, first use tokenize() to split on the semicolons, and then
put each token through xsl:analyze-string.

See above; the separator character can occur in quoted values; I simplified too much.


Personally, I find long regexes very difficult to get right, and try to
split the task up into small pieces that I can understand and debug.

Indeed.


I'm now assembling them piece by piece from smaller sub-expressions, each of which matches a part of the grammar I'm trying to parse.
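
One way such an assembly might look in XSLT 2.0, as a sketch under stated assumptions. The actual grammar is not given in the thread, so the variable names (token, quoted, value, pair), the quoting rule (double quotes, no escapes), and the input string are all hypothetical. The only mechanism relied on is that the regex attribute of xsl:analyze-string is an attribute value template and can therefore take a computed string.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical input: one value is quoted and contains the separator -->
  <xsl:param name="input" select="'a=b;c=&quot;d;e&quot;;f=g'"/>

  <!-- Small sub-expressions, each covering one part of the assumed grammar -->
  <xsl:variable name="token"  select="'[A-Za-z]+'"/>
  <xsl:variable name="quoted" select="'&quot;[^&quot;]*&quot;'"/>
  <!-- A value is either a bare token or a quoted string -->
  <xsl:variable name="value"
                select="concat('(', $token, '|', $quoted, ')')"/>
  <!-- A pair is name=value; group 1 is the name, group 2 the value -->
  <xsl:variable name="pair"
                select="concat('(', $token, ')=', $value)"/>

  <!-- Hypothetical entry point; with Saxon, for example, run with -it:main -->
  <xsl:template name="main">
    <pairs>
      <xsl:analyze-string select="$input" regex="{$pair}">
        <xsl:matching-substring>
          <pair name="{regex-group(1)}" value="{regex-group(2)}"/>
        </xsl:matching-substring>
      </xsl:analyze-string>
    </pairs>
  </xsl:template>
</xsl:stylesheet>
```

Each variable can be tested on its own with matches() before it is combined with the others, which keeps the debugging close to the small-pieces approach described above. Because the quoted alternative consumes everything up to the closing quote, a semicolon inside a quoted value is not treated as a separator.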

Thanks for the feedback, Julian
