Re: [xsl] How to escape the normal interpretation of parentheses when utilizing regex-group()?

Subject: Re: [xsl] How to escape the normal interpretation of parentheses when utilizing regex-group()?
From: "Bauman, Syd s.bauman@xxxxxxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Wed, 24 May 2023 13:05:08 -0000
DC>  something like
DC> ^\s*([0-9]+(\s+thru\s+[0-9]+)?)\s*(.*)$

I can report that not only is something like this what I would do, it is what
I have done, although my case was mildly different (e.g., the \s+thru\s+ was
just an en dash). But note that (at least in my case where the 2nd desired
output field was always present) the whitespace separator immediately before
the last group should probably be required, not optional. (Given that the
match in front of it is only digits, it is not really going to matter. But
it makes it clearer what the regex is doing, IMHO, especially when either the
3rd group might start with digits or the 1st group might have roman numerals
or some other stuff that looks more like the 3rd group.) It may also be useful
to trim off trailing whitespace. Thus
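  ^\s*([0-9]+(\s+thru\s+[0-9]+)?)\s+(.+?)\s*$
(the 3rd group is reluctant, .+? rather than .+, because a greedy .+ would
swallow the trailing whitespace itself and leave nothing for the final \s*)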
and pick off the 1st group for <column> and the 3rd group for <field-name>.
(Although in truth, IIRC I did not bother with the trailing-whitespace trim
\s* at the end, and passed the matched fields through normalize-space()
instead; I think I did not entirely trust the 3rd group to be consistent
about whitespace. But that's not really germane to the overall methodology of
nesting group 2 in group 1, and parsing off groups 1 & 3.)
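
For the archive, here is a minimal sketch of how that might look in XSLT 2.0,
using the normalize-space() variant rather than the trailing \s*. The input
element name <line> and the wrapper <row> are made up for illustration;
only <column> and <field-name> come from this thread.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical input: <line>12 thru 15  Customer Name </line> -->
  <xsl:template match="line">
    <!-- Group 2 (the optional "thru N" part) is nested inside group 1,
         so the field name lands in group 3, not group 2. -->
    <xsl:analyze-string select="."
        regex="^\s*([0-9]+(\s+thru\s+[0-9]+)?)\s+(.+)$">
      <xsl:matching-substring>
        <row>
          <column>
            <xsl:value-of select="normalize-space(regex-group(1))"/>
          </column>
          <field-name>
            <xsl:value-of select="normalize-space(regex-group(3))"/>
          </field-name>
        </row>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <!-- Lines that do not fit the pattern are passed through as-is. -->
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>

</xsl:stylesheet>
```

With the sample line in the comment, group 1 would be "12 thru 15" and group 3
"Customer Name " before normalize-space() tidies it up. Group 2 (" thru 15")
is simply never asked for.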
