Re: [xsl] Replace the portion of text that matches pattern: XPath versus SNOBOL

Subject: Re: [xsl] Replace the portion of text that matches pattern: XPath versus SNOBOL
From: "Roger L Costello costello@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Sun, 23 Mar 2025 10:23:50 -0000
Thank you, Martin and Liam, for your excellent feedback.

I updated my writeup to incorporate Martin and Liam's feedback. See below. If
you see any errors in the writeup, please let me know.
--------------------------------------------------------
Sometimes you want to search for occurrences of a pattern in text and replace
those substrings that match the pattern. For example, suppose WORD holds the
text "BAT" and we want to search for the pattern 'A' and replace it with 'E'.
In XPath, use the replace() function. The following XPath looks for the pattern
'A' in WORD and replaces the matched 'A' with 'E':

replace($WORD, 'A', 'E')  <-- returns "BET"

Important point: WORD is unchanged. Its value is still "BAT".

Here's how the replacement is done in SNOBOL:

WORD 'A' = 'E'

Important point: WORD is changed. The value of WORD is now "BET".

Suppose the text contains more than one substring that matches the pattern,
and we want to replace all matching occurrences. For example, suppose WORD
holds the text "BALANCE" and we want to search for the pattern 'A' and
replace each match with 'E'. By default, the XPath replace() function replaces
all occurrences:

replace($WORD, 'A', 'E')  <-- returns "BELENCE"

To repeat, WORD is unchanged. The value of WORD is still "BALANCE".
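
For readers without an XPath processor at hand, the "replace all, leave the
input alone" behavior can be sketched in Python. This is only an analogy:
Python's re.sub stands in for replace(), and Python's regex dialect is not
identical to XPath's. The variable names are illustrative.

```python
import re

word = "BALANCE"

# Like XPath's replace(), re.sub returns a new string and replaces
# every match by default; the input string is not modified.
result = re.sub("A", "E", word)

print(result)  # BELENCE
print(word)    # BALANCE
```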

SNOBOL replaces only the first occurrence. To replace all occurrences, use a
loop:

LOOP WORD 'A' = 'E'			:S(LOOP)

The replacement statement is labelled with LOOP. The part on the right side
means, "If the replacement succeeds (S means Succeeds), then goto LOOP."
In other words, repeatedly perform the replacement until there are no more
matches.

To repeat, WORD is changed. Now the value of WORD is "BELENCE".
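
The control flow of the SNOBOL loop can be imitated in Python as a rough
sketch. The helper replace_first is hypothetical (it is not a SNOBOL or
Python built-in); it replaces the first literal occurrence and reports
success or failure, and the while loop plays the role of :S(LOOP). Note that
such a loop ends only because the replacement 'E' does not itself contain the
pattern 'A'.

```python
def replace_first(text, pattern, replacement):
    """Replace the first literal occurrence of pattern.

    Returns (new_text, succeeded), loosely mirroring a SNOBOL
    replacement statement that either succeeds or fails.
    """
    index = text.find(pattern)
    if index < 0:
        return text, False
    return text[:index] + replacement + text[index + len(pattern):], True


word = "BALANCE"
succeeded = True
while succeeded:  # keep going while the replacement succeeds
    word, succeeded = replace_first(word, "A", "E")

print(word)  # BELENCE
```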

Suppose you want XPath to replace only the first match, i.e., we want
"BELANCE". There is a clever way to configure the arguments of the XPath
replace() function to "replace only the first match." Here's how: Recall
that the replace() function has 3 arguments:

replace( WORD, pattern, replacement )

Instead of using 'A' for the second argument, make the second argument this:
everything preceding the first 'A', plus 'A'. For example, if the value of WORD
is "BALANCE", then the second argument is "BA".
However, that needs one tweak: put parentheses around the string that precedes
the first 'A'. Thus, if the value of WORD is "BALANCE", then the second
argument is "(B)A". The reason for the parentheses is that we can then
reference their content using $1. The third argument is the replacement
string. We want to replace, in WORD, the string preceding the first 'A' (which
$1 denotes) plus 'A' with $1 and 'E', i.e., we want to replace $1 'A' with
$1 'E', or for our example, replace "BA" with "BE". Phew, that is complicated.
In general we do not know in advance what precedes the first 'A', so the
pattern uses '^.*?' (the shortest run of characters from the start of the
text) in its place. Here's the code:

replace($WORD, concat('(^.*?)', 'A'), concat('$1','E'))
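
The same trick can be tried out in Python as a sketch, assuming single-line
text (in Python, as in XPath, '.' does not match a newline by default).
Python writes the back-reference as \1 where XPath writes $1; everything else
about the pattern is the same. Because '^' can match only at the start of the
text, the pattern matches at most once, which is what limits the replacement
to the first 'A'.

```python
import re

word = "BALANCE"

# (^.*?) lazily captures the shortest prefix before the first 'A';
# the replacement puts that prefix back (\1) followed by 'E'.
first_only = re.sub(r"(^.*?)A", r"\1E", word)

print(first_only)  # BELANCE
print(word)        # BALANCE
```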

Recap:
-	In XPath, use replace( WORD, pattern, replacement )
-	In SNOBOL, use WORD pattern = replacement
-	The XPath replace() function does not change WORD
-	In SNOBOL, the value of WORD is changed
-	The result of evaluating the XPath replace() function is a string that is
equal to the value of WORD except occurrences of substrings that match pattern
have been replaced with replacement
-	The result of evaluating the SNOBOL statement is an indication of whether
pattern succeeded or failed to match a substring of WORD
-	In XPath, by default the replace() function replaces all occurrences of
pattern with replacement. However, there is a clever way to configure the
arguments of the replace() function so that only the first occurrence of
pattern is replaced with replacement
-	In SNOBOL only the first occurrence is replaced. Use a loop to replace all
occurrences

Lesson learned: When designing a new programming language, consider the
options for pattern-matching replacement; some interesting options are
illustrated by comparing XPath and SNOBOL.
