Subject: Re: [xsl] Replace the portion of text that matches pattern: XPath versus SNOBOL
From: "Roger L Costello costello@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Sun, 23 Mar 2025 10:23:50 -0000

Thank you, Martin and Liam, for your excellent feedback. I updated my writeup to incorporate Martin and Liam's feedback. See below. If you see any errors in the writeup, please let me know.

--------------------------------------------------------

Sometimes you want to search for occurrences of a pattern in text and replace the substrings that match the pattern. For example, suppose WORD holds the text "BAT" and we want to search for the pattern 'A' and replace it with 'E'.

In XPath, use the replace() function. The following XPath looks for the pattern 'A' in WORD and replaces the matched 'A' with 'E':

    replace($WORD, 'A', 'E')    <-- returns "BET"

Important point: WORD is unchanged. Its value is still "BAT".

Here's how the replacement is done in SNOBOL:

    WORD 'A' = 'E'

Important point: WORD is changed. The value of WORD is now "BET".

Suppose the text contains more than one substring that matches the pattern, and we want to replace all matching occurrences. For example, suppose WORD holds the text "BALANCE" and we want to search for the pattern 'A' and replace each match with 'E'.

By default, the XPath replace() function replaces all occurrences:

    replace($WORD, 'A', 'E')    <-- returns "BELENCE"

To repeat, WORD is unchanged. The value of WORD is still "BALANCE".

SNOBOL only replaces the first occurrence. To replace all occurrences, use a loop:

    LOOP WORD 'A' = 'E' :S(LOOP)

The replacement statement is labelled LOOP. The part on the right-hand side means, "If the replacement succeeds (S means Succeeds), then goto LOOP." In other words, repeatedly perform the replacement until there are no more matches.

To repeat, WORD is changed. Now the value of WORD is "BELENCE".

Suppose you want XPath to replace only the first match, i.e., we want "BELANCE". There is a clever way to configure the arguments of the XPath replace() function to "replace only the first match." Here's how:

Recall that the replace() function takes 3 arguments (plus an optional flags argument, not used here):

    replace( WORD, pattern, replacement )

Instead of using 'A' for the second argument, make the second argument match this: everything preceding the first 'A', plus the 'A' itself. For example, if the value of WORD is "BALANCE", then the second argument matches "BA". However, that needs one tweak: put parentheses around the part that matches the string preceding the first 'A'. Thus, if the value of WORD is "BALANCE", the second argument effectively matches "(B)A". The reason for the parentheses is that we can reference the captured content using $1.

The general form of the pattern is '(^.*?)A'. The ^ anchors the match at the start of the string, and .*? is a reluctant quantifier that matches as few characters as possible, so the match ends at the first 'A'. Because the pattern is anchored at the start of the string, it can match at most once, which is why only the first 'A' is replaced.

The third argument is the replacement string. We want to replace, in WORD, the string preceding the first 'A' (which $1 denotes) plus 'A' with $1 and 'E', i.e., we want to replace $1 'A' with $1 'E', or for our example, replace "BA" with "BE".

Phew, that is complicated. Here's the code:

    replace($WORD, concat('(^.*?)', 'A'), concat('$1', 'E'))    <-- returns "BELANCE"

Recap:

- In XPath, use replace( WORD, pattern, replacement )
- In SNOBOL, use WORD pattern = replacement
- The XPath replace() function does not change WORD
- In SNOBOL, the value of WORD is changed
- The result of evaluating the XPath replace() function is a string that is equal to the value of WORD except that occurrences of substrings that match pattern have been replaced with replacement
- The result of evaluating the SNOBOL statement is an indication of whether pattern succeeded or failed to match a substring of WORD
- In XPath, by default the replace() function replaces all occurrences of pattern with replacement. However, there is a clever way to configure the arguments of the replace() function so that only the first occurrence of pattern is replaced with replacement
- In SNOBOL, only the first occurrence is replaced. Use a loop to replace all occurrences

Lesson Learned: When designing a new programming language, consider the options for pattern matching replacement; some interesting options are illustrated by comparing XPath and SNOBOL.
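
For readers who want to try the behaviours compared above without an XPath or SNOBOL processor at hand, here is a rough Python sketch of the same ideas. Python is not part of the original comparison, and the function names (replace_all, replace_first_anchored, snobol_style_loop) are made up for illustration. Python's re module writes the group reference as \g<1> where XPath writes $1, and its regex dialect differs from XPath's in other details, so treat this as an analogy rather than an exact equivalent.

```python
import re


def replace_all(word: str, pattern: str, replacement: str) -> str:
    """Analogue of XPath replace(): returns a new string with every match replaced."""
    return re.sub(pattern, replacement, word)


def replace_first_anchored(word: str, pattern: str, replacement: str) -> str:
    """Analogue of the '(^.*?)A' trick: capture the shortest prefix from the
    start of the string up to the first match, then put that prefix back in
    front of the replacement. The ^ anchor lets the pattern match only once."""
    return re.sub('(^.*?)' + pattern, r'\g<1>' + replacement, word)


def snobol_style_loop(word: str, pattern: str, replacement: str) -> str:
    """Analogue of  LOOP WORD 'A' = 'E' :S(LOOP)  -- replace one match at a
    time and repeat for as long as a replacement succeeds."""
    while True:
        word, count = re.subn(pattern, replacement, word, count=1)
        if count == 0:  # the match failed, so fall out of the loop
            return word


WORD = "BALANCE"
print(replace_all(WORD, 'A', 'E'))             # all occurrences replaced
print(replace_first_anchored(WORD, 'A', 'E'))  # only the first occurrence replaced
print(snobol_style_loop(WORD, 'A', 'E'))       # loop until no match remains
print(WORD)                                    # the original string is unchanged
```

Python strings are immutable, so all three functions behave like XPath in leaving WORD untouched; the SNOBOL-style loop only rebinds a local name. Python also offers a third design option, re.sub(pattern, replacement, word, count=1), which limits the number of replacements directly. One caveat that applies to the loop form in any language: if the replacement text itself matches the pattern, the loop never terminates.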