Subject: [xsl] Imagine that the semantics of concatenating two regex patterns was this
From: "Roger L Costello costello@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Sun, 9 Mar 2025 10:21:17 -0000

Hi Folks,
Here is an XPath expression that determines whether the value of variable TEXT
matches the pattern 'A' (that is, does 'A' occur anywhere within the value of
TEXT):

    matches($TEXT, 'A')

The result of evaluating the expression is true or false.
SNOBOL (StriNg Oriented and symBOlic Language) is best known for its
pattern-matching facilities, which are very elaborate and powerful. In fact,
most of the SNOBOL language is composed of pattern-matching operations. In
SNOBOL, here is how to examine the value of TEXT to see if it contains the
letter A:

    TEXT 'A'

If the letter A occurs anywhere in the value of TEXT, the pattern match
succeeds. Otherwise it fails.
Here is an XPath pattern for vowels: 'A|E|I|O|U'
Suppose a pattern is to be used in a number of different places in a program;
we would like to define the pattern once. In XSLT we create a variable to hold
the pattern and then use the variable in matches():

    <xsl:variable name="VOWELS" select="'A|E|I|O|U'"/>
    <xsl:value-of select="matches($TEXT,$VOWELS)"/>

In SNOBOL you assign a name to a pattern:

    VOWELS = 'A' | 'E' | 'I' | 'O' | 'U'

Subsequently this pattern may be referred to by the name VOWELS, as in this
statement:

    TEXT VOWELS

In SNOBOL, patterns may be concatenated in the same way that strings are
concatenated (by juxtaposition). For example, the statement

    TEXT VOWELS 'T'

succeeds if a vowel is immediately followed by a T in TEXT, i.e., if TEXT
contains one of AT, ET, IT, OT, or UT.
SNOBOL's semantics of pattern concatenation is fascinating. Clearly, more than
verbatim concatenation is happening, because a verbatim concatenation of the
patterns would yield:

    'A' | 'E' | 'I' | 'O' | 'U' 'T'

which is not the intended pattern (the 'T' would attach only to the 'U'). [I
am guessing that a] SNOBOL compiler/interpreter implicitly places parentheses
around the first pattern:

    ('A' | 'E' | 'I' | 'O' | 'U') 'T'

The semantics of concatenating patterns in XPath is verbatim string
concatenation:

    'A|E|I|O|U' || 'T'

which yields the incorrect pattern 'A|E|I|O|UT' (it matches A, E, I, O, or
UT). If we want XPath to have the SNOBOL semantics, then we must explicitly
place parentheses around the first pattern:

    '(' || 'A|E|I|O|U' || ')' || 'T'

which yields the correct pattern '(A|E|I|O|U)T'.
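
Here is a minimal XSLT 3.0 sketch of the difference between the two behaviours. The function name f:concat-patterns, its namespace URI, and the sample value 'CAB' for TEXT are assumptions made up for this illustration; nothing like this function is built into XPath. Note that the sketch wraps both operands in parentheses, not just the first, so that an alternation in the second pattern is protected too.

```xslt
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:pattern-functions"
    exclude-result-prefixes="xs f">

  <xsl:output method="text"/>

  <!-- Hypothetical helper (not part of XPath): concatenate two regex
       patterns, wrapping each operand in parentheses so that an
       alternation inside either one stays grouped. -->
  <xsl:function name="f:concat-patterns" as="xs:string">
    <xsl:param name="first" as="xs:string"/>
    <xsl:param name="second" as="xs:string"/>
    <xsl:sequence select="'(' || $first || ')(' || $second || ')'"/>
  </xsl:function>

  <xsl:variable name="VOWELS" select="'A|E|I|O|U'"/>

  <xsl:template name="xsl:initial-template">
    <!-- Sample input (an assumption): contains a vowel, but no vowel
         immediately followed by T. -->
    <xsl:variable name="TEXT" select="'CAB'"/>

    <!-- Verbatim concatenation gives 'A|E|I|O|UT'; the lone 'A'
         alternative matches, so this prints true. -->
    <xsl:value-of select="matches($TEXT, $VOWELS || 'T')"/>
    <xsl:text>&#10;</xsl:text>

    <!-- Grouped concatenation gives '(A|E|I|O|U)(T)'; there is no
         vowel followed by T in 'CAB', so this prints false. -->
    <xsl:value-of select="matches($TEXT, f:concat-patterns($VOWELS, 'T'))"/>
  </xsl:template>

</xsl:stylesheet>
```

One caveat with this approach: plain parentheses are capturing groups, so a pattern built this way shifts the group numbers seen by replace(). If that matters, the non-capturing form (?:...) that XPath 3.0 regular expressions accept could be used instead.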
Lessons Learned: how a programming language defines the semantics of pattern
concatenation can have a profound influence on programming style. Thus, if you
are creating a new programming language, be aware that there is more than one
way to define the semantics of pattern concatenation. You might decide to
define the semantics as XSLT/XPath does (verbatim concatenation), or you
might decide to define the semantics as SNOBOL does (parentheses implicitly
wrap the first pattern).
/Roger
P.S. I got the information about SNOBOL from the wonderful book, "A SNOBOL4
Primer" by Ralph E. Griswold and Madge T. Griswold, pages 20-21. Some of the
above sentences are excerpts from the book.