Subject: [xsl] Imagine that the semantics of concatenating two regex patterns was this
From: "Roger L Costello costello@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Sun, 9 Mar 2025 10:21:17 -0000

Hi Folks,

Here is XPath that determines if the value of variable TEXT matches the pattern 'A' (that is, does 'A' occur anywhere within the value of TEXT):

    matches($TEXT, 'A')

The result of evaluating the expression is true or false.

SNOBOL (StriNg Oriented and symBOlic Language) is best known for its pattern-matching facilities, which are very elaborate and powerful. In fact, most of the SNOBOL language is composed of pattern-matching operations.

With SNOBOL, here's how to examine the value of TEXT to see if it contains the letter A:

    TEXT 'A'

If the letter A occurs anywhere in the value of TEXT, the pattern match succeeds. Otherwise it fails.

Here is an XPath pattern for vowels:

    'A|E|I|O|U'

Suppose a pattern is to be used in a number of different places in a program; we would like to define the pattern once. In XSLT we create a variable to hold the pattern and then use the variable in matches():

    <xsl:variable name="VOWELS" select="'A|E|I|O|U'"/>
    <xsl:value-of select="matches($TEXT,$VOWELS)"/>

In SNOBOL you assign a name to a pattern:

    VOWELS = 'A' | 'E' | 'I' | 'O' | 'U'

Subsequently this pattern may be referred to by the name VOWELS, as in this statement:

    TEXT VOWELS

In SNOBOL, patterns may be concatenated in the same way that strings are concatenated (juxtaposition). For example, the statement

    TEXT VOWELS 'T'

succeeds if a vowel is immediately followed by a T in TEXT, i.e., if TEXT contains one of AT, ET, IT, OT, or UT.

SNOBOL's semantics of pattern concatenation is fascinating. Clearly, more than verbatim concatenation is happening, because a verbatim concatenation of the patterns would yield:

    'A' | 'E' | 'I' | 'O' | 'U' 'T'

which is not correct: concatenation binds more tightly than alternation, so this would match A, E, I, O, or UT.
I am guessing that a SNOBOL compiler/interpreter implicitly places parentheses around the first pattern:

    ('A' | 'E' | 'I' | 'O' | 'U') 'T'

In XPath a pattern is just a string, so concatenating patterns is verbatim string concatenation:

    'A|E|I|O|U' || 'T'

which yields the incorrect pattern

    'A|E|I|O|UT'

If we want XPath to have the SNOBOL semantics, then we must explicitly place parentheses around the first pattern:

    '(' || 'A|E|I|O|U' || ')' || 'T'

which yields the correct pattern

    '(A|E|I|O|U)T'

Lessons Learned: How a programming language defines the semantics of pattern concatenation can have a profound influence on programming style. Thus, if you are creating a new programming language, be aware that there is more than one way to define those semantics. You might define them as XSLT/XPath does (verbatim concatenation), or you might define them as SNOBOL does (parentheses implicitly wrap the first pattern).

/Roger

P.S. I got the information about SNOBOL from the wonderful book, "A SNOBOL4 Primer" by Ralph E. Griswold and Madge T. Griswold, pages 20-21. Some of the above sentences are excerpts from the book.
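
The explicit-parentheses workaround described above could be packaged once and reused. What follows is only a minimal XSLT 3.0 sketch under stated assumptions: the function name pat:concat and the namespace urn:example:pattern are hypothetical, invented here for illustration, and are not part of XSLT, XPath, or any library. The function wraps every pattern in parentheses before joining, so the alternation inside one pattern cannot leak into its neighbor.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:pat="urn:example:pattern"
    exclude-result-prefixes="xs pat">

  <xsl:output method="text"/>

  <!-- Hypothetical helper: wrap each pattern in parentheses, then join.
       ('A|E|I|O|U', 'T') becomes '(A|E|I|O|U)(T)'. -->
  <xsl:function name="pat:concat" as="xs:string">
    <xsl:param name="patterns" as="xs:string*"/>
    <xsl:sequence select="string-join($patterns ! ('(' || . || ')'), '')"/>
  </xsl:function>

  <xsl:variable name="VOWELS" select="'A|E|I|O|U'"/>

  <xsl:template name="xsl:initial-template">
    <!-- Verbatim concatenation gives 'A|E|I|O|UT', which finds the
         lone A in CAB: true -->
    <xsl:value-of select="matches('CAB', $VOWELS || 'T')"/>
    <xsl:text>&#10;</xsl:text>
    <!-- Grouped concatenation requires a vowel followed by T: false -->
    <xsl:value-of select="matches('CAB', pat:concat(($VOWELS, 'T')))"/>
    <xsl:text>&#10;</xsl:text>
    <!-- CAT contains AT: true -->
    <xsl:value-of select="matches('CAT', pat:concat(($VOWELS, 'T')))"/>
  </xsl:template>

</xsl:stylesheet>
```

One consequence of this design choice: the added parentheses are capturing groups, so group numbers shift if the combined pattern is later passed to replace() or xsl:analyze-string.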