Re: [xsl] Parsing XPath in XSLT?

Subject: Re: [xsl] Parsing XPath in XSLT?
From: "John Lumley john@xxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Thu, 26 Mar 2020 11:34:10 -0000
On 25/03/2020 17:31, Wendell Piez wapiez@xxxxxxxxxxxxxxx wrote:
> I am currently having to interpret XPath or (more likely) an XPath
> subset into an abstract representation that can be rewritten into
> various forms. Naturally I would like to do this out of a parse tree
> or the functional equivalent, represented in some sort of XML, since
> serializing that back out is easy enough. It is producing that tree
> that is a problem. I need a parser for XPath or if not for all of
> XPath, then at least for my subset -- which includes namespaces. So
> even if partial the model must expose names and namespaces to the
> extent that a path rewriter can (for example) map into a new set of
> namespace prefixes --
>
> Any thoughts? Open source projects I should take a look at? Have the
> community-standards initiatives captured any good work in this area?

The simplest approach is to use Gunther Rademacher's REx parser generator
(https://www.bottlecaps.de/rex/) to generate an XSLT parser for
XPath 3.1. To do this, download the sample grammar for XPath31 (left-hand
column on the page), then generate the parser with XSLT as the target,
backtracking on, and the parse tree option checked under Generate Code.
Hitting 'Generate' should produce a download of a file xpath-31.xslt, which
contains a function p:parse-XPath($expression as xs:string)
as element() [xmlns:p="xpath-31"].
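
As a minimal sketch of how the generated function might be called: the stylesheet below assumes the download has been saved as xpath-31.xslt in the same directory and that an XSLT 3.0 processor is used. The template name "demo" is only an illustrative choice, not something REx produces.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:p="xpath-31"
    exclude-result-prefixes="p">

  <!-- The REx-generated parser; file name and location are assumptions -->
  <xsl:import href="xpath-31.xslt"/>

  <xsl:output indent="yes"/>

  <!-- Hypothetical entry point: emits the parse tree of a fixed expression -->
  <xsl:template name="demo">
    <xsl:sequence select="p:parse-XPath('1 to 5')"/>
  </xsl:template>

</xsl:stylesheet>
```

With Saxon this could be run using the -it:demo option, so that no source document is needed.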

Evaluating this function produces the parse tree (assuming of course the 
syntax of the expression is correct), as an XML tree where the element 
names correspond to the recursive Grammar productions, with leaves of 
literals, names etc. and tokens. So for example, 1 to 5 parses as the 
deep tree:

    <XPath>
       <Expr>
          <ExprSingle>..........
             <RangeExpr> .......
                <IntegerLiteral>1</IntegerLiteral> .........
                <TOKEN>to</TOKEN>.........
                <IntegerLiteral>5</IntegerLiteral>
    </>

Namespace-bearing terms such as charlie, bar:fred generate
<QName>charlie</> <QName>bar:fred</> leaves, so all the information is
still preserved.
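
Because the prefixes survive in the QName leaves, they can be pulled out of the tree directly. The following is a sketch only: the function name f:prefixes and its namespace urn:example:functions are hypothetical, and xpath-31.xslt is assumed to sit next to the stylesheet. It looks only at QName leaves, so prefixes that the grammar reports under some other leaf (for example in a wildcard name test such as bar:*) are not covered.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:p="xpath-31"
    xmlns:f="urn:example:functions"
    exclude-result-prefixes="xs p f">

  <!-- The REx-generated parser; file name and location are assumptions -->
  <xsl:import href="xpath-31.xslt"/>

  <!-- Hypothetical helper: distinct prefixes used on QName leaves -->
  <xsl:function name="f:prefixes" as="xs:string*">
    <xsl:param name="expression" as="xs:string"/>
    <xsl:variable name="tree" select="p:parse-XPath($expression)"/>
    <!-- Keep only QNames that contain a colon, then take the part before it -->
    <xsl:sequence select="distinct-values(
        $tree//QName[contains(., ':')] ! substring-before(., ':'))"/>
  </xsl:function>

</xsl:stylesheet>
```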

The tree is easily manipulated with XSLT, and converting it back into a
valid XPath expression string is pretty simple, with something
along the lines of:

    <xsl:mode name="parse2text" on-no-match="shallow-skip"/>
    <xsl:template match="TOKEN[. = ('to')]" mode="parse2text"> {.} </xsl:template>
    <xsl:template match="TOKEN[. = (',')]" mode="parse2text">{.} </xsl:template>
    <xsl:template match="TOKEN|Literal|QName" mode="parse2text">{.}</xsl:template>

Using this it is pretty simple to write a small stylesheet that processes:

    <samples xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
             xmlns:bar="BARBER" xmlns:charlie="CHARLIE" xmlns:delta="DELTA">
        <remap from="xsl" to="charlie"/>
        <remap from="bar" to="delta"/>
        <remap from="delta" to="bar"/>
        <xpath>charlie, xsl:foo, $bar, xsl, $bar:fred</xpath>
        <xpath>map{'a': 1 to 5, $b : delta:X}</xpath>
    </samples>

and produces a result:

    <samples xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
             xmlns:bar="BARBER"
             xmlns:charlie="CHARLIE"
             xmlns:delta="DELTA">
       <remap from="xsl" to="charlie"/>
       <remap from="bar" to="delta"/>
       <remap from="delta" to="bar"/>
       <xpath>
          <source>charlie, xsl:foo, $bar, xsl, $bar:fred</source>
          <textFromParse>charlie, xsl:foo, $bar, xsl, $bar:fred</textFromParse>
          <modified>charlie, charlie:foo, $bar, xsl, $delta:fred</modified>
       </xpath>
       <xpath>
          <source>map{'a': 1 to 5, $b : delta:X}</source>
          <textFromParse>map{'a': 1 to 5, $b: delta:X}</textFromParse>
          <modified>map{'a': 1 to 5, $b: bar:X}</modified>
       </xpath>
    </samples>
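
For readers without the private files, a stylesheet of this kind might look roughly like the sketch below. It is not the stylesheet referred to in this message, only a reconstruction under stated assumptions: xpath-31.xslt sits next to it, the mode name "remap" and the tunnel parameter "remaps" are made up for illustration, and expand-text="yes" is set so that the {.} text value templates are evaluated. Only a handful of tokens are given spacing, so expressions using other keyword operators would need more entries in the first parse2text template, and parse errors are not handled.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:p="xpath-31"
    exclude-result-prefixes="p"
    expand-text="yes">

  <!-- The REx-generated parser; file name and location are assumptions -->
  <xsl:import href="xpath-31.xslt"/>

  <xsl:output indent="yes"/>

  <!-- Unnamed mode: copy samples and remap elements through unchanged -->
  <xsl:mode on-no-match="shallow-copy"/>
  <!-- remap (hypothetical name): copy a parse tree, rewriting QName prefixes -->
  <xsl:mode name="remap" on-no-match="shallow-copy"/>
  <!-- parse2text: flatten a parse tree back into an expression string -->
  <xsl:mode name="parse2text" on-no-match="shallow-skip"/>

  <!-- Replace each xpath element by its source, round-tripped and remapped forms -->
  <xsl:template match="xpath">
    <xsl:variable name="tree" select="p:parse-XPath(string(.))"/>
    <xsl:variable name="remapped" as="element()">
      <xsl:apply-templates select="$tree" mode="remap">
        <xsl:with-param name="remaps" select="../remap" tunnel="yes"/>
      </xsl:apply-templates>
    </xsl:variable>
    <xpath>
      <source>{.}</source>
      <textFromParse>
        <xsl:apply-templates select="$tree" mode="parse2text"/>
      </textFromParse>
      <modified>
        <xsl:apply-templates select="$remapped" mode="parse2text"/>
      </modified>
    </xpath>
  </xsl:template>

  <!-- Prefixed QName leaf: swap the prefix if a remap entry names it -->
  <xsl:template match="QName[contains(., ':')]" mode="remap">
    <xsl:param name="remaps" as="element(remap)*" tunnel="yes"/>
    <xsl:variable name="prefix" select="substring-before(., ':')"/>
    <xsl:variable name="to" select="$remaps[@from = $prefix][1]/@to"/>
    <!-- Fall back to the original prefix when no remap applies -->
    <xsl:copy>{($to, $prefix)[1]}:{substring-after(., ':')}</xsl:copy>
  </xsl:template>

  <!-- Keyword operators need a space on both sides (list is incomplete) -->
  <xsl:template match="TOKEN[. = ('to', 'and', 'or', 'div', 'mod')]"
                mode="parse2text"> {.} </xsl:template>
  <!-- Separators get a trailing space -->
  <xsl:template match="TOKEN[. = (',', ':')]" mode="parse2text">{.} </xsl:template>
  <!-- Everything else is emitted as-is -->
  <xsl:template match="TOKEN|Literal|QName" mode="parse2text">{.}</xsl:template>

</xsl:stylesheet>
```

Each QName is rewritten once against the original remap table, which is why swapping bar and delta in the same pass does not cancel itself out.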

(I've sent you the appropriate files privately, so you can run them 
yourself and look at the parse trees - works fine in Oxygen)

-- 
*John Lumley* MA PhD CEng FIEE
john@xxxxxxxxxxxx <mailto:john@xxxxxxxxxxxx>
on behalf of Saxonica Ltd
