Re: [xsl] Split into numbered files: without side-effect? (XSLT 2)

Subject: Re: [xsl] Split into numbered files: without side-effect? (XSLT 2)
From: Yves Forkl <Y.Forkl@xxxxxx>
Date: Fri, 28 Sep 2007 15:32:13 +0200
David Carlisle wrote:

XSLT has no way of evaluating a string as an XPath (in principle an XSLT compiler may have parsed everything into internal form, so the runtime system need not have any XPath parser to hand at all). People
coming from interpreted languages (or even compiled ones) find
this strange, but it's no different from, say, C, where you can't just
pass in a string of C syntax at run time and expect a compiled C
program to know what to do with it unless it is linked to a C
compiler.

That makes perfect sense.



Very often though (e.g. sorting tables) you just need to specify an element name, not a full path, and that can be done with

select="*[name()=$name]"

[etc.]
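
To illustrate the element-name case, here is a minimal sketch of a table sort driven by a stylesheet parameter. The table/row structure and the default value 'price' are assumptions made up for this example; only the select expression itself comes from David's message.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <!-- Name of the child element to sort by; 'price' is a made-up default -->
  <xsl:param name="name" as="xs:string" select="'price'"/>

  <!-- Assumed input shape: <table><row><price>..</price>..</row>..</table> -->
  <xsl:template match="table">
    <xsl:copy>
      <xsl:for-each select="row">
        <!-- Each row is assumed to have at most one child with that name;
             a sort key of more than one item is a type error in XSLT 2.0 -->
        <xsl:sort select="*[name()=$name]"/>
        <xsl:copy-of select="."/>
      </xsl:for-each>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

The sort key is compared as a string here; a numeric column would additionally need data-type="number" on xsl:sort.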

So if you want to specify arbitrary paths [...] the way of doing it without an extension is to present the XPaths as part of a stylesheet
that you import, so instead of a configuration file [...] You use

> <xsl:function name="my:path">
> <xsl:param name="node" as="node()"/>
> <xsl:param name="name" as="xs:string"/>
> <xsl:choose>
> <xsl:when test="$name='a'"><xsl:sequence select="$node/a/b/c"/></xsl:when>
> <xsl:when test="$name='b'"><xsl:sequence select="$node/a/b/d"/></xsl:when>
> [...]


From looking at the code that I actually use in my current project (as opposed to the simplified version I showed in my original posting), I must conclude that I seem to have developed something quite similar. :-) See below.


Michael Kay wrote:


> If you've got XPath expressions stored in an XML document, then the
> only way to evaluate them with standard XSLT is to generate a
> stylesheet and then execute it.

That's exactly what I'm doing in some other project. Meta-stylesheets
are a wonderful thing!
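
For readers who have not met the technique, a meta-stylesheet could look roughly like the sketch below. It assumes a config document of the form chunks/element, where each element holds a slash-separated path that is also usable as a match pattern; the "out" alias namespace URI and the body of the generated templates are invented for illustration.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:out="http://example.org/xsl-alias">

  <!-- Elements written with the "out" prefix end up in the XSLT
       namespace of the generated stylesheet -->
  <xsl:namespace-alias stylesheet-prefix="out" result-prefix="xsl"/>
  <xsl:output indent="yes"/>

  <!-- Run this against the config document, then run the result
       against the real input -->
  <xsl:template match="/chunks">
    <out:stylesheet version="2.0">
      <xsl:for-each select="element">
        <!-- The stored path becomes the match pattern of a template -->
        <out:template match="{.}">
          <out:message>chunk starter: <out:value-of select="name()"/></out:message>
          <out:apply-templates/>
        </out:template>
      </xsl:for-each>
    </out:stylesheet>
  </xsl:template>

</xsl:stylesheet>
```

Note that two generated patterns such as a/b and x/a/b get the same default priority, so choosing "the most specific path" would need explicit priority attributes in the generated templates.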

Dimitre Novatchev replied to Michael Kay:

> It is not the only way.
>
> Someone (like me) may be able to parse  an XPath expression using pure
> XSLT (say for example with the FXSL LR Parsing Framework). Of course,
> parsing is only half the way. The parsed expression need be
> interpreted and this needs the context to be specified as well.

Lacking even an infinitesimal fraction of Dimitre's mathematical understanding, I am sure that I will never be able to achieve the same thing myself. This makes me wonder all the more which aspects of such a general device I might have covered with the following, which is a solution to my original problem of assigning nodes to chunks.

Like David, I am using a stylesheet function to match the paths of my nodes against the paths of the chunk-starting elements. However, these paths are not hard-wired but read from the config file, and I compare them using regexes. So I have essentially these fragments:


  <xsl:param name="sorted-list-of-chunk-element-paths" as="node()*">
    <xsl:for-each
      select="document('chunk_starting_elements.xml')/chunks/element">
      <!-- paths with the most steps (the most specific ones) come first -->
      <xsl:sort select="count(tokenize(., '/'))" order="descending"/>
      <xsl:sequence select="."/>
    </xsl:for-each>
  </xsl:param>

  <xsl:function name="my:quote-for-regex">
    <xsl:param name="input" as="xs:string"/>
    <xsl:sequence
      select="replace($input,'[\\\|\.\-\^\?\*\+\(\)\{\}\[\]\$]','\\$0')"/>
  </xsl:function>
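
As a quick check of what the quoting function does, the following self-contained sketch applies it to an element name containing "." and "-", both of which are legal in QNames and special in regexes. The name 'my.elem-name' and the "my" namespace URI are made up for the example.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="http://example.org/my"
    exclude-result-prefixes="xs my">

  <xsl:output method="text"/>

  <!-- Same function as above: backslash-escape every regex metacharacter -->
  <xsl:function name="my:quote-for-regex" as="xs:string">
    <xsl:param name="input" as="xs:string"/>
    <xsl:sequence
      select="replace($input,'[\\\|\.\-\^\?\*\+\(\)\{\}\[\]\$]','\\$0')"/>
  </xsl:function>

  <xsl:template match="/" name="main">
    <xsl:variable name="quoted" select="my:quote-for-regex('my.elem-name')"/>
    <!-- writes: my\.elem\-name -->
    <xsl:value-of select="$quoted"/>
    <xsl:text>&#10;</xsl:text>
    <!-- true: the quoted pattern matches the literal name -->
    <xsl:value-of select="matches('my.elem-name', concat('^', $quoted, '$'))"/>
    <xsl:text>&#10;</xsl:text>
    <!-- false: unquoted, the "." would also have matched the "X" -->
    <xsl:value-of select="matches('myXelem-name', concat('^', $quoted, '$'))"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```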

  <xsl:function name="my:xpath-of-element-node" as="xs:string">
    <xsl:param name="element" as="node()"/>
    <xsl:value-of select="$element/ancestor-or-self::*/name()"
      separator="/"/>
  </xsl:function>

  <xsl:function name="my:chunk-starter-path" as="node()*">
    <xsl:param name="context-element" as="node()"/>
    <xsl:variable name="matching-path-nodes" as="node()*">
      <xsl:for-each select="$sorted-list-of-chunk-element-paths">
        <!-- anchor match at the beginning to avoid matching
             substrings of QNames -->
        <xsl:if
          test="matches(concat('#root/',
                            my:xpath-of-element-node($context-element)),
                        concat('^#root/([^/]+/)*',
                               my:quote-for-regex(.), '$'))">
          <xsl:sequence select="."/>
        </xsl:if>
      </xsl:for-each>
    </xsl:variable>
    <!-- select only the most specific path if several are found -->
    <xsl:sequence select="if (empty($matching-path-nodes))
                          then ()
                          else $matching-path-nodes[1]"/>
  </xsl:function>
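
To make the anchoring concrete, this small sketch runs the same regex construction against one element path and a few config entries. The path book/part/chapter and the entries are invented; since none of them contains a regex metacharacter, the quoting step is left out here.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="/" name="main">
    <!-- Hypothetical element path, prefixed with '#root/' as in the
         function above -->
    <xsl:variable name="path" select="'#root/book/part/chapter'"/>
    <!-- Hypothetical config entries -->
    <xsl:for-each
      select="('part/chapter', 'art/chapter', 'book/part/chapter', 'part')">
      <!-- ([^/]+/)* only ever consumes whole steps, so an entry must
           line up with step boundaries and reach the end of the path -->
      <xsl:value-of separator=""
        select="., ': ',
                matches($path, concat('^#root/([^/]+/)*', ., '$')),
                '&#10;'"/>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

This reports true for part/chapter and book/part/chapter, and false for art/chapter (it starts in the middle of a name) and for part (it does not reach the end of the path).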

<xsl:variable name="chunk-number" saxon:assignable="yes" select="0"/>

  <xsl:template match="*">
    <xsl:if test="my:chunk-starter-path(.)">
      <saxon:assign name="chunk-number" select="$chunk-number + 1"/>
      <xsl:result-document
          href="{concat('chunk_', format-number($chunk-number, '00'),
          '.txt')}">
        <xsl:apply-templates/>
      </xsl:result-document>
    </xsl:if>
    <xsl:apply-templates/>
  </xsl:template>
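
Regarding the question in the subject line, one way to number the chunks without saxon:assign is to select all chunk starters in a single expression and let position() supply the number. The sketch below is only an illustration under simplifying assumptions: my:is-chunk-starter and the parameter chunk-starter-names are hypothetical stand-ins for the path-based my:chunk-starter-path, so that the example is self-contained.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="http://example.org/my"
    exclude-result-prefixes="xs my">

  <!-- Hypothetical: element names that start a chunk -->
  <xsl:param name="chunk-starter-names" as="xs:string*"
    select="('chapter', 'appendix')"/>

  <!-- Hypothetical stand-in for my:chunk-starter-path -->
  <xsl:function name="my:is-chunk-starter" as="xs:boolean">
    <xsl:param name="element" as="element()"/>
    <xsl:sequence select="name($element) = $chunk-starter-names"/>
  </xsl:function>

  <xsl:template match="/">
    <!-- The path expression returns the starters in document order,
         so position() is the chunk number; no assignable variable -->
    <xsl:for-each select="//*[my:is-chunk-starter(.)]">
      <xsl:result-document method="text"
          href="chunk_{format-number(position(), '00')}.txt">
        <xsl:apply-templates/>
      </xsl:result-document>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

With nested chunk starters, the content of an inner chunk also appears inside the outer chunk's file; whether that is wanted depends on the chunking rules.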


Is this comparable to what David proposed? Could it be a step in the direction of Dimitre's general XPath parser and comparing machine? (The "parsing" of the XPath expressions above is very rudimentary, of course.)


And, any ideas how I could further improve the code above?

Yves
