Re: [xsl] having a template remember not to call itself again

Subject: Re: [xsl] having a template remember not to call itself again
From: "Chris Papademetrious christopher.papademetrious@xxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Sat, 11 Mar 2023 16:08:39 -0000

Hi Gerrit,

Thanks for the link to Dr. Kay's proposal for checking parameters in template
matches! I replied to the comments there:

https://github.com/qt4cg/qtspecs/issues/108


Hi everyone,

I appreciate the numerous replies! As always, this list is quick to share its
grizzled and hard-earned wisdom. And I can completely relate to sometimes
needing to step back and rethink the approach, and for unexpected interactions
to be the warning flags that prompt the rethink.

This stylesheet is a general-purpose "DITA optimizer" that performs a variety
of things, such as

* Trimming whitespace in various nontrivial ways (recursively into
leading/trailing nested elements, skipping over invisible elements like
<indexterm>)
* Implementing best-practice structures
* Ungrouping unnecessary structures
* Moving <indexterm> elements to best-practice locations
* Inferring attributes (like @placement="break" on block-context images)
* Wrapping inline element runs when mixed within block elements
* Fixing character-encoding issues (cleaning up smart quotes in <pre>,
nonbreaking spaces, and non-hyphen characters)
* Fixing common capitalization issues in titles
* Converting <uicontrol>A > B > C</uicontrol> to <menucascade> structures
* ...and more...

There are currently about 40 different optimizations. I don't want to linearly
apply the optimizations in sequence because (1) there are many of them, and
(2) forcing an order might leave some optimizations unperformed (because
sometimes the application of one optimization makes another possible).

Some of the optimizations are more complex. Some internally use sub-modes
and/or <xsl:iterate> to perform their task atomically on subtree scopes
without interaction with other optimizations. (Joel - isn't it super
satisfying to push a blob of complex functionality into a single
xsl:iterate??) But from the stylesheet perspective, there are simply a bunch
of peer optimizations that should be applied wherever the opportunity exists
(i.e. their match expressions match).

And to me, this is one of the beautiful things about XSLT. I simply declare
all the optimizations that should be performed, and they are performed where
needed. My goal is to stay as close to this beauty as possible, while figuring
out the coding style needed to robustly deliver the functionality.

While converting optimizations from "head-call" chaining to "tail-call"
chaining, I ran into a significant limitation. Consider the following trimming
templates:

  <!-- remove leading whitespace, "tail-call" chaining -->
  <xsl:template match="p/text()[not(preceding-sibling::node())][matches(.,
'^\s+')]">
    <xsl:variable name="result" as="text()">
      <xsl:value-of select="replace(., '^\s+', '')"/>
    </xsl:variable>
    <xsl:apply-templates select="$result"/>  <!-- apply other templates, if
needed -->
  </xsl:template>

  <!-- remove trailing whitespace, "tail-call" chaining -->
  <xsl:template match="p/text()[not(following-sibling::node())][matches(.,
'\s+$')]">
    <xsl:variable name="result" as="text()">
      <xsl:value-of select="replace(., '\s+$', '')"/>
    </xsl:variable>
    <xsl:apply-templates select="$result"/>  <!-- apply other templates, if
needed -->
  </xsl:template>

These no longer chain because the node passed to the tail-call
<xsl:apply-templates> is a newly constructed text node with no parent, so it
carries none of the ancestor/sibling context that the match patterns test.
So, I restored my optimizations back to their original "head-call" chaining:

  <!-- remove leading whitespace, "head-call" chaining -->
  <xsl:template match="p/text()[not(preceding-sibling::node())][matches(.,
'^\s+')]">
    <xsl:variable name="result" as="text()">
      <xsl:next-match/>  <!-- apply other templates, if needed -->
    </xsl:variable>
    <xsl:value-of select="replace($result, '^\s+', '')"/>
  </xsl:template>

  <!-- remove trailing whitespace, "head-call" chaining -->
  <xsl:template match="p/text()[not(following-sibling::node())][matches(.,
'\s+$')]">
    <xsl:variable name="result" as="text()">
      <xsl:next-match/>  <!-- apply other templates, if needed -->
    </xsl:variable>
    <xsl:value-of select="replace($result, '\s+$', '')"/>
  </xsl:template>
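
For anyone who wants to see why the tail-call versions stop chaining, here is
a minimal sketch. It is not from my stylesheet; the <check>, <orig>, and
<copy> elements and the variable names are made up for illustration, and it
assumes an input document like <p>  hello  </p>:

  <xsl:stylesheet version="3.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:template match="p">
      <!-- the original text node, still attached to its <p> parent -->
      <xsl:variable name="orig" select="text()[1]"/>
      <!-- a new text node built in a typed variable; it has no parent -->
      <xsl:variable name="copy" as="text()">
        <xsl:value-of select="$orig"/>
      </xsl:variable>
      <check>
        <orig has-parent="{exists($orig/parent::p)}"/>
        <copy has-parent="{exists($copy/parent::p)}"/>
      </check>
    </xsl:template>

  </xsl:stylesheet>

I would expect has-parent="true" for <orig> and has-parent="false" for <copy>,
which is why a pattern beginning with p/text() cannot match the node handed to
<xsl:apply-templates select="$result"/>.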

To help catch issues with unexpected results, I added an as="..." qualifier to
every variable in the stylesheet. And indeed, this turned up a few issues that
I was able to resolve with a predicate here and a conditional there. I'm
pretty happy with the outcome, and I'm sold on the value of using as="..."
everywhere.

This does lead me to a question. When as="..." is absent, the contents are
placed into an anonymous document node. Is there an as="..." expression that
explicitly requests an anonymous document node?
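
To make the default behavior concrete, here is a minimal sketch of the
difference I mean. The variable names and the <check> element are hypothetical,
and it should run against any input document:

  <xsl:stylesheet version="3.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:template match="/">
      <!-- no as="...": the content is wrapped in a document node -->
      <xsl:variable name="implicit">
        <xsl:text>abc</xsl:text>
      </xsl:variable>
      <!-- as="text()": the variable holds the bare text node -->
      <xsl:variable name="typed" as="text()">
        <xsl:text>abc</xsl:text>
      </xsl:variable>
      <check
        implicit-is-doc="{$implicit instance of document-node()}"
        typed-is-doc="{$typed instance of document-node()}"/>
    </xsl:template>

  </xsl:stylesheet>

As I understand it, this reports implicit-is-doc="true" and
typed-is-doc="false".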

As a rainy-day exercise, I might reverse the order of all templates in the
stylesheet to swap last-match-wins outcomes, run it on our doc set (about 40k
files), then diff the results to see if anything changed. But the added
as="..." checks have put me in a comfortable place.

Thanks as always,

 - Chris
