Re: [xsl] Using node-set variables in predicates (another node comparison question)

From: "Chris Papademetrious christopher.papademetrious@xxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Mon, 3 Jan 2022 01:58:13 -0000

Hi Dimitre,

Just some feedback from a novice... For me, this would be a difficult way to
remember how to determine whether a node is in a sequence:

	exists(index-of($seq, $n, id-equal#2))

A one-word operator for this would be easier for me to remember:

	$n in $seq
	$n is $seq


Hi everyone (again),

I was able to use the [$n intersect $seq] trick again today! And I'm proud of
how it turned out, so I wanted to share it with you.
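
For anyone who missed the earlier discussion, here is the trick in its smallest
form. This is only a sketch: the variable name $highlighted, its select
expression, and the "has-sublist" value are made up for illustration. It
assumes an XSLT 2.0 (or later) processor, since XSLT 1.0 has no intersect
operator and does not allow variable references in match patterns.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- hypothetical node-set: every <li> that contains a nested list -->
  <xsl:variable name="highlighted" select="//li[ul or ol]"/>

  <!-- identity template: copy everything else unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- [. intersect $highlighted] is non-empty (and therefore true) only
       when the context node is itself one of the nodes in $highlighted -->
  <xsl:template match="*[. intersect $highlighted]">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <xsl:attribute name="outputclass">has-sublist</xsl:attribute>
      <xsl:apply-templates select="node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

The comparison is by node identity, not by value, so an element that merely
looks the same as one in $highlighted does not match.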

I want to remove leading/trailing whitespace from certain DITA block elements.
For example, I want to turn this:

	<p>   This is some text.</p>

into this:

	<p>This is some text.</p>

But there are two tricky aspects:

1. The leading/trailing whitespace could be buried in a lower-level inline
element:

	<p>   Here is some text.</p>
	<p><b>   Here</b> is some text.</p>
	<p><b><i>   Here</i></b> is some text.</p>

so I need to match the first effectively rendered descendant text() node of
these block elements.

2. Some DITA block elements allow other DITA block elements in them:

	<p>   This is a paragraph element.</p>
	<li>   This is a list element.</li>

	<li>
	  <p>   This is a paragraph element in a list element.</p>
	</li>

so I need the sibling-adjacency check to stop at the lowest-level enclosing
block element.

Here are the templates I came up with:


  <!-- look for leading/trailing text() nodes in these block elements -->
  <xsl:variable name="elements"
                select="//(desc|dt|entry|glossterm|li|p|pre|shortdesc|title)"/>

  <!-- remove leading whitespace from leading text() nodes in block elements -->
  <xsl:template match="text()
                       [matches(., '^\s+')]
                       [ancestor::*[. intersect $elements]
                                   [not(descendant::*[. intersect $elements])]]
                       [not(ancestor-or-self::node()
                         [ancestor::*[. intersect $elements]
                                     [not(descendant::*[. intersect $elements])]]
                         [preceding-sibling::node()]
                        )]">
    <xsl:variable name="results">
      <xsl:next-match/>  <!-- apply other templates, if needed -->
    </xsl:variable>
    <xsl:value-of select="replace($results, '^\s+', '')"/>
  </xsl:template>

  <!-- remove trailing whitespace from trailing text() nodes in block elements -->
  <xsl:template match="text()
                       [matches(., '\s+$')]
                       [ancestor::*[. intersect $elements]
                                   [not(descendant::*[. intersect $elements])]]
                       [not(ancestor-or-self::node()
                         [ancestor::*[. intersect $elements]
                                     [not(descendant::*[. intersect $elements])]]
                         [following-sibling::node()]
                        )]">
    <xsl:variable name="results">
      <xsl:next-match/>  <!-- apply other templates, if needed -->
    </xsl:variable>
    <xsl:value-of select="replace($results, '\s+$', '')"/>
  </xsl:template>


Basically, it goes something like:

Find the text() node:

* That has leading/trailing whitespace
* That is within a block element that does not itself contain another,
lower-level block element
* That has no preceding/following sibling, and none of whose ancestors up to
(but not including) that block element has one either

The hardest part was figuring out how to get all ancestors up to the first
block element, but not past that. The nesting of [descendant::*[...]] within
[ancestor::*] is probably not the most performant way to do this, but it gets
the job done.
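
One possible alternative, offered as a sketch rather than a measured
improvement, is to move the ancestor walk into a stylesheet function so the
enclosing block is looked up once per text() node. The function name
f:is-leading and the urn:example:local-functions namespace are hypothetical.
Note that the behavior is not identical: this version uses the nearest
enclosing block element, so it also trims text sitting directly in a block
that has another block nested beside it, which the templates above skip.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:local-functions"
    exclude-result-prefixes="xs f">

  <!-- same block-element list as in the templates above -->
  <xsl:variable name="elements"
                select="//(desc|dt|entry|glossterm|li|p|pre|shortdesc|title)"/>

  <!-- hypothetical helper: true if $t is the first node inside its
       nearest enclosing block element -->
  <xsl:function name="f:is-leading" as="xs:boolean">
    <xsl:param name="t" as="text()"/>
    <!-- nearest ancestor that is one of the block elements -->
    <xsl:variable name="block"
                  select="$t/ancestor::*[. intersect $elements][1]"/>
    <!-- $t and its ancestors, stopping just below $block -->
    <xsl:variable name="path"
                  select="$t/ancestor-or-self::node()
                          intersect $block/descendant::node()"/>
    <xsl:sequence select="exists($block)
                          and empty($path/preceding-sibling::node())"/>
  </xsl:function>

  <!-- identity template: copy everything else unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- remove leading whitespace from leading text() nodes -->
  <xsl:template match="text()[matches(., '^\s+')][f:is-leading(.)]">
    <xsl:variable name="results">
      <xsl:next-match/>  <!-- apply other templates, if needed -->
    </xsl:variable>
    <xsl:value-of select="replace($results, '^\s+', '')"/>
  </xsl:template>

</xsl:stylesheet>
```

A trailing-whitespace version would be the mirror image, testing
following-sibling::node() and replacing '\s+$' instead.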

And by using <xsl:next-match/>, the templates can work together to remove both
leading and trailing whitespace from the same text() node, if needed.

 - Chris
