Re: [xsl] For-each-group group-starting-with drops text between inline elements

From: "Martin Honnen martin.honnen@xxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Thu, 3 Sep 2020 15:27:38 -0000

On 03.09.2020 at 17:20, Terry Ofner tdofner@xxxxxxxxx wrote:
I have a document with the following structure:
<div>
<p class="passage">
<span class="itemNum">(1)</span>First <b>sentence</b> of the passage.
<span class="itemNum">(2)</span> Second sentence of the passage.
<span class="itemNum">(3)</span> Third sentence of the passage.
</p>
</div>

I need to chunk this into separate items:

<div class="passage_set">
<p class="item" itemNum="(1)"><b>(1)</b> First <b>sentence</b> of the passage.</p>
<p class="item" itemNum="(2)"><b>(2)</b> Second sentence of the passage.</p>
<p class="item" itemNum="(3)"><b>(3)</b> Third sentence of the passage.</p>
</div>

If there were no nodes in the text between spans, I could use tokenize,
which I do on such occasions.

With sets such as the one above, I have been trying to use
for-each-group. But I am unable to capture the text between the span
elements.

Here is the relevant section of my current stylesheet (XSLT 3.0,
Saxon-PE 9.8.0.12):

<xsl:variable name="passage_raw">
<div class="passage_set">

<xsl:for-each-group select="div/p[@class='passage']/*" group-starting-with="span">

To include text nodes (or any other kind of node) in the grouping
population, you need to use /node() instead of /* in the path for the
select; /* selects only element children.
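
As a minimal sketch of that change, assuming the input document shown above with div as the document element, the following stylesheet selects node() and wraps every group in a <group> element. The <groups> and <group> names are hypothetical and serve only to make the group boundaries visible:

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <groups>
      <!-- node() selects text nodes as well as elements, so the text
           that follows each span joins that span's group.
           Any text before the first span (here, whitespace) still
           forms a leading group of its own. -->
      <xsl:for-each-group select="div/p[@class='passage']/node()"
                          group-starting-with="span">
        <!-- <group> is a hypothetical wrapper, for illustration only -->
        <group>
          <xsl:copy-of select="current-group()"/>
        </group>
      </xsl:for-each-group>
    </groups>
  </xsl:template>

</xsl:stylesheet>
```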

The only issue might be the text node before the first span, which ends
up in a leading group of its own. Perhaps using
/node()[normalize-space()] is better, or inside of the for-each-group
you need to check whether you have a "real" group starting with a
"span" or have just collected leading text.
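
The second option could look roughly like the sketch below. It assumes the input and the target markup from the question, and it simply discards a leading group that does not start with an itemNum span, which may or may not be what you want:

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <div class="passage_set">
      <xsl:for-each-group select="div/p[@class='passage']/node()"
                          group-starting-with="span[@class='itemNum']">
        <!-- Inside for-each-group the context item is the first item
             of the current group, so this test is false for a leading
             group made up only of text before the first span. -->
        <xsl:if test="self::span[@class='itemNum']">
          <p class="item" itemNum="{.}">
            <b><xsl:value-of select="."/></b>
            <!-- everything in the group after the starting span -->
            <xsl:copy-of select="current-group() except ."/>
          </p>
        </xsl:if>
      </xsl:for-each-group>
    </div>
  </xsl:template>

</xsl:stylesheet>
```

The other option, select="div/p[@class='passage']/node()[normalize-space()]", removes the whitespace-only text node before grouping happens. It also removes any other node whose string value is empty or whitespace only, such as an empty element, so the explicit test may be the safer of the two.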
