Re: [xsl] Splitting a paragraph into sentences and keep markup

Subject: Re: [xsl] Splitting a paragraph into sentences and keep markup
From: "Martin Honnen martin.honnen@xxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Sun, 24 Nov 2019 14:03:50 -0000
On 24.11.2019 at 14:34, Rick Quatro rick@xxxxxxxxxxxxxx wrote:
>
> Hi All,
>
> I have a situation where I want to split a short paragraph into
> sentences and use them in different parts of my output. I am using
> <xsl:analyze-string> because I want to account for a sentence ending
> with a . or ?. This will work except if there are any children of the
> paragraph, like the <emphasis> in the second sentence. Can I split a
> paragraph into sentences and still keep the markup?
>

I think one approach in cases like this is to use two processing steps,
for instance with different modes: the first step inserts a marker
element (e.g. <eos/>) at the end of each sentence, and the second step
groups the child nodes of the paragraph with for-each-group
group-ending-with="eos".

See https://xsltfiddle.liberty-development.net/gWEamLm which does


   <xsl:mode on-no-match="shallow-copy"/>

   <xsl:mode name="insert-marker" on-no-match="shallow-copy"/>

   <xsl:template match="p//text()" mode="insert-marker">
       <xsl:analyze-string select="." regex="[\.\?](\s+|$)">
           <xsl:matching-substring>
               <xsl:value-of select="."/>
               <eos/>
           </xsl:matching-substring>
           <xsl:non-matching-substring>
               <xsl:value-of select="."/>
           </xsl:non-matching-substring>
       </xsl:analyze-string>
   </xsl:template>

   <xsl:template match="eos"/>

   <xsl:template match="p">
       <xsl:variable name="p-with-markers" as="element(p)">
           <xsl:apply-templates select="." mode="insert-marker"/>
       </xsl:variable>
       <xsl:copy>
           <xsl:apply-templates select="@*"/>
           <xsl:for-each-group select="$p-with-markers/node()"
                               group-ending-with="eos">
               <xsl:choose>
                   <xsl:when test="current-group()[last()][self::eos]">
                       <sentence>
                           <xsl:apply-templates select="current-group()"/>
                       </sentence>
                   </xsl:when>
                   <xsl:otherwise>
                       <xsl:apply-templates select="current-group()"/>
                   </xsl:otherwise>
               </xsl:choose>
           </xsl:for-each-group>
       </xsl:copy>
   </xsl:template>
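
To illustrate what the two steps do, here is a made-up input paragraph
(not the one from the original question) together with the result I would
expect from the templates above, ignoring serialization details such as
indentation. Treat it as a sketch rather than tested output.

```xml
<!-- hypothetical input -->
<p>This is the first sentence. Is this the <emphasis>second</emphasis> one? Trailing text</p>

<!-- expected result: each group ending in an eos marker becomes a sentence -->
<p><sentence>This is the first sentence. </sentence><sentence>Is this the <emphasis>second</emphasis> one? </sentence>Trailing text</p>
```

The trailing text has no terminating . or ?, so its group does not end
with an eos marker and is copied without a sentence wrapper. Note that the
grouping only looks at child nodes of the p, so a sentence that ends
inside an inline element such as emphasis would not be split there.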


I haven't quite understood the final step where you want to create a
"note", but perhaps the above helps. It is XSLT 3; in XSLT 2 you need to
replace the two xsl:mode declarations with an explicit identity template
for each of the two modes.
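
A minimal sketch of that XSLT 2 replacement, assuming the rest of the
stylesheet stays as shown above and only the two xsl:mode declarations
are removed:

```xml
<!-- XSLT 2.0 stand-in for both xsl:mode declarations: one identity
     template serving the default mode and the insert-marker mode -->
<xsl:template match="@* | node()" mode="#default insert-marker">
    <xsl:copy>
        <!-- #current continues in whichever mode is currently active -->
        <xsl:apply-templates select="@* | node()" mode="#current"/>
    </xsl:copy>
</xsl:template>
```

The more specific templates for p//text(), eos and p have a higher
default priority than this catch-all pattern, so they still take
precedence in their respective modes.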
