Subject: [xsl] Splitting a paragraph into sentences and keep markup
From: "Rick Quatro rick@xxxxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Sun, 24 Nov 2019 13:34:26 -0000
Hi All,

I have a situation where I want to split a short paragraph into sentences and use them in different parts of my output. I am using <xsl:analyze-string> because I want to account for a sentence ending with a . or ?. This works unless the paragraph has child elements, like the <emphasis> in the second sentence. Can I split a paragraph into sentences and still keep the markup?

Here is my input document:

    <?xml version="1.0" encoding="UTF-8"?>
    <root>
        <p>This has one sentence? Actually, it has <emphasis>two</emphasis>. No, it has three.</p>
    </root>

My stylesheet:

    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:xs="http://www.w3.org/2001/XMLSchema"
        xmlns:rq="http://www.frameexpert.com"
        exclude-result-prefixes="xs rq"
        version="2.0">

        <xsl:output indent="yes"/>
        <xsl:strip-space elements="root"/>

        <xsl:template match="/root">
            <xsl:copy>
                <xsl:apply-templates/>
            </xsl:copy>
        </xsl:template>

        <xsl:template match="p">
            <xsl:variable name="sentences" select="rq:splitParagraphIntoSentences(.)"/>
            <p><xsl:value-of select="$sentences[1]"/></p>
            <note>Something in between.</note>
            <p><xsl:value-of select="$sentences[position()>1]"/></p>
        </xsl:template>

        <xsl:function name="rq:splitParagraphIntoSentences">
            <xsl:param name="paragraph"/>
            <xsl:analyze-string select="$paragraph" regex=".+?[\.\?](\s+|$)">
                <xsl:matching-substring>
                    <sentence><xsl:value-of select="replace(.,'\s+$','')"/></sentence>
                </xsl:matching-substring>
            </xsl:analyze-string>
        </xsl:function>

    </xsl:stylesheet>

My output:

    <?xml version="1.0" encoding="UTF-8"?>
    <root>
        <p>This has one sentence?</p>
        <note>Something in between.</note>
        <p>Actually, it has two. No, it has three.</p>
    </root>

What I want is this:

    <?xml version="1.0" encoding="UTF-8"?>
    <root>
        <p>This has one sentence? </p>
        <note>Something in between.</note>
        <p>Actually, it has <emphasis>two</emphasis>. No, it has three. </p>
    </root>

Any suggestions will be appreciated.

Rick
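The markup disappears because <xsl:analyze-string> works on the string value of the paragraph, so child elements are flattened to text before the regex ever runs. One possible way around this, offered only as a sketch and not as a tested answer from the thread, is a two-pass approach: first copy the paragraph's children while dropping a marker element after each sentence end found in its direct text children, then group the result with group-ending-with. The marker name rq:break and the mode name mark are hypothetical, introduced here for illustration. The sketch assumes that sentence boundaries occur only in text nodes that are direct children of <p>, and that every . or ? followed by whitespace (or by the end of a text node) really ends a sentence, so abbreviations or a period sitting right before an inline element would be split incorrectly.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:rq="http://www.frameexpert.com"
    exclude-result-prefixes="rq"
    version="2.0">

    <xsl:output indent="yes"/>
    <xsl:strip-space elements="root"/>

    <xsl:template match="/root">
        <xsl:copy>
            <xsl:apply-templates/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="p">
        <!-- Pass 1: copy the children, adding a (hypothetical) rq:break
             marker after each sentence end in direct text children. -->
        <xsl:variable name="marked">
            <xsl:apply-templates select="node()" mode="mark"/>
        </xsl:variable>

        <!-- Pass 2: each group runs up to and including a marker;
             the marker itself is left out of the copied sentence. -->
        <xsl:variable name="sentences" as="element(sentence)*">
            <xsl:for-each-group select="$marked/node()" group-ending-with="rq:break">
                <sentence>
                    <xsl:copy-of select="current-group()[not(self::rq:break)]"/>
                </sentence>
            </xsl:for-each-group>
        </xsl:variable>

        <!-- copy-of (rather than value-of) keeps elements such as emphasis. -->
        <p><xsl:copy-of select="$sentences[1]/node()"/></p>
        <note>Something in between.</note>
        <p><xsl:copy-of select="$sentences[position() gt 1]/node()"/></p>
    </xsl:template>

    <!-- Direct text children of p: emit the text unchanged and place a
         marker after each . or ? that is followed by whitespace or the
         end of the text node. -->
    <xsl:template match="p/text()" mode="mark">
        <xsl:analyze-string select="." regex="[.?](\s+|$)">
            <xsl:matching-substring>
                <xsl:value-of select="."/>
                <rq:break/>
            </xsl:matching-substring>
            <xsl:non-matching-substring>
                <xsl:value-of select="."/>
            </xsl:non-matching-substring>
        </xsl:analyze-string>
    </xsl:template>

    <!-- Everything else (elements, comments, ...) is copied as is. -->
    <xsl:template match="node()" mode="mark">
        <xsl:copy-of select="."/>
    </xsl:template>

</xsl:stylesheet>
```

The key difference from the stylesheet above is that the sentences are built from nodes rather than from a single string, and are emitted with xsl:copy-of, so inline elements survive. Trailing whitespace after each sentence is kept as it appears in the source; trimming it would need an extra step.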