Subject: Re: [xsl] segmenting a paragraph
From: "G. Ken Holman" <gkholman@xxxxxxxxxxxxxxxxxxxx>
Date: Tue, 02 Oct 2007 10:34:59 +0200
In trying to solve the following problem I am seeking your help:
I want to segment the paragraphs in a text so that each sentence is enclosed in an <s> element and, within each sentence, the runs of words between punctuation marks are enclosed in <seg> elements.
So far, I have been capturing the content of <p> in a string and then using two nested <xsl:analyze-string> blocks with regexes, which work nicely and do what I want. I have now discovered that some paragraphs contain <note> elements with additional markup, and these get lost in the process. I really want to leave these notes alone, exactly as they are. So:
<p>Some text. Some more text, with a comma. <note>This stuff, how boring</note></p>
should look like:
<p><s><seg>Some text.</seg></s><s><seg>Some more text,</seg><seg> with a comma.</seg></s><note>This stuff, how boring</note></p>
How do I tell the processor to leave the notes alone?
<xsl:template match="p">
  <xsl:analyze-string select="." .....
</xsl:template>
<xsl:template match="p">
  <xsl:apply-templates mode="in-p" select="node()"/>
</xsl:template>

<xsl:template mode="in-p" match="*">
  <xsl:apply-templates select="."/> <!--reapply in the default mode-->
</xsl:template>

<xsl:template mode="in-p" match="text()">
  <xsl:analyze-string select="." .....
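For reference, here is one way the fragments above might be filled out into a complete XSLT 2.0 stylesheet. This is only a sketch under assumptions that are not in the original post: the two regular expressions (sentences end at `.`, `!` or `?`; segments end at `,`, `;` or `:`), the identity template used to copy <note> and other markup, and the <xsl:copy> that keeps the <p> wrapper are all illustrative choices, not the poster's actual code.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: identity template, so that <note> and any other
       markup is copied through unchanged in the default mode. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Keep the <p> wrapper and push its children through mode "in-p". -->
  <xsl:template match="p">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <xsl:apply-templates mode="in-p" select="node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Child elements of <p> (such as <note>) are not segmented:
       they are reapplied in the default mode. -->
  <xsl:template mode="in-p" match="*">
    <xsl:apply-templates select="."/>
  </xsl:template>

  <!-- Only text nodes that are direct children of <p> are segmented. -->
  <xsl:template mode="in-p" match="text()">
    <!-- Assumed sentence pattern: optional leading whitespace, then
         text up to and including the closing . ! or ? characters.
         Group 1 is the sentence without the leading whitespace. -->
    <xsl:analyze-string select="." regex="\s*([^.!?]+[.!?]+)">
      <xsl:matching-substring>
        <s>
          <!-- Assumed segment pattern: text up to and including
               any following , ; or : characters. -->
          <xsl:analyze-string select="regex-group(1)"
                              regex="[^,;:]+[,;:]*">
            <xsl:matching-substring>
              <seg><xsl:value-of select="."/></seg>
            </xsl:matching-substring>
            <xsl:non-matching-substring>
              <xsl:value-of select="."/>
            </xsl:non-matching-substring>
          </xsl:analyze-string>
        </s>
      </xsl:matching-substring>
      <!-- Anything that is not a complete sentence (trailing
           whitespace, or text with no terminator) is copied as is. -->
      <xsl:non-matching-substring>
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>

</xsl:stylesheet>
```

Because each text node is analyzed on its own, a sentence that is interrupted by a <note> is not wrapped as a single <s>; the part without a terminator is simply copied. Whitespace between sentences is consumed by the leading `\s*` and dropped, and the simple patterns also split at abbreviations and decimal points. Comments and processing instructions directly inside <p> are not matched by the "in-p" templates and would be dropped unless a template is added for them.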
--
Upcoming public training: UBL and code lists Oct 1/5; Madrid Spain
World-wide corporate, govt. & user group XML, XSL and UBL training
RSS feeds: publicly-available developer resources and training
G. Ken Holman mailto:gkholman@xxxxxxxxxxxxxxxxxxxx
Crane Softwrights Ltd. http://www.CraneSoftwrights.com/s/
Box 266, Kars, Ontario CANADA K0A-2E0 +1(613)489-0999 (F:-0995)
Male Cancer Awareness Jul'07 http://www.CraneSoftwrights.com/s/bc
Legal business disclaimers: http://www.CraneSoftwrights.com/legal