Subject: [xsl] Collect word count with xslt2.0 on saxon 8
From: Karen McAdams <kemcadams@xxxxxxxxx>
Date: Mon, 15 May 2006 16:48:12 -0700 (PDT)

I have the following structure, and I need to collect a word count for each element whose class attribute contains " topic/topic ", without counting the words in its child elements that also have a class attribute containing " topic/topic ".

<root>
  <topic class=" topic/topic foo/bar ">
    <p> communications and information theory</p>
    <title> top element</title>
    <relinfo> elements can be nested</relinfo>
    Generalized Markup Language defined by ISO 8879.
    <concept class=" topic/topic foo/bar ">
      <p> communications and information theory</p>
      <title> top element</title>
      <relinfo> elements can be nested</relinfo>
      (for a number of technical reasons beyond the scope of this article).
      <topic class=" topic/topic foo/bar ">
        <p> communications and information theory</p>
        <title> top element</title>
        <relinfo> elements can be nested</relinfo>
        maintain repositories of structured documentation for more than a decade, but it is not well
        <concept class=" topic/topic foo/bar ">
          But the metrics for XML on the Web
          <p> communications and information theory</p>
          <title> top element</title>
          <relinfo> elements can be nested</relinfo>
          measures, or are a little polluted by voodoo ideology about good
        </concept>
      </topic>
    </concept>
  </topic>
</root>

I have this template, which gets the word count for each element together with its child elements, including the child elements whose class attribute contains " topic/topic ".

<xsl:template match="*[contains(@class, 'topic/topic ')]">
  <xsl:variable name="level"
    select="count(ancestor::*[contains(@class, 'topic/topic ')]) + 1"/>
  <xsl:variable name="ct" select="if ($level = 1) then concat(title,' ') else ' '"/>
  <xsl:variable name="h1" select="if ($level = 2) then concat(title,' ') else ' '"/>
  <xsl:variable name="h2" select="if ($level = 3) then concat(title,' ') else ' '"/>
  <xsl:variable name="h3" select="if ($level = 4) then concat(title,' ') else ' '"/>
  <xsl:variable name="wc"
    select="count(tokenize(lower-case(.),
                  '(\s|[,.!:;]|[n][b][s][p][;])+')[string(.)])"/>
  <xsl:apply-templates/>
</xsl:template>

I added another template, in a separate mode, that produces the word count of each child element:

<xsl:template match="*[contains(@class, 'topic/topic ')]" mode="filterCount">
  <sum>
    <xsl:value-of
      select="count(tokenize(lower-case(.),
                    '(\s|[,.!:;]|[n][b][s][p][;])+')[string(.)])"/>
  </sum>
</xsl:template>

I store its output in a variable and then subtract it from the total within the first template above:

<xsl:variable name="childcounts">
  <sums>
    <xsl:apply-templates mode="filterCount"/>
  </sums>
</xsl:variable>
<xsl:variable name="total-child" select="sum($childcounts/sums/sum)"/>
<xsl:variable name="total-roman" select="sum($wc - $total-child)"/>

I would like to find a more elegant approach to this, because there are also other attributes in this content that need to have the same technique applied.

Would it be a better approach to copy the elements to another document node and then perform the word count, which would be applied recursively to all child elements to arrive at the count? And what would this template match look like?
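
One possible simplification, offered as a sketch and not as a tested answer: instead of counting everything and subtracting the nested counts, select only the text nodes whose nearest ancestor carrying the marker is the current element, and count the words in those. The namespace urn:example:word-count, the functions f:word-count and f:own-text, and the <counts>/<count> output elements are all hypothetical names introduced for illustration; none of them come from the original stylesheet. The sketch also assumes the marker is matched with its leading space (' topic/topic ') and drops the nbsp alternative from the regular expression for brevity.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:word-count"
    exclude-result-prefixes="xs f">

  <xsl:output method="xml" indent="yes"/>

  <!-- Hypothetical helper: number of non-empty tokens in a string,
       split on whitespace and basic punctuation. -->
  <xsl:function name="f:word-count" as="xs:integer">
    <xsl:param name="text" as="xs:string"/>
    <xsl:sequence
      select="count(tokenize(lower-case($text), '(\s|[,.!:;])+')[. ne ''])"/>
  </xsl:function>

  <!-- Hypothetical helper: descendant text nodes of $scope whose nearest
       ancestor with $marker in @class is $scope itself, so text owned by
       a nested matching element is left out. -->
  <xsl:function name="f:own-text" as="text()*">
    <xsl:param name="scope" as="element()"/>
    <xsl:param name="marker" as="xs:string"/>
    <xsl:sequence
      select="$scope//text()[ancestor::*[contains(@class, $marker)][1] is $scope]"/>
  </xsl:function>

  <!-- Report one <count> per matching element, at any depth. -->
  <xsl:template match="/">
    <counts>
      <xsl:apply-templates select="//*[contains(@class, ' topic/topic ')]"/>
    </counts>
  </xsl:template>

  <xsl:template match="*[contains(@class, ' topic/topic ')]">
    <count element="{name()}"
           level="{count(ancestor-or-self::*[contains(@class, ' topic/topic ')])}"
           words="{f:word-count(string-join(f:own-text(., ' topic/topic '), ' '))}"/>
  </xsl:template>

</xsl:stylesheet>
```

Because the marker is a parameter of f:own-text, the same function could be reused for other class tokens; applying it to a different attribute would need the @class test parameterised in a similar way. Note that the text nodes are joined with a single space, so a word that straddles inline markup would be counted as two words; joining with an empty string would avoid that but could merge words across adjacent elements that have no whitespace between them.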