[xsl] Linenumbering & word index

Subject: [xsl] Linenumbering & word index
From: James Cummings <James.Cummings@xxxxxxxxxxxxxx>
Date: Thu, 5 Aug 2004 17:21:13 +0100 (BST)
Ok, following the helpful advice on line numbering 
for displaying poems, I also want to create a 
word index to these same poems.  

So again given something like:
<body>
<header>
   <title>A poem that should really, certainly, be included</title>
</header>
<div type="poem">
<head>headers should be included in word index</head>
<lg>
<l>This is a line that should be included</l>
<l>This is a line that should be included</l>
</lg>
<lg>
<l>This is a line that really should be included</l>
<l>This is a line that should be included</l>
</lg>
</div>
<div type="poem">
<head>headers should certainly be included in word index</head>
<lg>
<l>This is a line that really should be included</l>
<l>This is a line that <supplied>should</supplied> certainly be included</l>
</lg>
<lg>
<l>This is a line that really should be included</l>
<!-- etc -->
</lg>
</div>
</body>

What I want to output is an index of all the words inside 
<l> and <head>, giving each word's count followed by the poem 
number and line number of every occurrence, so something like:
------------
certainly (2): 2:head, 2:2.
really (3): 1:3, 2:1, 2:3.
should (9): 1:head, 1:1, 1:2, 1:3, 1:4, 2:head, 2:1, 2:2, 2:3.
-----------
(Well, really I'll do an XML version, but you get the picture.)

Now, I had done a word-frequency-list-of-entire-file before 
by using:

---------- 
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:template match="/">
    <!-- translate() turns commas into spaces and deletes . ! : ; -->
    <!-- then lower-case, split on whitespace and drop empty tokens -->
    <xsl:for-each-group select="tokenize(lower-case(string(translate(.,',.!:;',' '))),'\s+')[string(.)]" group-by=".">
      <xsl:sort />[<xsl:value-of select="."/> - <xsl:value-of select="count(current-group())"/>]
    </xsl:for-each-group>
  </xsl:template>
</xsl:stylesheet>
---------- 

But I can't see how to get each word's position (poem number and 
line number) whilst tokenizing the whole lot. Everything I have 
tried so far fails.
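
One direction that might work, given here only as an untested sketch:
rather than tokenizing the whole document at once, visit each <head>
and <l> first, tokenize each one separately, and tag every word with
its poem and line reference before grouping. The <w> element and its
poem/ref attributes are hypothetical names made up for this sketch, and
it assumes that line numbers run continuously through each
<div type="poem"> across <lg> boundaries, as in the sample output above.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Pass 1: one temporary <w> per word, carrying poem number and
         line reference. <w>, @poem and @ref are hypothetical names. -->
    <xsl:variable name="words" as="element(w)*">
      <xsl:for-each select="//div[@type='poem']">
        <xsl:variable name="poem" select="position()"/>
        <xsl:variable name="div" select="."/>
        <xsl:for-each select="head | .//l">
          <!-- 'head' for the heading, otherwise the line's position
               among all <l> elements of this poem -->
          <xsl:variable name="ref"
            select="if (self::head) then 'head'
                    else string(count($div//l[. &lt;&lt; current()]) + 1)"/>
          <!-- Replace each punctuation character with a space, then split -->
          <xsl:for-each select="tokenize(lower-case(translate(string(.),
                                  ',.!:;', '     ')), '\s+')[. ne '']">
            <w poem="{$poem}" ref="{$ref}"><xsl:value-of select="."/></w>
          </xsl:for-each>
        </xsl:for-each>
      </xsl:for-each>
    </xsl:variable>

    <!-- Pass 2: group the tagged words and list where each one occurs -->
    <xsl:for-each-group select="$words" group-by="string(.)">
      <xsl:sort select="current-grouping-key()"/>
      <xsl:value-of select="current-grouping-key()"/>
      <xsl:text> (</xsl:text>
      <xsl:value-of select="count(current-group())"/>
      <xsl:text>): </xsl:text>
      <xsl:value-of separator=", "
        select="for $w in current-group()
                return concat($w/@poem, ':', $w/@ref)"/>
      <xsl:text>.&#10;</xsl:text>
    </xsl:for-each-group>
  </xsl:template>
</xsl:stylesheet>
```

Because the poem and line context is captured before grouping, the
grouping step no longer needs to know where a word came from. A word
that occurs twice in one line would be listed twice with the same
reference; whether that is wanted is a design choice this sketch leaves
open.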

Suggestions?

-James
---
Dr James Cummings, Oxford Text Archive, University of Oxford
James dot Cummings at oucs dot ox dot ac dot uk 
