Subject: Re: [xsl] Linenumbering & word index
From: James Cummings <James.Cummings@xxxxxxxxxxxxxx>
Date: Fri, 6 Aug 2004 14:41:24 +0100 (BST)
On Fri, 6 Aug 2004, David Carlisle wrote:

>
> I lost or forgot the start of this thread so I'll ignore your main
> questions but I can answer one of the questions in comments

Right, I'll start from the beginning again then.

In a document with a lot of poems laid out as:

<div type="poem">
  <head>headers should be included in word index</head>
  <lg>
    <l>This is a line that really should be included</l>
    <l>This is a line that should be included</l>
  </lg>
  <p>This shouldn't be included</p>
  <lg>
    <l>This is a line that really should be included</l>
    <l>This is a line that should be included</l>
  </lg>
</div>

What I want to produce is a word-index of poem number and line number,
something like:

a (4) -- 1:1, 1:2, 1:3, 1:4, 2:3, 2:5 (well, no poem 2 here ;-) )
be (5) -- 1:head, 1:1, 1:2, 1:3, 1:4
...
really (2) -- 1:1, 1:3, 2:1, 2:3 (if it was in poem 2 as well)

I had previously done word frequency lists as:

-------
<xsl:template match="/">
  <xsl:for-each-group
      select="tokenize(lower-case(string(translate(.,',.!:;',' '))),'\s+')[string(.)]"
      group-by=".">
    <xsl:sort />[<xsl:value-of select="."/> - <xsl:value-of select="count(current-group())"/>]
  </xsl:for-each-group>
</xsl:template>
------

And Mike suggested I first build a temporary tree something like:

<xsl:variable name="words">
  <xsl:for-each select="tokenize(., '\s+')">
    <word value="{.}" position="{position()}"/>
  </xsl:for-each>

But I don't see how I
a) tokenize only the output of l/text() and head/text() (it complains of
   multiple inputs when I do so), and
b) how I get line-number and poem-number based on position()?
--------------
My completely messed up xsl so far is:

<xsl:template match="l/text()">
  <xsl:for-each-group select="$words" group-by=".">
    <xsl:sort/>
    <xsl:value-of select="word/@value"/> --
    <xsl:for-each select="current-group()">
      <a href="#{concat('poem',@poemnumber,'line',@linenumber)}">
        <xsl:value-of select="@poemnumber"/>:<xsl:value-of select="@linenumber"/></a>
    </xsl:for-each>
  </xsl:for-each-group>
</xsl:template>

<xsl:variable name="words">
  <xsl:for-each
      select="tokenize(lower-case(string(translate(.,',.!:;',' '))),'\s+')[string(.)]">
    <!-- How do I only match text in 'head' and 'l' elements? -->
    <xsl:variable name="poemnumber">
      <!-- How do I get poem number here? i.e. xsl:number count="div[@type='poem'] when I was matching 'l' " -->
    </xsl:variable>
    <xsl:variable name="linenumber">
      <!-- How do I get line number here? i.e. xsl:number from="div[@type='poem'] when I was matching 'l'-->
    </xsl:variable>
    <word value="{.}" litposition="{position()}" poemnumber="$poemnumber" linenumber="$linenumber"/>
  </xsl:for-each>
</xsl:variable>

<!-- some of the things I don't want to match -->
<xsl:template match="teiHeader|foreign|p|milestone|gap" priority="-1" />
------------------

Does that clarify my confuddled state of mind?

-James
---
Dr James Cummings, Oxford Text Archive, University of Oxford
James dot Cummings at oucs dot ox dot ac dot uk
CALL FOR PAPERS: Digital Medievalism (Kalamazoo) and Early Drama (Leeds)
see http://users.ox.ac.uk/~jamesc/cfp.html
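
For what it's worth, questions (a) and (b) above might be approached by walking the poems and their head/l elements first and tokenizing each one separately, so that the poem and line are already known when each word is written to the temporary tree. The following is only a sketch under assumptions, not a tested answer from the thread: it assumes an XSLT 2.0 processor, that every lg sits directly inside its div, and that a poem's number is simply its position among all div[@type='poem'] elements in document order. The attribute names poem and line are made up for the illustration, and replace() is swapped in for translate(), because translate(., ',.!:;', ' ') turns only the comma into a space and deletes the other four characters.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Temporary tree: one <word> per token, labelled with the poem number
       and the line label of the head or l it came from.
       The attribute names poem and line are hypothetical. -->
  <xsl:variable name="words">
    <xsl:for-each select="//div[@type='poem']">
      <xsl:variable name="poem" select="position()"/>
      <!-- Only head and l are visited, so p never reaches tokenize().
           Assumption: lg elements are direct children of the poem div. -->
      <xsl:for-each select="head | lg/l">
        <xsl:variable name="line">
          <xsl:choose>
            <xsl:when test="self::head">head</xsl:when>
            <xsl:otherwise>
              <!-- Counts l elements since the start of the enclosing poem -->
              <xsl:number level="any" count="l" from="div[@type='poem']"/>
            </xsl:otherwise>
          </xsl:choose>
        </xsl:variable>
        <!-- One element at a time, so tokenize() gets a single string -->
        <xsl:for-each
            select="tokenize(lower-case(replace(., '[,.!:;]', ' ')), '\s+')[.]">
          <word value="{.}" poem="{$poem}" line="{$line}"/>
        </xsl:for-each>
      </xsl:for-each>
    </xsl:for-each>
  </xsl:variable>

  <!-- Index: word (occurrence count) - - poem:line, poem:line, ... -->
  <xsl:template match="/">
    <xsl:for-each-group select="$words/word" group-by="@value">
      <xsl:sort select="current-grouping-key()"/>
      <xsl:value-of select="current-grouping-key()"/>
      <xsl:text> (</xsl:text>
      <xsl:value-of select="count(current-group())"/>
      <xsl:text>) -- </xsl:text>
      <!-- Inner grouping lists each poem:line reference once -->
      <xsl:for-each-group select="current-group()"
          group-by="concat(@poem, ':', @line)">
        <xsl:if test="position() gt 1">, </xsl:if>
        <xsl:value-of select="current-grouping-key()"/>
      </xsl:for-each-group>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each-group>
  </xsl:template>

</xsl:stylesheet>
```

The "multiple inputs" complaint in (a) presumably comes from handing tokenize() a sequence of several text nodes at once; iterating over head | lg/l and tokenizing inside the loop avoids that. Because the string value of each l is used, text inside child elements such as foreign or gap would still be indexed here and would need filtering separately.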