Subject: Re: [xsl] Linenumbering & word index
From: James Cummings <James.Cummings@xxxxxxxxxxxxxx>
Date: Fri, 6 Aug 2004 17:39:34 +0100 (BST)

On Fri, 6 Aug 2004, David Carlisle wrote:

> You can't do
>   tokenize(l/text(), '\s+')
> because it wants a single string as its first argument and that's
> probably more than one.

Yup. And that's one of the places I was getting confuddled. :-(

> You can do
>   select="for $l in l return tokenize($l,'\s+')"
> or same with for-each and tokenize them one at a time.

ok, I think I understand that, and might work for smaller things.

> however you really want to make yourself a tree first something like:

Let's see if I understand the way this works. (I do like getting
solutions, but also want to learn ;-) )

> <xsl:template match="/">
>   <xsl:variable name="x">
>     <xsl:apply-templates mode="a" select="div[@type='poem']"/>
>   </xsl:variable>

Creates variable $x from the templates of mode a below for only the
poem divs. (See, now *that* is how to avoid the stuff I don't want to
include.. *doh*)

>   [
>   <xsl:copy-of select="$x"/>
>   ]

Copy of the temporary tree listing each poem, and word in line for that
poem.

>   <xsl:for-each-group select="$x/div/l/word" group-by=".">

Groups by each word in the temporary tree and sorts them outputting the
word

>     <xsl:sort />
>     <xsl:text> </xsl:text>
>     <xsl:value-of select="."/>

then for each instance of a word (keys always confuse me) it outputs
the @poem and @n line numbers.

>     <xsl:for-each select="key('w',.)">
>       <xsl:text> </xsl:text>
>       <xsl:value-of select="../../@poem"/>:<xsl:value-of select="../@n"/>
>     </xsl:for-each>
>   </xsl:for-each-group>
> </xsl:template>

Applies the original mode a match for divs only to head and lg/l
(modes...yes, must use modes more.)

> <xsl:template mode="a" match="div">
>   <div poem="{position()}">
>     <xsl:apply-templates mode="a" select="head"/>
>     <xsl:apply-templates mode="a" select="lg/l"/>
>   </div>
> </xsl:template>

When you find a head, tokenize it into a temporary tree of <word>
elements

> <xsl:template mode="a" match="head">
>   <l n="head">
>     <xsl:for-each select="tokenize(.,'(\s|[,\.!])+')">
>       <word><xsl:value-of select="lower-case(.)"/></word>
>     </xsl:for-each>
>   </l>
> </xsl:template>

When you find a l tokenize it into a temporary tree of <word> elements,
recording the line's position

> <xsl:template mode="a" match="l">
>   <l n="{position()}">
>     <xsl:for-each select="tokenize(.,'\s+')">
>       <word><xsl:value-of select="."/></word>
>     </xsl:for-each>
>   </l>
> </xsl:template>

For each <word> element that we've just created make a key of name w.

> <xsl:key name="w" match="word" use="."/>

Seems to work absolutely perfectly. (well, I'll customise the tokenize
string...)

Many many thanks.

-James

---
Dr James Cummings, Oxford Text Archive, University of Oxford
James dot Cummings at oucs dot ox dot ac dot uk
CALL FOR PAPERS: Digital Medievalism (Kalamazoo) and Early Drama (Leeds)
see http://users.ox.ac.uk/~jamesc/cfp.html
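
The two-pass approach walked through above (build a temporary tree of <word> elements, then group it) can be sketched as one self-contained XSLT 2.0 stylesheet. This is only an illustration under stated assumptions, not the stylesheet from the thread: the mode name "index", the variable name $words, the `//div[@type='poem']` selection, and the `[^\p{L}]+` separator are all choices made for this sketch, and the input is assumed to be in no namespace. It also swaps the key('w', .) lookup for current-group(), which returns the same <word> nodes inside xsl:for-each-group without needing an xsl:key declaration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Pass 1: temporary tree of div/l/word built from the poem divs only.
         Assumption: poems are div[@type='poem'] anywhere in the input. -->
    <xsl:variable name="words">
      <xsl:apply-templates mode="index" select="//div[@type='poem']"/>
    </xsl:variable>

    <!-- Pass 2: one group per distinct word, sorted alphabetically. -->
    <xsl:for-each-group select="$words/div/l/word" group-by=".">
      <xsl:sort select="current-grouping-key()"/>
      <xsl:value-of select="current-grouping-key()"/>
      <!-- current-group() holds every <word> node sharing this value. -->
      <xsl:for-each select="current-group()">
        <xsl:text> </xsl:text>
        <xsl:value-of select="../../@poem"/>
        <xsl:text>:</xsl:text>
        <xsl:value-of select="../@n"/>
      </xsl:for-each>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each-group>
  </xsl:template>

  <!-- Number each poem by its position among the selected divs. -->
  <xsl:template mode="index" match="div">
    <div poem="{position()}">
      <xsl:apply-templates mode="index" select="lg/l"/>
    </div>
  </xsl:template>

  <!-- Number each line by its position among the poem's lines, then
       split it into lower-cased words. The separator (any run of
       non-letters) is an assumption; the predicate drops the empty
       tokens produced by leading or trailing separators. -->
  <xsl:template mode="index" match="l">
    <l n="{position()}">
      <xsl:for-each select="tokenize(lower-case(.), '[^\p{L}]+')[. != '']">
        <word><xsl:value-of select="."/></word>
      </xsl:for-each>
    </l>
  </xsl:template>

</xsl:stylesheet>
```

Each output line is a word followed by its poem:line references. Because tokenize() is called once per <l> element, it always receives a single string, which sidesteps the "more than one item" problem raised at the top of the thread. The non-letter separator also splits on apostrophes, so a word like "don't" would be indexed as two tokens; the pattern would need adjusting for texts where that matters.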