Subject: [xsl] Re: Re: text() word lists
From: "Dimitre Novatchev" <dnovatchev@xxxxxxxxx>
Date: Sun, 8 Feb 2004 21:30:18 +0100

Thank you Mike and David,

Both stylesheets perform extremely well. I ran them on the complete XML
version of Hamlet, and the results are 657 milliseconds for David's
transformation and 781 milliseconds for Mike's.

Even allowing for my computer (3 GHz, 2 GB RAM), these results are fantastic.

I think these XSLT 2.0 examples completely dispel the myth that XSLT is
not to be used for (efficient) text processing.

Cheers,

Dimitre Novatchev.
FXSL developer, http://fxsl.sourceforge.net/ -- the home of FXSL
Resume: http://fxsl.sf.net/DNovatchev/Resume/Res.html

"Michael Kay" <mhk@xxxxxxxxx> wrote in message
news:000001c3ee5d$35306880$6401a8c0@xxxxxxxxxx
> Sorry for the buggy code. Here is a working version:
>
> <xsl:stylesheet version="2.0"
>     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
>   <xsl:output indent="yes"/>
>   <xsl:template match="/">
>     <frequencies>
>       <xsl:for-each-group group-by="." select="
>           for $w in tokenize(string(.), '[\s.?!,]+')[.] return lower-case($w)">
>         <xsl:sort select="count(current-group())" order="descending"/>
>         <word><xsl:value-of select="current-grouping-key(), ' - ',
>             count(current-group())"/></word>
>       </xsl:for-each-group>
>     </frequencies>
>   </xsl:template>
> </xsl:stylesheet>
>
> (The predicate [.] eliminates the zero-length string)
>
> Here's the start of the output for othello.xml:
>
> <?xml version="1.0" encoding="UTF-8"?>
> <frequencies>
> <word>i - 816</word>
> <word>and - 794</word>
> <word>the - 762</word>
> <word>to - 591</word>
> <word>of - 476</word>
> <word>you - 458</word>
> <word>a - 445</word>
> <word>my - 427</word>
> <word>that - 368</word>
> <word>iago - 351</word>
> <word>in - 336</word>
> <word>othello - 323</word>
> <word>not - 313</word>
> <word>it - 306</word>
> <word>is - 286</word>
> <word>me - 256</word>
> <word>cassio - 236</word>
> <word>for - 234</word>
> <word>with - 222</word>
> <word>be - 220</word>
> <word>he - 220</word>
> <word>this - 217</word>
> <word>desdemona - 217</word>
> <word>but - 217</word>
> <word>do - 212</word>
> <word>your - 207</word>
> <word>have - 203</word>
> <word>her - 202</word>
> <word>what - 178</word>
> <word>him - 171</word>
> <word>his - 166</word>
> <word>as - 166</word>
> <word>she - 155</word>
> <word>so - 151</word>
> <word>will - 146</word>
> <word>o - 143</word>
> <word>thou - 142</word>
> <word>if - 137</word>
> <word>emilia - 136</word>
> <word>by - 112</word>
>
> Michael Kay
>
>> > Sorted by descending frequency:
>> >
>> > <xsl:for-each-group select="
>> >   for $w in tokenize(string(foo), "[\s.?!]*") return lower-case($w)">
>> >   <xsl:sort select="count(current-group())" order="descending"/>
>> >   <xsl:value-of select="current-grouping-key(), ' - ',
>> >     count(current-group())"/> </xsl:for-each>
>>
>> Sorry, but cannot make this work.
>>
>> First had to remove the nested quotes. Then to change the ending tag.
>>
>> Now I get the message:
>>
>> "Error at xsl:for-each-group on line 10 of file:/(Untitled):
>> Exactly one of the attributes group-by, group-adjacent,
>> group-starting-with, and group-ending-with must be specified"
>>
>> Probably this is something trivial, but this is the first
>> time I'm trying an XSLT 2.0 grouping example.
>>
>> Cheers,
>>
>> Dimitre Novatchev.
>> FXSL developer, http://fxsl.sourceforge.net/ -- the home of FXSL
>> Resume: http://fxsl.sf.net/DNovatchev/Resume/Res.html

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list