Subject: [xsl] distinct-values() optimization, sorting by frequency
From: "James Cummings" <cummings.james@xxxxxxxxx>
Date: Fri, 8 Feb 2008 14:27:56 +0000
Hiya,

I'm wondering the best way to optimize a distinct-values() based transformation. What I'm basically doing is:

======
<xsl:variable name="docs" select="collection('../../working/xml/files.xml')"/>

<xsl:template name="main" >
  <xsl:variable name="persNames" select="$docs//tei:text//tei:persName"/>
  <xsl:variable name="norm-persNames" select="$persNames/normalize-space(lower-case(.))"/>
  <xsl:variable name="distinct-persNames" select="distinct-values($norm-persNames)"/>
  <!-- I realize that I could be more specific on the $persNames variable,
       but doing so doesn't seem to affect speed much at all. -->
  <div type="main">
    <!-- Some overall counts -->
    <div><head>Overall Counts</head>
      <list type="unordered">
        <item>Number of <gi>persName</gi> elements total:
          <xsl:value-of select="count($persNames)"/></item>
        <item>Number of <gi>persName</gi> elements which have a @key attribute total:
          <xsl:value-of select="count($persNames[@key])"/></item>
        <item>Number of distinct-value <gi>persName</gi> elements total:
          <xsl:value-of select="count($distinct-persNames)"/></item>
      </list></div>
    <!-- An Alphabetical List -->
    <div><head>Alphabetical List</head>
      <list type="unordered">
        <xsl:for-each select="$distinct-persNames">
          <xsl:sort select="."/>
          <xsl:variable name="current-name" select="."/>
          <xsl:variable name="count-distinct-current-name"
            select="count($persNames[normalize-space(lower-case(.)) =$current-name])"/>
          <item><xsl:value-of select="concat($current-name, ' -- ', $count-distinct-current-name)"/></item>
        </xsl:for-each>
      </list>
    </div>
    <!-- A Frequency Sorted List -->
    <div>
      <head>Frequency List</head>
      <list type="unordered">
        <xsl:for-each select="$distinct-persNames">
          <xsl:sort select="count($persNames[normalize-space(lower-case(.)) = .])"/>
          <!-- I think it is this sort statement which slows things down,
               since I have to repeat it twice. -->
          <xsl:variable name="current-name" select="."/>
          <xsl:variable name="count-distinct-current-name"
            select="count($persNames[normalize-space(lower-case(.)) = $current-name])"/>
          <item><xsl:value-of select="concat($count-distinct-current-name, ' -- ', $current-name)"/>
          </item>
        </xsl:for-each>
      </list>
    </div>
  </div>
======

I think the real slow-down comes in the second xsl:for-each where I want to sort by frequency of distinct-value by doing:

<xsl:sort select="count($persNames[normalize-space(lower-case(.)) = .])"/>

I have to have it for the sort, and then I have to re-do it for the output inside the <item> element. I'm obviously not allowed a variable between the for-each and the sort... but I have a feeling I'm missing some clever optimization here.

Although this is for a pre-generated transformation, it currently takes a *hugely* long time, and I'm thinking I must be able to optimize it somehow.

Any suggestions appreciated,

-James
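
[Editorial sketch, not part of the original message.] Two observations on the frequency sort above. First, inside the predicate the `.` on the right-hand side refers to the persName being tested, not to the distinct value being sorted, so every item gets the same sort key; `current()` in place of the second `.` should select the value being sorted. Second, each count rescans all of $persNames once per distinct value. One possible way around both, assuming an XSLT 2.0 processor (which distinct-values() already implies), is to group once with xsl:for-each-group and sort on the group size. The TEI namespace URI below is an assumption, since the original does not show its declarations, and the output elements are left in no namespace for the same reason.

======
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:tei="http://www.tei-c.org/ns/1.0"
    exclude-result-prefixes="tei">
  <!-- Assumption: tei is bound to the TEI P5 namespace; adjust to match the real stylesheet. -->

  <xsl:variable name="docs" select="collection('../../working/xml/files.xml')"/>

  <xsl:template name="main">
    <xsl:variable name="persNames" select="$docs//tei:text//tei:persName"/>
    <div>
      <head>Frequency List</head>
      <list type="unordered">
        <!-- Group every persName once by its normalized, lower-cased text. -->
        <xsl:for-each-group select="$persNames"
                            group-by="normalize-space(lower-case(.))">
          <!-- Primary key: group size, i.e. the frequency (ascending by default,
               as in the original; add order="descending" for most frequent first). -->
          <xsl:sort select="count(current-group())"/>
          <!-- Secondary key: the name itself, so that ties come out in a stable order. -->
          <xsl:sort select="current-grouping-key()"/>
          <item>
            <xsl:value-of select="concat(count(current-group()), ' -- ', current-grouping-key())"/>
          </item>
        </xsl:for-each-group>
      </list>
    </div>
  </xsl:template>
</xsl:stylesheet>
======

The same grouping, sorted on current-grouping-key() alone, would also serve for the alphabetical list, so neither list needs a per-name count over $persNames. Whether this is actually faster depends on the processor and the data, so it is worth timing rather than assuming.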