Subject: Re: [xsl] sorting titles w stopwords but w/o value in every title node
From: "cking" <cking@xxxxxxxxxx>
Date: Thu, 2 Sep 2004 10:44:24 +0200
Susan,

> I'm sorry for the delay in responding. A large tree fell on my house
> about 1 AM Tuesday morning and I have been away from work finding
> a tree service and contractors, etc. It's quite a challenge.

wow, I can believe that... and I thought this stylesheet was quite a challenge!

I have been thinking about the sorting problem:

1. if a record doesn't have a title, we can look it up (by its doc-number),
   let's call it "found-title"
2. the sort procedure should use the "found-title" rather than the actual title.
   no: actually it should use the "found-title-without-stopwords".
3. the output shows the actual title (empty, if it's empty)

Problem: can't use variables or if-constructs because xsl:sort must be
first child of xsl:for-each. The solution so far uses
"actual-title-without-stopwords" (can be empty) by means of the
"Becker method" [1]

    <xsl:sort select="concat(substring(substring-after(.,' '), 0 div boolean
        ($stop-words[starts-with(translate(current(), $uppercase, $lowercase),
        concat(translate(., $uppercase, $lowercase), ' '))])),
        substring(., 0 div not
        ($stop-words[starts-with(translate(current(), $uppercase, $lowercase),
        concat(translate(., $uppercase, $lowercase), ' '))])))"/>

I tried to put a "found-title" inside the xsl:sort select, but I couldn't make it work.

> The processor is Saxon but it's being called from within another application.
> I do not believe I can do a two-step process.

But Saxon does support exsl:node-set [2] so it should be possible to generate
a temporary tree (pun not intended!!) and transform that in a second pass,
within one stylesheet.
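For reference, the "Becker method" in the xsl:sort above selects one of two strings without xsl:if: `0 div 1` is 0, and substring(x, 0) returns all of x, while `0 div 0` is NaN, and substring(x, NaN) returns the empty string. Here is a minimal sketch of the trick on its own. The input shape (`<titles><title>The Example</title><title>Other</title></titles>`) and the single hard-coded stop word 'The ' are assumptions made only for illustration; they are not taken from the real data or stylesheet.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Hypothetical input:
       <titles><title>The Example</title><title>Other</title></titles> -->
  <xsl:template match="/">
    <xsl:for-each select="//title">
      <!-- Condition: starts-with(., 'The ')
           true:  first substring keeps the text after the first space,
                  second substring gets start NaN and yields ''
           false: first substring gets start NaN and yields '',
                  second substring keeps the whole title -->
      <xsl:value-of select="concat(
          substring(substring-after(., ' '), 0 div starts-with(., 'The ')),
          substring(., 0 div not(starts-with(., 'The '))))"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Because the whole choice lives inside a single XPath expression, it can be used in an xsl:sort select, where xsl:choose and xsl:variable are not allowed. The price is that the condition has to be written twice, which is what makes the real expression above hard to maintain.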
You could create a global variable with a structure like

    <sort-titles>
      <title doc-number="53690">american artist</title>
      <title doc-number="57769">american city &amp; country</title>
      <title doc-number="58345">american demographics</title>
      <title doc-number="58615">forbes.</title>
    </sort-titles>

and then use

    <xsl:sort select="exsl:node-set($sort-titles)/*[@doc-number=$doc-number]"/>

Using exsl:node-set also means that you don't need the "Becker hack" anymore,
improving maintainability. Here's a stylesheet that sorts titles correctly:

    <xsl:stylesheet version="1.0"
        xmlns="http://www.w3.org/1999/xhtml"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:exsl="http://exslt.org/common"
        xmlns:sw="http://my.stopwords/sw"
        extension-element-prefixes="exsl sw"
    >

    <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"
        doctype-public="-//W3C//DTD XHTML 1.0 Strict//EN"
        doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"
    />

    <!-- stop words, read back from the stylesheet itself via document('') -->
    <sw:stop>
      <sw:word>the</sw:word>
      <sw:word>a</sw:word>
      <sw:word>an</sw:word>
    </sw:stop>

    <xsl:variable name="stop-words" select="document('')/xsl:stylesheet/sw:stop/sw:word"/>
    <xsl:variable name="uppercase" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>
    <xsl:variable name="lowercase" select="'abcdefghijklmnopqrstuvwxyz'"/>

    <!-- one sort title per record that has a non-empty title:
         lowercased, with a leading stop word removed -->
    <xsl:variable name="sort-titles">
      <xsl:for-each select="//section-02">
        <xsl:if test="string(title)">
          <title doc-number="{doc-number}">
            <xsl:variable name="lower-title" select="translate(title, $uppercase, $lowercase)"/>
            <xsl:choose>
              <xsl:when test="$stop-words[starts-with($lower-title,
                  concat(translate(., $uppercase, $lowercase), ' '))]">
                <xsl:value-of select="substring-after($lower-title,' ')"/>
              </xsl:when>
              <xsl:otherwise>
                <xsl:value-of select="$lower-title"/>
              </xsl:otherwise>
            </xsl:choose>
          </title>
        </xsl:if>
      </xsl:for-each>
    </xsl:variable>

    <xsl:template match="/">
      <html xmlns="http://www.w3.org/1999/xhtml"><head><title>sort without stop words</title></head><body>
        <table border="1">
          <tr>
            <th>doc-number</th>
            <th>title</th>
            <th>description</th>
            <th>arrival-date</th>
          </tr>
          <xsl:for-each select="//section-02/title">
            <!-- first key: the sort title looked up by doc-number,
                 so records with an empty title sort with their journal -->
            <xsl:sort select="exsl:node-set($sort-titles)/*[@doc-number = current()/../doc-number]"/>
            <!-- second key: arrival-date (mm/dd/yyyy) rearranged to yyyymmdd, newest first -->
            <xsl:sort select="number(concat(substring(../arrival-date, 7,4),
                substring(../arrival-date, 1,2), substring(../arrival-date, 4,2)))"
                order="descending"/>
            <tr>
              <td><xsl:value-of select="../doc-number"/></td>
              <td><xsl:value-of select="."/></td>
              <td><xsl:value-of select="../description"/></td>
              <td><xsl:value-of select="../arrival-date"/></td>
            </tr>
          </xsl:for-each>
        </table>
      </body></html>
    </xsl:template>

    </xsl:stylesheet>

Saxon 6.5.3 output:

    doc-number  title                        description               arrival-date
    53690       American Artist              v.68:no.738(2004:Jan.)    02/26/2004
    57769                                    v.119:no.3(2004:Mar.)     03/25/2004
    57769       The American city & country  v.119:no.1(2004:Jan.)     02/11/2004
    58345                                    v.26:no.3(2004:Apr.)      04/12/2004
    58345                                    v.26:no.2(2004:Mar.)      03/06/2004
    58345       American demographics        v.26:no.1(2004:Feb.)      02/05/2004
    58615                                    v.173:no.5(2004:Mar.15)   03/15/2004
    58615                                    v.173:no.2(2004:Feb. 02)  01/21/2004
    58615       Forbes.                      v.173:no.1(2004:Jan. 12)  01/12/2004

The records without a title are now sorted in their correct position.
One problem seems to remain: within each doc-number the title tends to show up
in the last record rather than the first, because the dates are sorted descending.
But that shouldn't be too difficult to solve.

I hope this will work in your application... and I wish you strength
and all else you can use to solve the other tree challenge too!

Best regards
Anton Triest

[1] http://www.biglist.com/lists/xsl-list/archives/200008/msg00525.html
[2] http://exslt.org/exsl/functions/node-set/index.html