Subject: Re: [xsl] sorting a list of titles after removal of stopwords and special characters
From: Jeni Tennison <jeni@xxxxxxxxxxxxxxxx>
Date: Tue, 11 Dec 2001 17:09:11 +0000

Trevor Nash wrote:
> What you need is an expression that, given the context of a title
> element, will return a string containing the edited title (stop words
> removed). This cannot be done with standard XSLT, but you have three
> possibilities:
Actually, it's not *impossible* with standard XSLT, although
admittedly it isn't pretty. Assuming that $punctuation is a string
holding the ignorable punctuation characters and that the list of
stopwords is sorted so that 'an' comes before 'a' rather than
after it, you could use:

  concat(
    substring(
      substring(translate(title, $punctuation, ''),
                string-length(
                  $stoplist[starts-with(
                              translate(current()/title,
                                        concat($lowercase, $punctuation),
                                        $uppercase),
                              translate(., $lowercase, $uppercase))]) + 2),
      1 div boolean($stoplist[starts-with(
                                translate(current()/title,
                                          concat($lowercase, $punctuation),
                                          $uppercase),
                                translate(., $lowercase, $uppercase))])),
    substring(
      translate(title, $punctuation, ''),
      1 div not($stoplist[starts-with(
                            translate(current()/title,
                                      concat($lowercase, $punctuation),
                                      $uppercase),
                            translate(., $lowercase, $uppercase))])))

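The part of that expression doing the work of a conditional is the second argument to substring(). 1 div true() is 1, so substring() returns the whole string; 1 div false() is positive infinity, so substring() returns the empty string. Concatenating the two branches therefore yields exactly one of them. A minimal sketch of the idiom on its own, where $test and the two string literals are placeholder names and not part of the expression above:

```xslt
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Placeholder condition; change to false() to select the other branch -->
    <xsl:variable name="test" select="true()"/>

    <!-- When $test is true:  substring(..., 1)        keeps 'then-branch'
                              substring(..., Infinity) drops 'else-branch' -->
    <xsl:value-of select="concat(substring('then-branch', 1 div $test),
                                 substring('else-branch', 1 div not($test)))"/>
  </xsl:template>
</xsl:stylesheet>
```

When the condition is a node-set rather than a boolean, as with $stoplist[...] above, it has to be wrapped in boolean() or not() first; otherwise div would convert the node's string value to a number instead of testing whether the node-set is empty.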
If we were using XPath 2.0, assuming an if statement similar to
that in XQuery, it would look something like:

  if ($stoplist[starts-with(
                  translate(current()/title,
                            concat($lowercase, $punctuation),
                            $uppercase),
                  translate(., $lowercase, $uppercase))])
  then substring(translate(title, $punctuation, ''),
                 string-length(
                   $stoplist[starts-with(
                               translate(current()/title,
                                         concat($lowercase, $punctuation),
                                         $uppercase),
                               translate(., $lowercase, $uppercase))]) + 2)
  else translate(title, $punctuation, '')

which isn't that much more pleasant.
If the stop words were stored with a space, as:
<ignore>the </ignore>
<ignore>an </ignore>
<ignore>a </ignore>
(which would probably be a good idea anyway, given that quite a few
titles might begin with the letter 'A') then you could simply use:

  substring(translate(title, $punctuation, ''),
            string-length(
              $stoplist[starts-with(
                          translate(current()/title,
                                    concat($lowercase, $punctuation),
                                    $uppercase),
                          translate(., $lowercase, $uppercase))]) + 1)

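To show where that expression would sit, here is a rough sketch of a complete XSLT 1.0 stylesheet that uses it as a sort key. Everything around the expression is an assumption made for illustration: the books/book/title input structure, the stop: namespace, the contents of $punctuation, and loading the stop list from the stylesheet itself with document('').

```xslt
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:stop="urn:example:stopwords"
    exclude-result-prefixes="stop">

  <xsl:output method="text"/>

  <!-- Hypothetical stop list; each word is stored with its trailing space -->
  <stop:ignore>the </stop:ignore>
  <stop:ignore>an </stop:ignore>
  <stop:ignore>a </stop:ignore>

  <!-- document('') reads this stylesheet as a source document -->
  <xsl:variable name="stoplist" select="document('')/*/stop:ignore"/>
  <xsl:variable name="lowercase" select="'abcdefghijklmnopqrstuvwxyz'"/>
  <xsl:variable name="uppercase" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>
  <!-- Element content lets both quote characters appear in the string -->
  <xsl:variable name="punctuation">.,;:!?'"()</xsl:variable>

  <!-- Assumed input: <books><book><title>...</title></book>...</books> -->
  <xsl:template match="/books">
    <xsl:for-each select="book">
      <!-- Inside xsl:sort, current() is the book being sorted -->
      <xsl:sort select="substring(translate(title, $punctuation, ''),
                          string-length(
                            $stoplist[starts-with(
                              translate(current()/title,
                                        concat($lowercase, $punctuation),
                                        $uppercase),
                              translate(., $lowercase, $uppercase))]) + 1)"/>
      <xsl:value-of select="title"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

If no stop word matches, the predicate selects an empty node-set, string-length() returns 0, and substring() starts at position 1, so the whole cleaned title is used as the key. Note that document('') only works when the processor can resolve the stylesheet's own URI.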
> 1) You are using Saxon, which has an extension saxon:function
> which lets you write a function in XSLT - more or less the
> contents of your mode="with-stoplist" template.
Just to mention, you can also use func:function from the EXSLT
namespace http://exslt.org/functions in Saxon, 4XSLT, jd.xslt and
libxslt to achieve this. It's more portable to use func:function than
to use saxon:function (because it's available in those other
processors), but they do basically the same thing. See
http://www.exslt.org/func for details.
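As a sketch of what that might look like, the following wraps the stop-word logic in a func:function, which lets the intermediate values live in variables instead of being repeated inside one expression. The function name my:sort-key, the my: namespace, the books/book/title input structure and the variable contents are all made up for the example, and it assumes the stop words are stored with a trailing space. It will only run in a processor that supports EXSLT functions.

```xslt
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:func="http://exslt.org/functions"
    xmlns:my="urn:example:my"
    extension-element-prefixes="func"
    exclude-result-prefixes="my">

  <xsl:output method="text"/>

  <!-- Hypothetical stop list; each word is stored with its trailing space -->
  <my:ignore>the </my:ignore>
  <my:ignore>an </my:ignore>
  <my:ignore>a </my:ignore>

  <xsl:variable name="stoplist" select="document('')/*/my:ignore"/>
  <xsl:variable name="lowercase" select="'abcdefghijklmnopqrstuvwxyz'"/>
  <xsl:variable name="uppercase" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>
  <xsl:variable name="punctuation">.,;:!?'"()</xsl:variable>

  <!-- Returns the title with punctuation and any leading stop word removed -->
  <func:function name="my:sort-key">
    <xsl:param name="title"/>
    <xsl:variable name="clean"
                  select="translate($title, $punctuation, '')"/>
    <xsl:variable name="upper"
                  select="translate($clean, $lowercase, $uppercase)"/>
    <!-- The stop word (if any) that the cleaned title starts with -->
    <xsl:variable name="stop"
                  select="$stoplist[starts-with($upper,
                            translate(., $lowercase, $uppercase))]"/>
    <func:result select="substring($clean, string-length($stop) + 1)"/>
  </func:function>

  <!-- Assumed input: <books><book><title>...</title></book>...</books> -->
  <xsl:template match="/books">
    <xsl:for-each select="book">
      <xsl:sort select="my:sort-key(title)"/>
      <xsl:value-of select="title"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Because the title is passed in as a parameter, the predicate no longer needs current() to get back to the title from inside the stop list.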
Cheers,
Jeni
---
Jeni Tennison
http://www.jenitennison.com/
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list