Re: [xsl] Re: text() word lists

Subject: Re: [xsl] Re: text() word lists
From: David Carlisle <davidc@xxxxxxxxx>
Date: Mon, 9 Feb 2004 10:55:13 GMT
  So what is the best way to parameterise these to allow
  turning on/off the removal of numbers?  And while
  we're at it, turning on/off the removal of hyphens or
  other possibly-word-forming characters?

The second argument to tokenize(), which is what specifies the
"inter-word space/punctuation", can include or omit the digits,
hyphens etc. It is a general string-valued XPath expression, so in
particular you can build the regexp on the fly using concat() or
string-join(), passing in some parameters as needed.

tokenize(.,concat('(',$space,'|[',$punct,$nums,$other,'])+'))

then you can set
<xsl:param name="space" select="'\s'"/>
<xsl:param name="punct" select="'!.,;:\?'"/>
<xsl:param name="nums" select="''"/> <!-- or '0-9' -->
<xsl:param name="other" select="''"/> <!-- or 'whatever you want ' -->
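
Put together, a minimal sketch of a complete stylesheet might look like
the following. It assumes an XSLT 2.0 processor; the variable name "sep",
the text output and the one-word-per-line layout are just illustrative
choices of mine, not anything from the original question.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Separator classes; override any of these when invoking the transform. -->
  <xsl:param name="space" select="'\s'"/>
  <xsl:param name="punct" select="'!.,;:\?'"/>
  <xsl:param name="nums"  select="''"/>  <!-- or '0-9' to treat digits as separators -->
  <xsl:param name="other" select="''"/>  <!-- or e.g. '\-' to split on hyphens -->

  <!-- "sep" is a hypothetical name: the separator regexp, built once.
       With the defaults above it is (\s|[!.,;:\?])+ -->
  <xsl:variable name="sep"
      select="concat('(', $space, '|[', $punct, $nums, $other, '])+')"/>

  <xsl:template match="/">
    <!-- Tokenize the string value of the whole document. The [. ne '']
         filter drops the empty tokens that appear when the text starts
         or ends with a separator. -->
    <xsl:for-each select="tokenize(string(.), $sep)[. ne '']">
      <xsl:value-of select="."/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Two things to watch: if $punct, $nums and $other are all empty the
character class becomes "[]", which is not a valid regexp, so at least
one of them has to be non-empty. And since the parameters are pasted
into a character class, a hyphen passed in $other should be escaped
(as '\-') so that it is not read as a range. Note also that with
nums='0-9' digits act as separators, so "abc123def" comes out as the
two words "abc" and "def".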



-- 
http://www.dcarlisle.demon.co.uk/matthew

