RE: [xsl] Re: text() word lists

From: "Michael Kay" <mhk@xxxxxxxxx>
Date: Mon, 9 Feb 2004 11:27:37 -0000
> 
> Not that I understand it,
> but ( and ) seem to be included Michael?
>  <word>)   -   71</word>
>  <word>(this   -   11</word>
>   
> 
> Should it be fixed by updating the
>      for $w in tokenize(string(.), '[\s.?!,]+')[.] return 
> line?
> 
> for $w in tokenize(string(.), '[\s.?!, )(]+')[.] return 
> seems to work.

I only spent five minutes on this; producing a decent natural language
tokenizer takes a little longer than that! Obviously it's easy to
write a more intelligent regex. I was only trying to illustrate the
principles.
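
As a rough sketch of what a slightly more general separator might look
like, the regex can split on any run of whitespace or Unicode punctuation
(\p{P}) instead of listing characters one by one. The stylesheet wrapper,
the lower-case() call, the grouping, and the <words> element are
assumptions added for illustration and are not taken from the stylesheet
under discussion; only the tokenize(...)[.] idiom and the <word> output
shape come from the thread.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <!-- <words> is a hypothetical wrapper element -->
    <words>
      <!-- Split on runs of whitespace or punctuation; the [.] predicate
           drops the empty tokens that appear at the start or end.
           lower-case() is an assumed normalisation step. -->
      <xsl:for-each-group
          select="for $w in tokenize(string(.), '[\s\p{P}]+')[.]
                  return lower-case($w)"
          group-by=".">
        <!-- Most frequent words first -->
        <xsl:sort select="count(current-group())" order="descending"/>
        <word>
          <xsl:value-of
              select="current-grouping-key(), '-', count(current-group())"/>
        </word>
      </xsl:for-each-group>
    </words>
  </xsl:template>
</xsl:stylesheet>
```

Note that \p{P} also covers apostrophes and hyphens, so "don't" would come
out as "don" and "t". Whether that is acceptable depends on the text, which
is part of why a real tokenizer takes more than a one-line regex.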

Michael Kay


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

