Subject: RE: [xsl] Re: text() word lists
From: "Michael Kay" <mhk@xxxxxxxxx>
Date: Mon, 9 Feb 2004 11:27:37 -0000
> Not that I understand it,
> but ( and ) seem to be included Michael?
> <word>) - 71</word>
> <word>(this - 11</word>
>
> Is it modify by updating
> for $w in tokenize(string(.), '[\s.?!,]+')[.] return
> line?
>
> for $w in tokenize(string(.), '[\s.?!, )(]+')[.] return
> seems to work.

I only spent five minutes on this: producing a decent natural language tokenizer takes a little bit longer than that! Obviously it's easy to write a more intelligent regex; I was only trying to illustrate the principles.

Michael Kay

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
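
For anyone following along, here is a rough sketch of how the amended separator pattern might sit in a complete XSLT 2.0 stylesheet. This is not the stylesheet from earlier in the thread: the `words` wrapper element, the `lower-case()` call, the grouping with `xsl:for-each-group`, and the sort by frequency are all assumptions made for illustration. Only the `tokenize(...)[.]` expression and the `word - count` output shape come from the messages above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <!-- "words" is a hypothetical wrapper element, not from the thread -->
    <words>
      <!-- Split the document's text on runs of whitespace, common
           punctuation, and parentheses. The trailing [.] predicate
           discards zero-length tokens, which tokenize() yields when
           the text starts or ends with a separator. -->
      <xsl:for-each-group
          select="for $w in tokenize(string(.), '[\s.?!,)(]+')[.]
                  return lower-case($w)"
          group-by=".">
        <!-- Assumption: list the most frequent words first -->
        <xsl:sort select="count(current-group())" order="descending"/>
        <word>
          <xsl:value-of select="current-grouping-key()"/>
          <xsl:text> - </xsl:text>
          <xsl:value-of select="count(current-group())"/>
        </word>
      </xsl:for-each-group>
    </words>
  </xsl:template>

</xsl:stylesheet>
```

Any other character that turns up glued to a word (quotes, colons, semicolons) can be added to the character class in the same way, though as noted above a character class is still a long way from a real natural-language tokenizer.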