Re: [xsl] Does 'Lecœur' occur in $text? Do you have a multi-factor XPath solution?

Subject: Re: [xsl] Does 'Lecœur' occur in $text? Do you have a multi-factor XPath solution?
From: Michael Kay <mike@xxxxxxxxxxxx>
Date: Fri, 18 Jan 2013 22:59:57 +0000
If you want to write queries that handle all the nuances of natural language text, I would strongly recommend using a text retrieval language rather than XPath. Many XQuery implementations have free text retrieval modules.
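
As a rough illustration of what such a module can look like, here is a minimal sketch that assumes a processor implementing the W3C XQuery and XPath Full Text 1.0 syntax. The choice of match options is an assumption, and whether 'œ' and 'oe' compare equal depends on the implementation's tokenizer and collation, so treat this as a starting point, not a tested query.

```xquery
(: Sketch only: requires a processor with XQuery and XPath Full Text 1.0 support. :)
(: $text is assumed to be bound to the string or node being searched.            :)
$text contains text 'Lecœur'
  using case insensitive
  using diacritics insensitive
```

Line-break hyphenation and fuzzy matching are not covered by these standard options; where they are available at all, they are vendor extensions.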

Michael Kay
Saxonica

On 18/01/2013 22:12, Costello, Roger L. wrote:
Hi Folks,

I want to determine if 'Lecœur' occurs in $text.

A naïve solution is this XPath expression:

contains($text, 'Lecœur')

However, that does not take into account many important factors:

1. Perhaps 'Lecœur' occurs, but in $text it is in uppercase

2. Perhaps instead of the 'œ' ligature, $text uses 'oe'

3. Perhaps in $text 'Lecœur' is split over two lines and thus is hyphenated

4. Perhaps 'Lecœur' is slightly misspelled in $text and therefore requires fuzzy matching

And there are many other important factors.

Do you have an XPath solution to this problem that takes into account the many important factors?

/Roger
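
For reference, factors 1 to 3 above can be approximated in plain XPath 2.0 by normalizing $text before the comparison. The following is only a sketch under stated assumptions: it needs an XPath 2.0 processor (lower-case() and replace() are not in XPath 1.0), it assumes a hyphen directly before a line break always marks a split word, and it does nothing for factor 4, since XPath 2.0 has no built-in edit-distance function.

```xquery
(: Sketch only: XPath 2.0 expression, also valid as XQuery. :)
contains(
  replace(
    (: factors 1 and 2: fold case, then expand the ligature to 'oe' :)
    replace(lower-case($text), 'œ', 'oe'),
    (: factor 3: rejoin a word hyphenated across a line break :)
    '-\s*\n\s*', ''),
  (: the search term is normalized the same way :)
  'lecoeur')
```

Note that the hyphenation rule would also join genuinely hyphenated compounds that happen to break at the end of a line, so it can produce false matches.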
