[xsl] Re: Similarity metric in XSLT 2?

On 12-03-30 01:24 PM, Imsieke, Gerrit, le-tex wrote:

I can only affirm that I'd be interested in such a library, too.

The last time I needed string similarity metrics (four years ago), I
used Perl with XML::LibXML and String::Similarity.

If there were such a module / extension function for XPath / XSLT, I'd
probably use it more often. If you find a Java library that is easy to
interface with from Java-based XSLT processors, please let me know. I
think that Levenshtein or more advanced algorithms would be too slow when
implemented in XSLT, but they may be readily available as extension functions.

I once implemented the Universal Similarity Metric (Normalized Compression Distance) in Pascal and Java:

<http://dh2010.cch.kcl.ac.uk/academic-programme/abstracts/papers/html/ab-693.html>

and found that it was surprisingly effective for short strings, as well as being very fast. I might look at figuring out how to call the Java library from Saxon. Implementing the metric was trivial.
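
For anyone who wants to experiment, here is a minimal sketch of Normalized Compression Distance in Java. It is not the code described in the abstract above; the class name Ncd, the choice of java.util.zip.Deflater as the compressor, and the closest() helper are all assumptions made for illustration. The formula used is NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C(s) is the compressed size of s.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.zip.Deflater;

/** Hypothetical helper class: Normalized Compression Distance via DEFLATE. */
final class Ncd {
    private Ncd() {}

    /** Size in bytes of the DEFLATE-compressed UTF-8 encoding of s. */
    static int compressedSize(String s) {
        byte[] input = s.getBytes(StandardCharsets.UTF_8);
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        try {
            deflater.setInput(input);
            deflater.finish();
            byte[] buffer = new byte[1024];
            int total = 0;
            // Only the byte count matters, so the buffer is reused and overwritten.
            while (!deflater.finished()) {
                total += deflater.deflate(buffer);
            }
            return total;
        } finally {
            deflater.end(); // release native zlib resources
        }
    }

    /** NCD(x, y): near 0 for very similar strings, near 1 for unrelated ones. */
    static double distance(String x, String y) {
        int cx = compressedSize(x);
        int cy = compressedSize(y);
        int cxy = compressedSize(x + y);
        return (double) (cxy - Math.min(cx, cy)) / Math.max(cx, cy);
    }

    /** Returns the candidate with the smallest distance to input, or null if there are none. */
    static String closest(String input, List<String> candidates) {
        String best = null;
        double bestDistance = Double.POSITIVE_INFINITY;
        for (String candidate : candidates) {
            double d = distance(input, candidate);
            if (d < bestDistance) {
                bestDistance = d;
                best = candidate;
            }
        }
        return best;
    }
}
```

Calling Ncd.closest(input, candidates) picks the best match from a list of candidate strings, which is the use case in the original question. Exposing a static method like this to Saxon as an extension function depends on the Saxon edition and version, so that part is left out here.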

Cheers,
Martin

Gerrit

On 2012-03-30 20:18, Martin Holmes wrote:

Hi all,

I'm faced with a situation in which I have to match an input string
against a set of possible candidates, and I need to find the match which
is most similar to it (I'm trying to identify correspondences between
two sets of files which have similar, but not identical, content).

Has anyone done anything like measuring string similarity in XSLT 2.0?
If so, how did you approach it?

All help appreciated,
Martin
