Hi all,
As part of a small pilot project, I'm implementing a set of spelling
normalization rules applied through XSLT 2.0 using Saxon 9. One
operation that happens extremely frequently is a dictionary lookup;
basically I'm checking a word form to see if it appears in a
spell-checker dictionary.
The dictionary currently consists of a whitespace-separated text string
(although it could be formatted any way I choose), and I've been using
fn:matches() and fn:contains() to check whether or not the form appears
in the dictionary:
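<!-- Option 1: substring test against the space-delimited dictionary -->
<xsl:function name="f:wordExists" as="xs:boolean">
  <xsl:param name="inString" as="xs:string"/>
  <!-- xsl:sequence returns the boolean itself rather than a text node -->
  <xsl:sequence select="contains($dictModern, concat(' ',
      lower-case($inString), ' '))"/>
</xsl:function>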
<!-- Option 2: case-insensitive regex test ($inString is not regex-escaped) -->
<xsl:function name="f:wordExists" as="xs:boolean">
  <xsl:param name="inString" as="xs:string"/>
  <xsl:sequence select="matches($dictModern, concat('\s', $inString,
      '\s'), 'i')"/>
</xsl:function>
Both options appear to be very costly in terms of time, and I'm
wondering what the most efficient way to do this might be. Is there a
faster way to do text lookups like this?
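One variant I could imagine, but have not measured, is to split the
dictionary into a sequence of words once and compare against that
sequence. This is only a sketch: the names $dictWords and
f:wordExistsTokenized, the "f" namespace URI, the "main" template and
the three-word dictionary are placeholders I made up for illustration,
and it assumes the dictionary entries are already lower-case.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="http://example.org/functions"
    exclude-result-prefixes="xs f">

  <!-- Placeholder dictionary; the real one is far larger -->
  <xsl:variable name="dictModern" as="xs:string"
      select="' apple pear plum '"/>

  <!-- Hypothetical: split the string once into a sequence of words -->
  <xsl:variable name="dictWords" as="xs:string*"
      select="tokenize(normalize-space($dictModern), ' ')"/>

  <xsl:function name="f:wordExistsTokenized" as="xs:boolean">
    <xsl:param name="inString" as="xs:string"/>
    <!-- General comparison: true if any word in $dictWords equals the form -->
    <xsl:sequence select="lower-case($inString) = $dictWords"/>
  </xsl:function>

  <!-- Entry point for a quick check, e.g. as the initial template -->
  <xsl:template name="main">
    <xsl:value-of select="f:wordExistsTokenized('Pear')"/>
  </xsl:template>
</xsl:stylesheet>
```

Whether this is actually any faster than contains() presumably depends
on how the processor evaluates the "=" comparison against a long
sequence, so I don't know if it helps.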
Ultimately I guess I'll implement this as an external Java process, but
for the moment I'm working with XSLT, and I'd like to get some speed
improvement if I can.
All help appreciated,
Martin