Subject: Re: [xsl] XSLT Solution for hyphenation
From: Jeff Sese <jsese@xxxxxxxxxxxx>
Date: Thu, 28 Dec 2006 09:56:15 +0800
<root> <p>I have some text that has the words abaissassent and abandonnent.</p> </root>
<root>
  <wordlist>
    <entry>
      <search>abaissassent</search>
      <replace>abais­sassent</replace>
    </entry>
    <entry>
      <search>abaisshrent</search>
      <replace>abais­shrent</replace>
    </entry>
    <entry>
      <search>abandonnent</search>
      <replace>aban­donnent</replace>
    </entry>
  </wordlist>
</root>
You seem to be doing exact matching on the words in your dictionary, not regular expression matching as your use of matches() would suggest. With exact matching you can use a key for the lookup which will be dramatically faster.
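A minimal sketch of the key-based lookup described above, not the poster's actual stylesheet. It assumes the word list is stored in a separate file called wordlist.xml (a hypothetical name), that a "word" is any run of Unicode letters, and that matching is case-sensitive. The key name "hyphenate" and the variable $wordlist are also illustrative.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: the look-up table lives in wordlist.xml next to the stylesheet -->
  <xsl:variable name="wordlist" select="document('wordlist.xml')"/>

  <!-- Index every entry by the exact string value of its search child -->
  <xsl:key name="hyphenate" match="entry" use="search"/>

  <!-- Identity template: copy everything that is not a text node -->
  <xsl:template match="@*|element()|comment()|processing-instruction()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Split each text node into letter runs and everything else.
       The regex attribute is an attribute value template, so the
       braces of \p{L} have to be doubled. -->
  <xsl:template match="text()">
    <xsl:analyze-string select="." regex="\p{{L}}+">
      <xsl:matching-substring>
        <!-- One keyed lookup per word instead of a scan of the whole list -->
        <xsl:variable name="hit" select="key('hyphenate', ., $wordlist)"/>
        <xsl:value-of select="if ($hit) then $hit[1]/replace else ."/>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <!-- Whitespace and punctuation pass through unchanged -->
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>

</xsl:stylesheet>
```

With this shape each word in the input costs one index lookup, so the work grows with the size of the document rather than with the size of the document multiplied by the size of the word list. Words containing apostrophes or hyphens would be split by the \p{L}+ pattern, so the tokenizing regex would need adjusting if the list contains such entries.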
Michael Kay
http://www.saxonica.com/
-----Original Message-----
From: Jeff Sese [mailto:jsese@xxxxxxxxxxxx]
Sent: 22 December 2006 06:10
To: Xsl-List
Subject: [xsl] XSLT Solution for hyphenation
Hi list,
I have this project that applies hyphenation to an XML document using a list of words as a reference. The list of words can reach up to a million entries.
My XSLT solution uses a template that matches text() nodes and inserts hyphens into the words that appear in the list. However, the transformation takes too long to finish even for a relatively small file (around 1 MB). Is there any way to speed this up, or is there a better solution?
Here's my stylesheet:
<xsl:template match="/">
<xsl:apply-templates/>
</xsl:template>
<xsl:template match="@*|element()|comment()|processing-instruction()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="text()">
<xsl:variable name="str" select="."/>
<xsl:variable name="searchStrs" as="xs:string*"
select="$search-words[matches($str,.)]/replace(.,'[.\\?*+{}()\[\]\^\$|]','\\$0')"/>
<xsl:value-of select="ati:replace-all($str,$searchStrs,$replaceStr)"/>
</xsl:template>
<xsl:function name="ati:replace-all">
<xsl:param name="input" as="xs:string"/>
<xsl:param name="words-to-replace" as="xs:string*"/>
<xsl:sequence select="if (exists($words-to-replace)) then ati:replace-all(replace($input, $words-to-replace[1],
key('replace',$words-to-replace[1],$search-words)),remove($words-to-replace,1))
else $input"/>
</xsl:function>
Here's a sample of the look-up table:
<root>
  <wordlist>
    <entry>
      <search>abaissassent</search>
      <replace>abais­sassent</replace>
    </entry>
    <entry>
      <search>abaisshrent</search>
      <replace>abais­shrent</replace>
    </entry>
    <entry>
      <search>abandonnent</search>
      <replace>aban­donnent</replace>
    </entry>
  </wordlist>
</root>
So if I have "abaissassent" in a text() node, it will be replaced with "abais­sassent".
-- *Jeff*