Re: [xsl] XSLT Text Processing: Fun with Anagrams

Subject: Re: [xsl] XSLT Text Processing: Fun with Anagrams
From: "Dimitre Novatchev" <dnovatchev@xxxxxxxxx>
Date: Tue, 24 Apr 2007 19:35:40 -0700
Hi Rashmi,

> I got everything and
> tried the 3 XSL sheets you mentioned.

I am glad you were successful in running the Anagram transformations.


> Everything looks great in the output except for one small thing: I
> noticed that words with apostrophes were also being considered.
>
> For example:
> <aChain key="'aainprsst"><w>aspirant's</w><w>partisan's</w></aChain>


I am using an English dictionary that I took a long time ago from an open source spelling project and simply converted to XML, so I didn't have a say in the composition of the dictionary and am using it "as is".

-------------------------------------------------------------

> However I wasn't able to try one more style sheet successfully:
>
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> java -jar c:\dev\saxonb8-9-0-3j\saxon8.jar dummy.xml
> testGetAnagrams.xsl > output.xml
>
> I think testGetAnagrams.xsl is supposed to be a stand-alone stylesheet?

No, it is not stand-alone. I'm sorry I didn't specify the source XML in a comment; however, I did say this in my blog:

"Here, I am reusing the same 46379 English wordforms dictionary, I was
using for the spelling checking tasks. In fact, the transformation is
applied on it -- the document dictEnglish.xml."



So, the transformation is applied on the dictionary itself.


Be warned, as already stated in my blog, that this transformation will take quite a long time, because it first indexes the huge dictionary file on anagram keys before performing the search for anagrams.

This is why it pays off to create the specialised anagram dictionary
once, in a single transformation (which you already did), and then to
use it together with testGetDictAnagrams.xsl.
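
The trade-off can be sketched as follows, in Python rather than XSLT and purely as an illustration. The names build_anagram_dict and lookup are hypothetical, the word list is made up, and keeping only chains with at least two words is an assumption about what the anagram dictionary contains.

```python
from collections import defaultdict


def anagram_key(word: str) -> str:
    """Characters of the word sorted by code point (assumed key format)."""
    return "".join(sorted(word))


def build_anagram_dict(words):
    """Index all words by anagram key in a single pass over the dictionary.

    This is the expensive step; doing it once and saving the result
    avoids re-indexing the whole dictionary for every query.
    """
    chains = defaultdict(list)
    for word in words:
        chains[anagram_key(word)].append(word)
    # Assumption: only keys shared by two or more words are kept.
    return {key: chain for key, chain in chains.items() if len(chain) > 1}


def lookup(anagram_dict, word):
    """Cheap query against the prebuilt index."""
    return anagram_dict.get(anagram_key(word), [])


words = ["listen", "silent", "enlist", "google", "aspirant's", "partisan's"]
anagram_dict = build_anagram_dict(words)   # done once
print(lookup(anagram_dict, "tinsel"))      # ['listen', 'silent', 'enlist']
print(lookup(anagram_dict, "google"))      # []
```

Each lookup after the one-time build is a single key computation plus a dictionary access, which is the same reason the prebuilt anagram dictionary makes the second stylesheet fast.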



--
Cheers,
Dimitre Novatchev
---------------------------------------
Truly great madness cannot be achieved without significant intelligence.
---------------------------------------
To invent, you need a good imagination and a pile of junk
-------------------------------------
You've achieved success in your field when you don't know whether what
you're doing is work or play



On 4/24/07, Rashmi Rubdi <rashmi.sub@xxxxxxxxx> wrote:
Hi Dimitre and everyone,

Thank you for your help and for additional instructions.

It took me a while to figure out the CVS repository, since I'm using it
for the first time on SourceForge, but anyway, I got everything and
tried the 3 XSL sheets you mentioned.

I could see the output for:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
java -jar c:\dev\saxonb8-9-0-3j\saxon8.jar dictEnglish.xml
testGenerateAnagramDict.xsl > output.xml

Everything looks great in the output except for one small thing: I
noticed that words with apostrophes were also being considered.

For example:
<aChain key="'aainprsst"><w>aspirant's</w><w>partisan's</w></aChain>

I think the above should be (if I'm not wrong with my understanding of
English words):
<aChain key="'aainprsst"><w>aspirants</w><w>partisans</w></aChain>
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Then I tried :

--------------------------------------------------------------
java -jar c:\dev\saxonb8-9-0-3j\saxon8.jar dictAnagrams.xml
testGetDictAnagrams.xsl > output2.xml

which worked out great.

-------------------------------------------------------------

However I wasn't able to try one more style sheet successfully:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
java -jar c:\dev\saxonb8-9-0-3j\saxon8.jar dummy.xml
testGetAnagrams.xsl > output.xml

I think testGetAnagrams.xsl is supposed to be a stand-alone stylesheet?

I don't think there's an option to call an XSL stylesheet without an
input XML, so I created a dummy xml file with just one node, but got a
blank output.

Please let me know if testGetAnagrams.xsl expects an input XML, if so
what it should be.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Overall, this is an excellent demonstration of XSLT's capabilities.

-Thank you
Rashmi

On 4/24/07, Dimitre Novatchev <dnovatchev@xxxxxxxxx> wrote:
> Hi Rashmi,
>
>
> You should get the FXSL distribution from CVS, not from the zip files,
> which are almost one year old.
>
> The files you'd be using for the anagrams solution are:
>
>     testGetAnagrams.xsl  Gets anagrams with a specific key, without an
> anagram dict.
>
>     testGenerateAnagramDict.xsl Produces an Anagram dict from the
> regular English dict
>
>     testGetDictAnagrams.xsl Gets anagrams with a specific key, using
> an anagram dict
>
>
> Please do let me know if you still have any problems accessing the sources.
>
>
>
> --
> Cheers,
> Dimitre Novatchev
