[xsl] Muenchian method on nodes with two or more items for indexing

Subject: [xsl] Muenchian method on nodes with two or more items for indexing
From: "Larry Hayashi" <larry_hayashi@xxxxxxxxxxx>
Date: Thu, 19 Sep 2002 22:26:07 -0500
Please forgive the cross-posting, but there has been no activity on the other list for more than 16 hours and I suspect it might not be working. I have worked on this problem off and on all day with no success.

I just tried an axis-based method on this problem and it took more than 15 minutes to crunch through on a 2 GHz Pentium with lots of RAM. I need to get the Muenchian method working with this! I have tried a number of different things after looking at J. Tennison's page, but with no success. Here is the problem:

I have data of the following sort. You will note that minor or major elements and their senses can have one or more index elements.

<LexicalDatabase>
<minor>
<base>'wah 'nabuuysk</base>
<sense num=" 1">
<index enc="ENG">unexpected</index>
</sense>
</minor>
<minor>
<base>'wah wil&#226;ontk</base>
</minor>
<major>
<base>'w&#224;hamaniits'&#224;</base>
<sense num=" 1">
<pos>v</pos>
<def enc="ENG">careless</def>
<index enc="ENG">careless</index>
</sense>
</major>
<major>
<base>xbimooksk</base>
<sense num=" 1">
<pos>n</pos>
<def enc="ENG">half-white </def>
<index enc="ENG">metis</index>
<index enc="ENG">half-white</index>
</sense>
</major>
<major>
<base>xbismsg&#232;&#232;</base>
<sense num=" 1">
<pos>v</pos>
<index enc="ENG">bow your head</index>
<index enc="ENG">bend down</index>
</sense>
</major>
</LexicalDatabase>

What I would like to do is output a file of index items, each containing the major or minor entries that carry that index value. It is similar to grouping by last name or city, except that each person could have one, two, or more of these. Perhaps "Schools attended" would be a good example. Anyhow, here is a sample of what I would like to output.


<IndexList>
<IndexItem value="metis">
<entry base="xbimooksk" baseHom="" />
</IndexItem>
<IndexItem value="microwave">
<entry base="âànuut" baseHom="2"/>
</IndexItem>
<IndexItem value="midday">
<entry base="nsèèlga sah" baseHom=""/>
<entry base="sèèlgyàxsk" baseHom=""/>
</IndexItem>
<IndexItem value="middle (in the _)">
<entry base="lusèèlk" baseHom=""/>
<entry base="xts'a" baseHom=""/>
</IndexItem>
</IndexList>

I did the above using the Muenchian method but found that if a major
or minor element contained more than one index element, it would only
put the first one in the resulting index list. The second one did not
appear.

My stylesheet has the following:

<xsl:key name="entries-by-index" match="LexicalDatabase/*" use=".//index"/>
<xsl:template match="LexicalDatabase">
  <xsl:for-each select="*[generate-id() = generate-id(key('entries-by-index', .//index)[1])]">
    <xsl:sort select=".//index" data-type="text"/>
    <IndexItem>
      <xsl:attribute name="value"><xsl:value-of select=".//index"/></xsl:attribute>
      <xsl:for-each select="key('entries-by-index', .//index)">
        <entry>
...

Obviously I am doing something wrong. Any ideas?

Larry
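
For reference, one direction that may address the missing second index element is to group on the index elements themselves rather than on the major and minor entries. In XSLT 1.0, xsl:value-of applied to .//index uses only the first node in document order, which would explain why a second index under the same entry never shows up. The following is only a minimal sketch under assumptions, not a tested solution for the full data: the key name index-by-value is made up for illustration, and the baseHom attribute is left out because the sample input does not show where it comes from.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Hypothetical key: index the <index> elements themselves,
       keyed by their own text, instead of indexing the entries. -->
  <xsl:key name="index-by-value" match="index" use="."/>

  <xsl:template match="LexicalDatabase">
    <IndexList>
      <!-- Muenchian step: keep only the first <index> element
           for each distinct index value. -->
      <xsl:for-each select=".//index[generate-id() =
                            generate-id(key('index-by-value', .)[1])]">
        <xsl:sort select="." data-type="text"/>
        <IndexItem value="{.}">
          <!-- One <entry> per <index> element sharing this value. -->
          <xsl:for-each select="key('index-by-value', .)">
            <!-- The ancestor that is a direct child of LexicalDatabase
                 is the owning major or minor element. -->
            <entry base="{ancestor::*[parent::LexicalDatabase]/base}"/>
          </xsl:for-each>
        </IndexItem>
      </xsl:for-each>
    </IndexList>
  </xsl:template>
</xsl:stylesheet>
```

With the sample data above, this should give separate IndexItem elements for "metis" and "half-white", each containing the xbimooksk entry. Two caveats: an entry that repeats the same index value in more than one sense would be listed once per occurrence, and the key compares the raw text, so values that differ only in whitespace would land in different groups unless normalize-space() is used in the key.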




XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list


