[xsl] Finding and processing index terms

Subject: [xsl] Finding and processing index terms
From: Dan Vint <dvint@xxxxxxxxx>
Date: Wed, 15 Feb 2006 07:14:09 -0800
I have a document that has indexterms scattered through various content and levels within the document. The document is configured to support a publicOnly and a membersOnly version. This is handled with two different attributes that are available on all division tags (div1 to div10) and on the table tag. So a whole division might be tagged public, but a table or a subdivision within that division might have the members-only attribute set.

So I have a simplified document like this:

<doc>
<div1><head>Chapter 1</head>
<div2 membersOnly="Yes" statusCR="Yes">
<indexterm
primaryDesc="Yes"><primary>AboveGroundIndicator
type</primary><secondary>definition MEMBERS
INDEX</secondary>
</indexterm>
<indexterm><primary>base
types</primary><secondary>AboveGroundIndicator type MEMBERS
INDEX</secondary>
</indexterm>
<head id="bAboveGroundIndicator">
<baseclass>AboveGroundIndicator</baseclass>
 </head>
<p>Full name: Above Ground Indicator</p>
<p class="CR">CR STATUS INFORMATION</p>
<div-sub membersOnly="Yes" statusCR="Yes">
<head> 	Description</head>
<p>Indicates
whether
the item is above ground.</p>
</div-sub>
<table border="1" id="oTable" membersOnly="Yes" statusCR="Yes"
width="935">
<thead>
<tr valign="top">
<th colspan="15" >  Tag/Type
</th>
<th >Usage 	</th>
<th > 	Description
</th>
</tr>
</thead>
<tbody>
<tr membersOnly="Yes" statusCR="Yes" valign="top">
<td class="nest1">
<p>&#xa0;</p>
</td>
<td class="nest1" colspan="16">
<p> <baseclass
ref="bAboveGroundIndicator">AboveGroundIndicator</baseclass>

<indexterm><primary>AboveGroundIndicator type</primary><secondary>use of
MEMBERS INDEX</secondary>
</indexterm>

(CR STATUS) <baseclass></baseclass>,
Usage: Required
			</p>
</td>
</tr>
<tr membersOnly="Yes" statusCR="Yes" valign="top">
<td class="nest1">
<p>&#xa0;</p>
</td>
<td class="nest1" colspan="16">
<p>End of Type:
<baseclass ref="bAboveGroundIndicator">AboveGroundIndicator</baseclass>
 </p>
</td>
</tr>
</tbody>
</table>


<table border="1" id="oTable" publicOnly="Yes" statusCR="Yes" width="935">
<thead>
<tr valign="top">
<th colspan="15" onclick="sorter(3,1)" style="cursor:hand;"> Tag/Type
<p HelpOnly="Yes">Select to Sort</p>
</th>
<th onclick="sorter(2,1)" style="cursor:hand;">Usage
<p HelpOnly="Yes">Select to Sort</p>
</th>
<th onclick="sorter(1,1)" style="cursor:hand;"> Description
<p HelpOnly="Yes">Select to Sort</p>
</th>
</tr>
</thead>
<tbody>
<tr membersOnly="Yes" statusCR="Yes" valign="top">
<td class="nest1">
<p>&#xa0;</p>
</td>
<td class="nest1" colspan="16">
<p> <baseclass ref="bAboveGroundIndicator">AboveGroundIndicator</baseclass>

<indexterm><primary>AboveGroundIndicator type</primary><secondary>use of
PUBLIC INDEX</secondary>
</indexterm>

 (CR STATUS) 			Usage: Required
</p>
</td>
</tr>
<tr membersOnly="Yes" statusCR="Yes" valign="top">
<td class="nest1">
<p>&#xa0;</p>
</td>
<td class="nest1" colspan="16">
<p>End of Type:
<baseclass ref="bAboveGroundIndicator">AboveGroundIndicator</baseclass>
 </p>
</td>
</tr>
</tbody>
</table>
</div2>
</div1>
</doc>

So this document has a single chapter with a <div2> that is membersOnly=Yes. Inside it are two tables: one is the members version of the table and the other is the public version of the same table. When processing with membersOnly=Yes, I want the second table and all its content ignored. My stylesheets handle that on the processing side and drop the extra table with no problem.

My difficulty comes in trying to gather the embedded <indexterm> information from just those areas that are in the membersOnly sections. The template that follows gathers all the index terms together and tries to group them with the saxon:group extension element (I'm still using XSLT v1 at this point).


<xsl:template name="processIndexTerms">
<saxon:group
select="//indexterm[ancestor::*[(starts-with(local-name(),'div') or local-name()='table')
and @publicOnly!='Yes']]"
group-by="normalize-space(primary)">

.. other stuff here ...
</saxon:group>
</xsl:template>


Now my DTD has default values of No set for both the publicOnly and membersOnly attributes. I'm assuming that Saxon is reading and applying these defaults; it seems to in other stylesheets I've built.
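
A quick way to test that assumption is a throwaway diagnostic stylesheet. The sketch below is illustrative only and is not part of the stylesheets described here. It prints the publicOnly and membersOnly values exactly as the processor sees them on every div* and table element. If the DTD defaults are being applied, no pair of brackets should come out empty.

```xslt
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Diagnostic only: list each div*/table element with the attribute
       values the processor reports. An empty [] means the attribute is
       absent from the tree, i.e. the DTD default was not supplied. -->
  <xsl:template match="/">
    <xsl:for-each select="//*[starts-with(local-name(),'div') or local-name()='table']">
      <xsl:value-of select="local-name()"/>
      <xsl:text>: publicOnly=[</xsl:text>
      <xsl:value-of select="@publicOnly"/>
      <xsl:text>] membersOnly=[</xsl:text>
      <xsl:value-of select="@membersOnly"/>
      <xsl:text>]&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

This matters for the comparisons that follow: in XPath 1.0, @publicOnly!='Yes' is false when the attribute is missing altogether, whereas not(@publicOnly='Yes') is true in that case.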

Anyway, the @publicOnly!='Yes' portion of the XPath doesn't seem to be working as expected. Testing @publicOnly='Yes' finds everything where the attribute is equal to Yes, but the negative version seems to have no effect. I've tried the following variations without any success:

#1
//indexterm[ancestor::*[(starts-with(local-name(),'div') or local-name()='table') and @publicOnly!='Yes'][1]]


#2
//indexterm[ancestor::*[(starts-with(local-name(),'div') or local-name()='table')][ @publicOnly!='Yes']]


#3
//indexterm[ancestor::*[(starts-with(local-name(),'div') or local-name()='table')][1][ @publicOnly!='Yes']]


#4
//indexterm[ancestor::*[(starts-with(local-name(),'div') or local-name()='table') and @publicOnly='No']]


//indexterm[ancestor::*[(starts-with(local-name(),'div') or local-name()='table') and not(@publicOnly)]]


I thought the first variation was what I needed, since it tries to test just the first ancestor of the match, but the results were no different than without the [1]. The next variation (#2) produced the same set of results.


#3 reduced the results, but it doesn't find all the values inside the div2 elements, although it does find the correct entry from the tables. It misses the second indexterm inside the div2 element.

The last version gave me the same results.
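
For comparison, here is a sketch of a different formulation, not one of the variations above. In XPath 1.0 a predicate on ancestor::* is satisfied as soon as any single ancestor passes the test. An indexterm inside the publicOnly table therefore still passes a test like @publicOnly!='Yes', because the enclosing div2 qualifies (given the DTD default of No). Wrapping a positive test in not() instead requires that no div* or table ancestor carries publicOnly='Yes'. The standalone stylesheet and its plain-text output are assumptions made purely for illustration. Presumably the same select expression could be used in the saxon:group element instead.

```xslt
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Keep an indexterm only if NONE of its div*/table ancestors is
       marked publicOnly='Yes'. This also behaves the same whether or
       not the DTD default (publicOnly='No') has been applied. -->
  <xsl:template match="/">
    <xsl:for-each select="//indexterm[not(ancestor::*[
          (starts-with(local-name(),'div') or local-name()='table')
          and @publicOnly='Yes'])]">
      <xsl:value-of select="normalize-space(primary)"/>
      <xsl:text> / </xsl:text>
      <xsl:value-of select="normalize-space(secondary)"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Run against the simplified document above, this should list the three MEMBERS INDEX entries and skip the PUBLIC INDEX one, since that is the only indexterm with a publicOnly="Yes" table among its ancestors.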

Any ideas?

..dan
---------------------------------------------------------------------------
Danny Vint

Specializing in Panoramic Images of California and the West
http://www.dvint.com

voice: 510-522-4703

When H.H. Bennett was asked why he preferred to be out
shooting landscapes rather than spending time in his portrait studio:

"It is easier to pose nature and less trouble to please."

http://www.portalwisconsin.org/bennett_feature.cfm
