[xsl] Assistance with position-based Table of Contents (TOC)

Subject: [xsl] Assistance with position-based Table of Contents (TOC)
From: "Michael Friedman" <michael.friedman@xxxxxxxxxxxxxxxxxx>
Date: Thu, 22 Oct 2009 10:22:23 -0500

I am working on a project where I am trying to generate a Table of Contents.
Bear with me, I know this has been done before and I've looked in the
archives and not found an example quite like this. My primary goal is to
produce a set of TOC lines that are indented based on
chapter/section/subsection level and to provide the sequential section
number in front of the title in the TOC. For example:

Goal:     01.02.03 Snickerdoodle Recipes

Where 01=chapter level, 02=section level, and 03=sub-section level.

The irritating factor in this is that my XML is not allowed to change from
its current state and there are no IDs or attributes to key off of, AND
the second and third level elements are not consistent. That is, in the XML,
the structure could be like this:

<chapter chapnum="1">
    <title>Cooking with XML</title>
        <title>XML Cookies</title>

OR it could be:

<chapter chapnum="3"> <!-- chapter level -->
    <title>Baking with XPATH</title>
    <BakeDesc> <!-- section level -->
         <title>Crunching Number Cookies</title>
         <cookie> <!-- subsection level -->
             <title>Lemon Cookies</title>

Currently, I am doing a full pass through the XML using an <xsl:for-each>
that names each possible iteration of available elements in the DTD (there
are about 40 unique and sometimes overlapping elements in my project):
     <xsl:for-each select="//chapter | //section | //sect1 | //BakeDesc |

Then I determine the level of each matched element:
     <xsl:attribute name="level">
         <xsl:value-of select="count(ancestor-or-self::chapter |
             ancestor-or-self::section | ancestor-or-self::sect1 |
             ancestor-or-self::BakeDesc | ancestor-or-self::cookie)"/>
     </xsl:attribute>

...from which I can determine the level away from the root element, do some
calculations and create an indent.

Because the number of elements on which to match keeps growing as I move
through this 1000+ page document, it occurred to me it would be better to
ignore the actual names of the elements and create a TOC based on position
away from the root. I also need a way to create the section number based on
the position away from the root element, which I currently can't seem to do.

XPath is not my strong suit, so I am turning to this forum to help me with
my deficiency. When I try to match <xsl:for-each select="//*"> against
everything, the $level attribute does not work the way I want it to, because
the result is always "1" since my context node is the root or chapter
element. I can't get a decent level or position read.
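
To make the level problem concrete, here is a minimal XSLT 1.0 sketch of a
name-independent level calculation. It rests on one assumption that is not
guaranteed by the DTD: that every element belonging in the TOC has a <title>
child, and that nothing else does. The <toc> and <entry> output elements are
hypothetical names used only for illustration. Inside <xsl:for-each> the
context node is the element being visited, so the ancestor-or-self:: axis is
evaluated relative to that element.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <!-- <toc> and <entry> are hypothetical output names -->
    <toc>
      <!-- Assumption: a TOC-worthy element is any element with a <title> child -->
      <xsl:for-each select="//*[title]">
        <entry>
          <!-- Level = this element plus every ancestor that also has a <title> -->
          <xsl:attribute name="level">
            <xsl:value-of select="count(ancestor-or-self::*[title])"/>
          </xsl:attribute>
          <xsl:value-of select="normalize-space(title)"/>
        </entry>
      </xsl:for-each>
    </toc>
  </xsl:template>
</xsl:stylesheet>
```

If the document element itself carries a <title>, it would count as level 1
and chapters would land on level 2, so the predicate may need adjusting.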

What I'd like to do:
1) Read through the XML with <xsl:for-each select="//*"> or otherwise
(without maintaining a huge list of matches).
2) Extrapolate from the current node the chapter/section/subsection level to
produce 01.02.03, etc., and appropriate indents from a single node. A
chapter node would only have 01, a section node 01.02, and a subsection
node 01.02.03.
3) Not need to know what the elements are.
4) Discover if this is even feasible.
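
The wish list above could perhaps be approached with <xsl:number>, which is
part of XSLT 1.0. The following is only a sketch under the same assumption
that "has a <title> child" identifies a TOC-worthy element; the two-space
indent and the plain-text output are arbitrary choices for illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Visit every titled element in document order, whatever its name -->
  <xsl:template match="/">
    <xsl:apply-templates select="//*[title]"/>
  </xsl:template>

  <xsl:template match="*[title]">
    <!-- Indent: two spaces per titled ancestor -->
    <xsl:for-each select="ancestor::*[title]">
      <xsl:text>  </xsl:text>
    </xsl:for-each>
    <!-- One zero-padded number per titled ancestor-or-self, joined by "." -->
    <xsl:number level="multiple" count="*[title]" format="01.01"/>
    <xsl:text> </xsl:text>
    <xsl:value-of select="normalize-space(title)"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

With level="multiple", each number is the element's position among its
sibling elements that match the count pattern, so a third-level element
comes out in the form 01.02.03. Note that the chapter number is derived from
position here, not from the chapnum attribute, and a titled document element
would contribute a leading number of its own.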

Thanks in advance for your big brains.

Michael Friedman
P.S. XSLT 1.0, Saxon 6.5.5 via Oxygen 11. Sometimes Arbortext 5.
