RE: [xsl] Assistance with position-based Table of Contents (TOC)

Subject: RE: [xsl] Assistance with position-based Table of Contents (TOC)
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Thu, 22 Oct 2009 16:35:36 +0100

I think you should be using a pipeline of transformations. The first
transformation should convert your "semantic XML" (BakeDesc, cookie etc)
into regular "document" XML (section/subsection/para etc), and then your
second transformation should be concerned with rendering (including
generation of the TOC). You are running into excess complexity because
you're trying to do everything at once: you need some separation of
concerns.
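
As a rough sketch of what I mean (XSLT 1.0, untested against your real
data): the first stylesheet below assumes that every structural division
is an element with a <title> child, and renames all of them except
<chapter> to <section>. That rule, and the choice of <section> as the
generic name, are assumptions for illustration; your DTD may need a
different test.

```xml
<!-- Stage 1 (sketch): normalize "semantic" elements to generic sections. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copy everything that is not overridden below. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- ASSUMPTION: any element with a title child, other than chapter,
       is a structural division (section, sect1, BakeDesc, cookie, ...). -->
  <xsl:template match="*[title][not(self::chapter)]">
    <section>
      <xsl:apply-templates select="@*|node()"/>
    </section>
  </xsl:template>

</xsl:stylesheet>
```

The second stylesheet then only has to know about <chapter> and
<section>. Here it writes a plain-text TOC, using xsl:number with
level="multiple" for the 01.02.03 numbering and two spaces of indent per
ancestor level (the indent width is arbitrary).

```xml
<!-- Stage 2 (sketch): render a TOC from the normalized document. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Visit every chapter and section in document order. -->
  <xsl:template match="/">
    <xsl:apply-templates select="//chapter | //section"/>
  </xsl:template>

  <xsl:template match="chapter | section">
    <!-- Indent: one step for each enclosing chapter or section. -->
    <xsl:for-each select="ancestor::chapter | ancestor::section">
      <xsl:text>  </xsl:text>
    </xsl:for-each>
    <!-- Hierarchical number built from sibling positions at each level. -->
    <xsl:number level="multiple" count="chapter | section" format="01.01"/>
    <xsl:text> </xsl:text>
    <xsl:value-of select="title"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

Note that xsl:number derives the chapter number from the chapter's
position among its siblings, not from the chapnum attribute; if those
can differ, output the attribute for the first component instead. Run
the second stylesheet on the output of the first.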

Regards,

Michael Kay
http://www.saxonica.com/
http://twitter.com/michaelhkay

> -----Original Message-----
> From: Michael Friedman [mailto:michael.friedman@xxxxxxxxxxxxxxxxxx]
> Sent: 22 October 2009 16:22
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: [xsl] Assistance with position-based Table of Contents (TOC)
>
> Greetings,
>
> I am working on a project where I am trying to generate a
> Table of Contents.
> Bear with me, I know this has been done before and I've
> looked in the archives and not found an example quite like
> this. My primary goal is to produce a set of TOC lines that
> are indented based on chapter/section/subsection level and to
> provide the sequential section number in front of the title
> in the TOC. For example:
>
> Goal:     01.02.03 Snickerdoodle Recipes
>
> Where 01=chapter level, 02=section level, and 03=sub-section level.
>
> The irritating factor in this is that my XML is not allowed
> to change from its current state and there are no IDs or
> attributes to key off of, AND, the second- and third-level
> elements are not consistent. That is, in the XML, the
> structure could be like this:
>
> <chapter chapnum="1">
>     <title>Cooking with XML</title>
>     <section>
>         <title>XML Cookies</title>
>         <sect1>
>            <title>Snickerdoodles</title>
>         </sect1>
>     </section>
> </chapter>
>
> OR it could be:
>
> <chapter chapnum="3"> <!-- chapter level -->
>     <title>Baking with XPATH</title>
>     <BakeDesc> <!-- section level -->
>          <title>Crunching Number Cookies</title>
>          <cookie> <!-- subsection level -->
>              <title>Lemon Cookies</title>
>          </cookie>
>     </BakeDesc>
> </chapter>
>
> Currently, I am doing a full pass through the XML using a
> <xsl:for-each> that names each possible iteration of
> available elements in the DTD (there are about 40 unique and
> sometimes overlapping elements in my project):
>      <xsl:for-each select="//chapter | //section | //sect1 |
> //BakeDesc | //cookie">
>
> Then I determine the level of each matched element:
>      <xsl:attribute name="level">
>          <xsl:value-of
> select="count(ancestor-or-self::chapter |
> ancestor-or-self::section | ancestor-or-self::sect1 |
> ancestor-or-self::BakeDesc | ancestor-or-self::cookie)"/>
>      </xsl:attribute>
>
> ...from which I can determine the level away from the root
> element, do some calculations and create an indent.
>
> Because the number of elements on which to match is growing
> as I move through this 1000+ page document, it occurred to me
> it would be better to ignore the actual names of the elements
> and create a TOC based on position away from the root. I also
> need a way to create the section number based on the position
> away from the root element, which I can't seem to currently do.
>
> XPATH is not my strong suit, so I am turning to this forum to
> help me with my deficiency. When I try to match <xsl:for-each
> select="//*"> against everything, the level attribute does
> not work the way I want it because the result is always "1"
> since my context node is the root or chapter element. I can't
> get a decent level or position read.
>
> What I'd like to do:
> 1) Read through the XML <xsl:for-each select="//*"> or
> otherwise (without maintaining a huge list of matches)
> 2) Extrapolate from the current node the
> chapter/section/subsection level to
> produce: 01.02.03, etc and appropriate indents from a single
> node. If it's a chapter node it would only have 01, a section
> node 01.02, and subsection would be 01.02.03.
> 3) Not need to know what the elements are.
> 4) Discover if this is even feasible.
>
> Thanks in advance for your big brains.
>
> Michael Friedman
> P.S. XSLT 1.0, Saxon 6.5.5 via Oxygen 11. Sometimes Arbortext 5.
