[xsl] Transforming HTML Help contents (.hhc) file into Aurigma Deep Tree TOC .htm files

Subject: [xsl] Transforming HTML Help contents (.hhc) file into Aurigma Deep Tree TOC .htm files
From: Graham Hannington <Ghannington@xxxxxxx>
Date: Fri, 5 Sep 2003 16:40:55 +0100
I need help writing an XSLT stylesheet to transform an HTML Help contents
file (.hhc) into multiple Aurigma Deep Tree TOC .htm files
(see /DeepTree/Overview2.htm, with many thanks to Fedor Skvortsov for making
this code available... see the copyright notice in the "P.S." at the end of
this email).

I'm CCing the doxygen discussion list because I need this stylesheet
primarily for publishing, on the Web, doxygen-generated API references.

I think that such a stylesheet might also be useful to many other people;
especially if it could easily be rewritten for other nested lists of table
of contents (TOC) nodes, similar to .hhc files (say, for DocBook).

I'd greatly appreciate any advice or assistance with this.

Why do I need this stylesheet?

I have many publications whose TOCs - just the plain ASCII text of the
headings, without any markup - are over a megabyte. I want to present these
publications on a website, with the TOC displayed as an expandable nested
list, similar to the Contents pane in the Microsoft HTML Help (.chm) viewer.
I already do this on an intranet site, using the HTML Help ActiveX control
in an HTML frameset to display the .hhc, but the size of these TOCs makes
this method impractical over a slower Web connection. The TOCs are simply
too large to send over the Web in one chunk.

For nearly two years, I've wanted to emulate the Microsoft Developer Network
(MSDN) Library website (http://www.msdn.microsoft.com/library/default.asp),
which loads only the root TOC nodes by default, and thereafter loads TOC
nodes "on demand", when the user clicks an expandable node. As far as I can
tell, by rummaging around the site's source files, clicking an expandable
TOC node invokes an Internet Explorer-specific behavior (.htc) that
dynamically loads a TOC subtree, and inserts that subtree into the existing
displayed TOC. (If you use a browser other than IE - such as Netscape - then
the TOC is not quite as smart as this.) But I've never found the time to
"re-engineer" this.

Today, I stumbled across Aurigma Deep Tree (see the link at the top of this
email), which not only does most of what the MSDN Library website TOC does,
but is cross-browser compatible, too. (Unfortunately, unlike the MSDN
Library TOC, the Aurigma solution is frame-based, and doesn't highlight the
current topic in the TOC, or synchronize the TOC with the topic.)

With Aurigma Deep Tree, each TOC subtree is defined in a separate HTML file,
and the nodes in each subtree are defined in those files as JavaScript array
elements. The Aurigma website describes all this in detail, but I'll give a
couple of examples here. A "folder" TOC node (that is, a node that contains
a subtree of child nodes) looks like this:

oNodes[1] = new node("Services", null , "folder", "main",

The node displayed as "Services" is an empty folder; "null" indicates that
it is not linked to an HTML topic (but if it were, the topic would be
displayed in the frame called "main"). When you click the node, it expands
to show the subtree defined in TOC/Services.htm.

An item (or, if you like, "leaf") node looks like this:

oNodes[2] = new node("Support", "Support.htm", "item", "main");
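
To make the shape of these assignments concrete, here is a rough JavaScript
sketch of a helper that formats one node line as text. The helper name
formatNode and its input object are hypothetical, not part of Aurigma Deep
Tree. In particular, it assumes that a folder node takes the path of its
subtree file as a fifth argument, which the example above suggests but does
not confirm.

```javascript
// Hypothetical helper (not part of Aurigma Deep Tree): formats one node
// assignment as a line of text.
// ASSUMPTION: a folder node takes its subtree file as a fifth argument.
function formatNode(index, node) {
  // JSON.stringify yields a double-quoted, escaped string literal.
  const quote = (s) => JSON.stringify(String(s));
  const args = [
    quote(node.name),
    node.href == null ? "null" : quote(node.href), // null = no linked topic
    quote(node.subtree ? "folder" : "item"),
    quote(node.target || "main"), // frame that displays the topic
  ];
  if (node.subtree) {
    args.push(quote(node.subtree));
  }
  return `oNodes[${index}] = new node(${args.join(", ")});`;
}

console.log(formatNode(2, { name: "Support", href: "Support.htm" }));
// oNodes[2] = new node("Support", "Support.htm", "item", "main");
```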

My thoughts so far

First, I'd planned to use the W3C Tidy tool to convert the .hhc into XHTML,
so that it can be transformed by an XSLT stylesheet. An .hhc is essentially
just an HTML document containing nested <ul> lists, where each TOC node is
defined inside an <li> element, like this:

<li>
	<object type="text/sitemap">
		<param name="Name" value="Topic heading"/>
		<param name="Local" value="topic.htm"/>
		<param name="ImageNumber" value="11"/>
	</object>
</li>

The <param> elements and <li> elements aren't necessarily closed/well-formed
in a .hhc, hence the need for Tidy... and TOC nodes can be "empty": they
don't necessarily contain a <param name="Local"> element.
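
For checking what a given node contains before committing to the Tidy
route, a small JavaScript sketch like the following could read the <param>
pairs straight from the raw .hhc text. This is a regex-based shortcut, not
XSLT and not a real HTML parser; the function name readSitemapParams is
made up for illustration, and it assumes each <param> has a double-quoted
name attribute followed by a double-quoted value attribute.

```javascript
// Reads the name/value pairs out of one text/sitemap <object> block.
// Works on raw text, so unclosed <param> tags are tolerated.
// ASSUMPTION: attributes appear as name="..." followed by value="...".
function readSitemapParams(objectHtml) {
  const params = Object.create(null);
  const paramRe = /<param\s+name\s*=\s*"([^"]*)"\s+value\s*=\s*"([^"]*)"\s*\/?>/gi;
  let match;
  while ((match = paramRe.exec(objectHtml)) !== null) {
    // Param names are compared case-insensitively ("Local" vs "local").
    params[match[1].toLowerCase()] = match[2];
  }
  return {
    name: params.name !== undefined ? params.name : "",
    // null marks an "empty" node that is not linked to a topic.
    local: params.local !== undefined ? params.local : null,
  };
}

const sample =
  '<OBJECT type="text/sitemap">' +
  '<param name="Name" value="Topic heading">' +
  '<param name="Local" value="topic.htm">' +
  "</OBJECT>";
console.log(readSitemapParams(sample));
```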

Then I'd write an XSLT template that would, say, iterate (for-each) over the
<li> elements inside a <ul>, build the necessary oNodes[x] assignments, and
save them as a TOC/something_or_other.htm file, starting with TOC/Root.htm
(as required by the default Aurigma JavaScript).
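
The splitting logic itself, independent of XSLT, might look roughly like
this JavaScript sketch: walk the nested TOC and collect one list of oNodes
lines per subtree file. Several details are assumptions rather than
anything taken from the Aurigma documentation: the subtree file names
(TOC/n1.htm, TOC/n2.htm, ...), the oNodes index starting at 0, the fifth
"subtree file" argument on folder nodes, and the input shape ({ name,
local, children }). The function name splitToc is likewise hypothetical.

```javascript
// Splits a nested TOC into one entry per subtree file:
//   { "TOC/Root.htm": [line, ...], "TOC/n1.htm": [line, ...], ... }
// ASSUMPTIONS: file naming, 0-based oNodes indexes, and the fifth folder
// argument are illustrative guesses, not confirmed Aurigma behavior.
function splitToc(rootNodes) {
  const files = {};
  let counter = 0;
  const q = (s) => (s == null ? "null" : JSON.stringify(String(s)));

  function emit(nodes, fileName) {
    const lines = [];
    files[fileName] = lines; // register the parent file before its children
    nodes.forEach((node, i) => {
      const children = node.children || [];
      if (children.length > 0) {
        // Folder: point at a new subtree file, then fill that file.
        const childFile = `TOC/n${++counter}.htm`;
        lines.push(
          `oNodes[${i}] = new node(${q(node.name)}, ${q(node.local)}, "folder", "main", ${q(childFile)});`
        );
        emit(children, childFile);
      } else {
        // Leaf: a plain item linked (or not) to a topic.
        lines.push(
          `oNodes[${i}] = new node(${q(node.name)}, ${q(node.local)}, "item", "main");`
        );
      }
    });
  }

  emit(rootNodes, "TOC/Root.htm");
  return files;
}

const toc = [
  { name: "Services", local: null, children: [{ name: "Support", local: "Support.htm" }] },
  { name: "About", local: "About.htm" },
];
console.log(splitToc(toc));
```

Because each subtree lands in its own file, the browser only ever has to
fetch the nodes for the folder the user has just expanded.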

And... that's as far as I've got. I figured I could either stagger and
stumble ahead on my own, stealing googled nuggets of XSLT where possible,
or see whether anybody else out there could also see the merit and general
usefulness of such a thing. (Wouldn't it be nice if, for example, not only
HTML Help projects, but all DocBook-sourced publications could take
advantage of this?) Or maybe - it wouldn't be the first time this has
happened to me - something like this is already widely available, and I just
don't know about it?

Of course, if someone already has a better method than Aurigma Deep Tree for
displaying massive TOCs on the Web, then I'd like to hear from them! (In
particular, it would be nice to have TOC syncing, and not to use frames...)

Looking forward to serving my multi-megabyte TOCs real soon now :-).

Graham Hannington

P.S. From the readme.txt supplied in the Aurigma Deep Tree downloadable zip:

 Aurigma Deep Tree 2.0 (c), Fedor Skvortsov, Aurigma Inc. 2001-2003
 Mailto: support@xxxxxxxxxxx
 WWW: http://www.aurigma.com

	This script and all its parts are free, specify this 
	text and copyright notice in the source code to use it.

and, similarly, from the main TOC.htm that contains the bulk of the

' Aurigma Deep Tree 2.0 (c), Fedor Skvortsov, Aurigma Inc. 2001-2003
' Mailto: support@xxxxxxxxxxx
' WWW: http://www.aurigma.com
' This script and his components are free, but must be
' maintained this text and the copyright to use it.

 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
