Re: [xsl] Request for design tips: transforming XML topics + map to various XHTML + meta data formats

Subject: Re: [xsl] Request for design tips: transforming XML topics + map to various XHTML + meta data formats
From: Wendell Piez <wapiez@xxxxxxxxxxxxxxxx>
Date: Fri, 20 Feb 2004 14:40:04 -0500

Hi Graham,

This is worth only $0.02, if that. (What does food for thought go for these days?) But the main development challenge (I think you'll find) in the kind of project you describe is not in organizing things for running a particular way, but in having a clear organization for maintenance and extension as you tweak what you're getting and add new formats and capabilities over time. If you handle this right, you actually preserve options for how the processes are invoked for a given run.

Because of this, and because of the way xsl:import works, you might think about starting by having separate stylesheet "shells" for your various particular output formats, which can call in a common "core" module (or modules) to perform the operations they have in common.
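To make the shell-plus-core idea concrete, here is a minimal sketch in XSLT 1.0. The file names (core.xsl, htmlhelp-toc.xsl) and the template name topic-heading are invented for illustration, and the element names (map, topicref with an href attribute, a title child in each topic) assume DITA-like input; adjust them to whatever your map actually uses. The nested list is only a stand-in for a real table-of-contents format.

```xml
<?xml version="1.0"?>
<!-- core.xsl (hypothetical name): logic shared by every output format -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Default handling of a map entry: just descend to nested entries.
       A shell overrides this to emit its own format. -->
  <xsl:template match="topicref">
    <xsl:apply-templates select="topicref"/>
  </xsl:template>

  <!-- Shared helper: heading text of the topic the current map entry
       points to. Assumes the context node is a topicref with an href,
       and that the topic's root element has a title child. -->
  <xsl:template name="topic-heading">
    <xsl:value-of select="normalize-space(document(@href)/*/title)"/>
  </xsl:template>

</xsl:stylesheet>
```

```xml
<?xml version="1.0"?>
<!-- htmlhelp-toc.xsl (hypothetical name): shell for one output format -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- xsl:import must precede all other top-level elements -->
  <xsl:import href="core.xsl"/>
  <xsl:output method="html" indent="yes"/>

  <xsl:template match="/map">
    <ul><xsl:apply-templates select="topicref"/></ul>
  </xsl:template>

  <!-- Takes precedence over the core's topicref template,
       because importing templates outrank imported ones -->
  <xsl:template match="topicref">
    <li>
      <xsl:call-template name="topic-heading"/>
      <xsl:if test="topicref">
        <ul><xsl:apply-templates select="topicref"/></ul>
      </xsl:if>
    </li>
  </xsl:template>

</xsl:stylesheet>
```

A second shell for another format would import the same core.xsl and supply its own topicref template; nothing in the core needs to know which shell is calling it.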

While this means that you will initially have separate stylesheets, and so invoke separate runs to get your various outputs, this is actually a big help during development (X stylesheet means X output, and Y means Y: very convenient; and whenever something is common to both, it goes into the core). Once you have it all working, after all, you can consolidate them back into an "uber-shell" that uses parameters to delegate the processing through separate modes (which can each, again, call out to a common core), whether those parameters are provided by the source document or at runtime. In the meantime, however, you want to avoid tripping over your own feet, and maintaining the different outputs in different stylesheets really helps with that.
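A rough sketch of what such an "uber-shell" might look like in XSLT 1.0 follows. The parameter name format, its values, and the mode names are invented for illustration; map, topicref, navtitle, href and the map's title attribute assume DITA-like input. The toc/topic output is meant to resemble an Eclipse Help table of contents, and the nested list is only a placeholder for a real HTML Help contents file. Since a mode name in XSLT 1.0 has to be written literally and cannot be computed from a parameter, the delegation goes through xsl:choose.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <!-- Hypothetical runtime parameter selecting the output format -->
  <xsl:param name="format" select="'htmlhelp'"/>

  <!-- Dispatch: one branch per format, each entering its own mode -->
  <xsl:template match="/">
    <xsl:choose>
      <xsl:when test="$format = 'htmlhelp'">
        <xsl:apply-templates select="map" mode="htmlhelp"/>
      </xsl:when>
      <xsl:when test="$format = 'eclipse'">
        <xsl:apply-templates select="map" mode="eclipse"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:message terminate="yes">
          <xsl:text>Unknown format: </xsl:text>
          <xsl:value-of select="$format"/>
        </xsl:message>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Mode "htmlhelp": nested lists (placeholder for a real contents file) -->
  <xsl:template match="map" mode="htmlhelp">
    <ul><xsl:apply-templates select="topicref" mode="htmlhelp"/></ul>
  </xsl:template>

  <xsl:template match="topicref" mode="htmlhelp">
    <li>
      <xsl:value-of select="@navtitle"/>
      <xsl:if test="topicref">
        <ul><xsl:apply-templates select="topicref" mode="htmlhelp"/></ul>
      </xsl:if>
    </li>
  </xsl:template>

  <!-- Mode "eclipse": nested topic elements inside a toc element -->
  <xsl:template match="map" mode="eclipse">
    <toc label="{@title}">
      <xsl:apply-templates select="topicref" mode="eclipse"/>
    </toc>
  </xsl:template>

  <xsl:template match="topicref" mode="eclipse">
    <topic label="{@navtitle}" href="{@href}">
      <xsl:apply-templates select="topicref" mode="eclipse"/>
    </topic>
  </xsl:template>

</xsl:stylesheet>
```

With xsltproc, for instance, a run might be invoked as `xsltproc --stringparam format eclipse uber.xsl map.xml` (uber.xsl and map.xml being placeholder file names). In a fuller version, the templates for each mode would probably live in their own imported modules rather than in one file.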

I'm aware that this may go against your desire to minimize the number of tree traversals you will perform on any given run, but actually you need to sell me on that idea. Is there a reason, besides the aesthetic one of theoretical "efficiency", that you want to render your source several times in several formats all in a single pass? And if efficiency is a concern, shouldn't we be asking not how to do it in a single traversal, but how to do it most efficiently? :-> So yes, I'm questioning that requirement.

So it's not that you won't end up with a single sheet, or what appears as a single command with runtime options; it's just how to get from here to there amidst the plethora of various alike-but-not-alike output formats.


At 11:52 AM 2/20/2004, you wrote:
Let's say I have some DITA-source topics and a topic map. Or, roughly
equivalently, some XHTML files and an XML file that does roughly what a DITA
topic map does (list XHTML files in a nested hierarchy, with, say, a
"publication"-level title and other meta data).

I want to transform those "topics + map" source files into JavaHelp, or
Eclipse Help, or HTML Help 1.x, or MS Help 2, or... any format where the
content is (X)HTML (okay, perhaps HTML 3.2, or 4.01, if you're picky ;-),
with one or more format-specific meta data files for navigating that XHTML
(such as table of contents and index entries).

Typically, the format-specific meta data files consist of data gleaned from
both the XML topics and the topic map. For example, to create an HTML Help
1.x .hhc file (which is essentially just a set of nested unordered HTML
lists, where each list item contains a topic heading and, optionally, a topic
file path), you need to combine the TOC hierarchy and file paths from the
topic map with the heading text from each topic. And to create a .hhk
(keyword index) file, you need to use the topic map to collect index entries
defined in each topic.

I'd like to supply a topic map to a single XSLT stylesheet, and have that
XSLT stylesheet output, for example (for HTML Help 1.x):

- The topics referred to in the map, transformed to (X)HTML (or, if they
began life as XHTML, transformed to some simplified XHTML; without, for
example, development-environment specific <meta> elements)... I'm assuming
here a one-to-one relationship with the original topic source files and
these resulting (X)HTML files.
- .hhc
- .hhk
- .hhp

I'd really appreciate advice on the most efficient way to manage the
processing of all this stuff between the various templates in the XSLT style
sheet.

For example (again, for HTML Help 1.x): to produce the .hhc, I need to parse
the input topic map and retrieve the headings from each topic. Easy: match
the topic map root element, traverse the topic map items, and use their
topic file paths in conjunction with the document function to get the
headings for each topic.  And the .hhp is just as straightforward. But I'm
trying to conceive of some way of accessing each topic once only, to extract
the heading text for the .hhc, the index keywords for the .hhk, and the
content for the simplified XHTML.

Can anyone point me to an example, or offer guidance on the XSLT style sheet
design? To recap: I want a *single* XSLT style sheet (okay; it will probably
import style sheets for each "component") that takes as input a topic map,
and outputs HTML files and format-specific meta files. I'd like the internal
processing to be streamlined, so that topics are only opened once (or am I
making it unnecessarily hard for myself, by making this a requirement?).

And finally... I know, this is probably asking too much... since the
transformations to Eclipse Help, HTML Help 1.x, or MS Help 2 all involve
pretty much the same sort of processing (such as merging map hierarchy with
topic titles), I'm wondering how hard it would be to have a generalized XSLT
stylesheet for creating "HTML plus format-specific meta data" from "topics
(containing contents and some embedded meta data) plus topic map (topic
hierarchy and some other meta data)". Where the output format is specified
as an XSLT parameter that acts as a switch to select format-specific
template processing... or even... the topic map itself specifies the desired
output formats, and the XSLT style sheet produces them all in a single
pass... (not iteratively: call XSLT stylesheet with output parameter set to
"Eclipse", then call it again with output parameter set to "HTML Help",
then...) Am I dreaming?

Graham Hannington

XSL-List info and archive:

"Thus I make my own use of the telegraph, without consulting
the directors, like the sparrows, which I perceive use it
extensively for a perch." -- Thoreau
