sgml -> * output file name control

Subject: sgml -> * output file name control
From: Adam Di Carlo <adam@xxxxxxxxxxx>
Date: 18 Apr 1999 18:29:45 -0400
I'd like to raise the issue of controlling output file names for
SGML -> * transformation.  This is mostly targeted at Norm Walsh.

Over the last two months, the Debian documentation group has worked
out a system for naming the output files generated from SGML for
different output formats.  As background, we're using a small DTD
derived from linuxdoc v1 and tuned to the requirements of our
project.  For instance, we wanted tags for packages and such.

[Note: in general, I would think it would be better to go with, say, a
simplified DocBook DTD with some customizations.  We're considering
(slowly) the idea of using a Debian-doc Formal Architecture which
might ease the transition.]

We took a hard look at the system Norm uses for output file naming
control.  Anyone can look at that system, in particular the section
on HTML file naming in Norm's documentation.  Norm uses these variables:

  %html-ext%
  %use-id-as-filename%
  %root-filename% 

However, this scheme did not meet our requirements:

- must enable HTTP content negotiation, e.g., foo.en.html and
foo.de.html.  This lets us serve many different languages from a
single directory and makes browsing more convenient for users.  For an
example, see <URL:http://www.debian.org/releases/slink/>,
particularly *.html in
<URL:http://www.debian.org/releases/slink/i386/>.

- must enable a prefix for the resulting HTML files, e.g.,
foo-chapt1title.html, foo-chapt2title.html

Moreover, we always force %use-id-as-filename% on.

On the downside, our scheme is a bit Unix-centric, notably in the
documentation (use of basename and dirname) and in the assumption that
the directory separator is '/'.

Regardless, I would be pleased if the people on this list considered
this scheme and if Norm considered whether to adopt it.

--
.....Adam Di Carlo....adam@xxxxxxxxxxxxxxxx <URL:http://www.onShore.com/>

From: Ardo van Rangelrooij <avrangel@xxxxxxxxxxx>
Subject: new version of DebianDoc-SGML with important changes
To: debian-doc@xxxxxxxxxxxxxxxx, debian-devel@xxxxxxxxxxxxxxxx
Date: 22 Mar 1999 15:58:07 -0500


I just uploaded a new version of the DebianDoc-SGML package for potato
to master.

The main new functionality consists of controlling the name of the
output files via command line options.  In the future this will also
be possible via corresponding so-called SGML Processing Instructions
in the SGML source files.  The following command line options are
provided:

  -c
  -b <basename>
  -t <topname>
  -e <extension>

The '-c' option turns on content negotiation.  This means that the
output file name will contain an indication of the natural language of
the document.  This indication is taken from the $LANG environment
variable and defaults to 'en'.  E.g., if $LANG = 'hr_HR', the indication
used will be 'hr'.
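
As an illustration only, the derivation of the language indication
could be sketched in Python as below.  The function name
language_indication is hypothetical and not part of DebianDoc-SGML,
and the sketch assumes that everything before the first '_' in $LANG
is the language; values such as 'C' are not treated specially here.

```python
import os


def language_indication(lang=None):
    """Return the language part of a $LANG-style value, or 'en' if empty.

    Hypothetical helper: mirrors the rule described in the message
    ('hr_HR' gives 'hr', default 'en'), not the package's actual code.
    """
    if lang is None:
        # Fall back to the environment when no value is passed in.
        lang = os.environ.get("LANG", "")
    # Keep only the text before the first underscore.
    code = lang.split("_", 1)[0]
    return code or "en"
```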

The last three options are used as follows for HTML (bash-like syntax
and with content-negotiation turned on):

    cnt = ".$(echo $LANG | cut -d'_' -f1)"
    ext = ".${extension:-html}"

    if basename DOESN'T contain ANY '/' character:

        dir = "${basename}"

        directory = ${dir}${ext}
        topfile   = ${topname:-"index"}${cnt}${ext}
        chapters  = ${chapter-id}${cnt}${ext}
        footnotes = "footnotes"${cnt}${ext}

    elif basename DOES contain a '/' character:

        dir = "$(dirname ${basename})"
        pre = "$(basename ${basename})-"

        directory = ${dir}${ext}
        topfile   = ${pre}${topname:-"index"}${cnt}${ext}
        chapters  = ${pre}${chapter-id}${cnt}${ext}
        footnotes = ${pre}"footnotes"${cnt}${ext}
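
To make the two cases above concrete, here is a minimal Python sketch
of the same rules.  The function html_output_names and its parameter
names are made up for illustration; they are not the interface of the
DebianDoc-SGML scripts.  The sketch assumes '/' as the directory
separator, as the scheme itself does.

```python
import posixpath


def html_output_names(basename, chapter_ids, topname=None,
                      extension=None, lang="en"):
    """Compute HTML output names following the rules quoted above.

    Hypothetical helper; content negotiation is assumed to be on.
    """
    cnt = "." + lang                    # language indication, e.g. ".en"
    ext = "." + (extension or "html")   # extension, default ".html"
    top = topname or "index"            # top file name, default "index"

    if "/" in basename:
        # Note: for names ending in '/', posixpath differs from the
        # shell's dirname/basename utilities; not handled here.
        directory = posixpath.dirname(basename)
        prefix = posixpath.basename(basename) + "-"
    else:
        directory = basename
        prefix = ""

    return {
        "directory": directory + ext,
        "topfile": prefix + top + cnt + ext,
        "chapters": [prefix + cid + cnt + ext for cid in chapter_ids],
        "footnotes": prefix + "footnotes" + cnt + ext,
    }
```

For example, with a basename of "manual" the top file would be
index.en.html inside the directory manual.html, while a basename of
"doc/foo" would give foo-index.en.html inside doc.html.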

For the other formats the output file name becomes as follows (again
with content-negotiation turned on):

    ${dir}${cnt}${${format}_extension:-${format}_default}

Further, the default extensions of the generated output files have
been changed and are now as follows:

    HTML    -> .html
    LaTeX2e -> .tex (was .latex2e)
    Lout    -> .lout
    Texinfo -> .texinfo
    Text    -> .txt (was .text)
    TextOV  -> .tov (was .textov)

    DVI     -> .dvi
    PS      -> .ps (irrespective of intermediate format)
    PDF     -> .pdf (idem)
    Info    -> .info
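
The naming rule for the non-HTML formats, combined with the default
extensions listed above, could be sketched in Python roughly as
follows.  This is only an illustration: the lowercase format keys, the
function name single_file_name, and the assumption that a
user-supplied extension is given without its leading dot are all mine,
not taken from the package.

```python
# Default extensions as listed in the message; the dictionary keys are
# hypothetical identifiers chosen for this sketch.
DEFAULT_EXTENSIONS = {
    "html": ".html",
    "latex2e": ".tex",
    "lout": ".lout",
    "texinfo": ".texinfo",
    "text": ".txt",
    "textov": ".tov",
    "dvi": ".dvi",
    "ps": ".ps",
    "pdf": ".pdf",
    "info": ".info",
}


def single_file_name(dir_part, fmt, lang="en", extension=None):
    """Build ${dir}${cnt}${ext} for a non-HTML format.

    Hypothetical helper; content negotiation is assumed to be on.
    """
    cnt = "." + lang
    # Assumption: an explicit extension is passed without the dot.
    ext = "." + extension if extension else DEFAULT_EXTENSIONS[fmt]
    return dir_part + cnt + ext
```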

The names of the scripts generating these have stayed the same.  The
scripts now also provide a '-h' option to print the help message.

Note that all this is currently not documented properly in the manual
page or the DebianDoc-SGML Markup Manual.  This will be fixed in one
of the next versions.

Thanks,
Ardo
-- 
Ardo van Rangelrooij
home email: avrangel@xxxxxxxxxxx, ardo@xxxxxxxxxx
home page:  http://www.tip.nl/users/ardo.van.rangelrooij
PGP fp:     3B 1F 21 72 00 5C 3A 73  7F 72 DF D9 90 78 47 F9


 DSSSList info and archive:  http://www.mulberrytech.com/dsssl/dssslist

