Re: (dsssl) Practical Bibliography question

Subject: Re: (dsssl) Practical Bibliography question
From: "Markus Hoenicka" <hoenicka_markus@xxxxxxxxxxxxxx>
Date: Mon, 8 Oct 2001 00:31:54 -0500
I don't want to appear too intrusive, but I think the model that you
outline is actually too simplistic to be suitable for general use. It
might work for your current task (which is a good enough reason to
pursue this path), but if you look closer the problems creep in. I
have a background in biology/biochemistry/pharmacology, and the
formatting requirements for bibliographies in this field are far more
demanding than what you outlined. In most journals, citing a paper
needs authors, title, publication year, journal name, volume number,
issue number, start and end page. Citing a conference proceeding or a
chapter in a book adds editors, series editors, series title,
publishers and whatnot. Every journal has adopted its own rules for
sequence, formatting and punctuation. In-text citations can be
numerical in square brackets, numerical in angle brackets, numerical
in superscripts, or author-date (with varying number of authors cited,
of course). Multiple citations with adjacent numbers can be cited
explicitly or be folded into a range. First and subsequent citations
of a reference may be treated differently, e.g. the first citation
lists all authors, while subsequent ones list only the first author
followed by "et al.". None of this is much fun, unfortunately.
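
To make the first/subsequent rule concrete, here is a minimal Python sketch. It is not how RefDB does it; the function names, the tuple layout, and the sample surnames are all hypothetical, and real journal styles have many more cases (author limits, "and" versus "&", and so on).

```python
def format_authors(authors, first_citation):
    """Return the author part of an author-date label.

    Assumed rule (from the paragraph above): the first citation lists
    all authors, later citations list the first author plus "et al.".
    """
    if len(authors) == 1:
        return authors[0]
    if first_citation:
        return ", ".join(authors[:-1]) + " and " + authors[-1]
    return authors[0] + " et al."


def cite_all(citations):
    """Label citations given in document order.

    citations: list of (ref_id, authors, year) tuples (hypothetical layout).
    """
    seen = set()
    labels = []
    for ref_id, authors, year in citations:
        first = ref_id not in seen  # first time this reference is cited?
        seen.add(ref_id)
        labels.append("(%s, %s)" % (format_authors(authors, first), year))
    return labels


# Placeholder data, not a real reference.
example = [
    ("r1", ["Smith", "Jones", "Lee"], 2001),
    ("r1", ["Smith", "Jones", "Lee"], 2001),
]
labels = cite_all(example)
```

The point of the sketch is only that the label depends on the citation history of the whole document, which is awkward to track inside a stylesheet.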

While I admire your courage in implementing this in DSSSL, I still think
DSSSL plus external preformatting is more suitable for this task. This
is not beautiful in any sense, but it appears to work. The strategy in
my RefDB package is like this (I use DocBook tag names, but I assume
TEI is not too different):

In-text citations use a citation element with one to many xref
elements. The latter specify the ID of the reference in an SQL
database. An additional xref element with a special attribute is used
in citations with more than one xref.
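
As an illustration of this markup, the following Python sketch reads the xref IDs out of each citation element in document order. Two things are assumptions: that the reference ID sits in the standard DocBook linkend attribute (the description above does not name the attribute), and the sample fragment with its IDs, which is made up. The special attribute on the additional xref is not modelled because it is not named here either. Python is used only for illustration; in the actual workflow this extraction is done by OpenJade.

```python
import xml.etree.ElementTree as ET

# Hypothetical DocBook fragment; the IDs are placeholders.
SAMPLE = """<para>Some claim <citation><xref linkend="ID12"/></citation>
and another one <citation><xref linkend="ID7"/><xref linkend="ID12"/>
</citation>.</para>"""


def extract_citations(doc_xml):
    """Return one list of reference IDs per citation element.

    Both the order of the citations and the order of the xrefs inside
    each citation follow document order.
    """
    root = ET.fromstring(doc_xml)
    result = []
    for citation in root.iter("citation"):
        ids = [xref.get("linkend") for xref in citation.iter("xref")]
        result.append(ids)
    return result


citations = extract_citations(SAMPLE)
```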

We have to write an XML document for each bibliography style (i.e. for
each supported journal) that contains all formatting and punctuation
rules for the in-text citations and the bibliography. These styles are
stored in an SQL database for easy access.

The references themselves are stored in another SQL database. They can
contain additional information, such as keywords, notes, and
abstracts, that makes them easier to retrieve.

First we use OpenJade to extract a list of all citation-related
xrefs. Their relation (sequence of the citations, sequence of xrefs
inside the same citation) is preserved. The resulting XML document is
fed to the bibliography tool which pulls the necessary references from
the SQL database, using the proper bibliography style. The tool
creates "cooked" bibliography entries containing bibliomset elements
with the bibliography data proper ("cooked" means it contains all
punctuation and similar characters which need to be
generated). Additional bibliomset elements are provided for multiple
citations. This way, multiple in-text citations can be displayed
either according to the bibliography style (e.g. as [1-3,5,7-10]) or
as individual citations ([1,2,3,5,7,8,9,10]). The latter case may be
wrong from the viewpoint of the bibliography style, but it preserves
the hyperlinks from the citation to the reference in a suitable output
format (HTML or PDF). The bibliography entries themselves (bibliomixed
elements) carry attributes to identify the database ID, the reference
type (journal, book, abstract, chapter etc), and a label for use as
the in-text citation.
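
The two display forms for a multiple citation can be sketched in Python as follows. This is only an illustration, not RefDB code; the function names and the min_run threshold (how many adjacent numbers are needed before they are folded) are assumptions, since styles differ on that point.

```python
def fold_ranges(numbers, min_run=3):
    """Render citation numbers with runs of adjacent numbers folded.

    min_run is an assumed style parameter: a run shorter than this
    is written out number by number.
    """
    nums = sorted(set(numbers))
    parts = []
    i = 0
    while i < len(nums):
        j = i
        # Extend j to the end of the run of consecutive numbers.
        while j + 1 < len(nums) and nums[j + 1] == nums[j] + 1:
            j += 1
        if j - i + 1 >= min_run:
            parts.append("%d-%d" % (nums[i], nums[j]))
        else:
            parts.extend(str(n) for n in nums[i:j + 1])
        i = j + 1
    return "[" + ",".join(parts) + "]"


def explicit_list(numbers):
    """Render every citation number individually (keeps one link each)."""
    return "[" + ",".join(str(n) for n in sorted(set(numbers))) + "]"


cited = [1, 2, 3, 5, 7, 8, 9, 10]
folded = fold_ranges(cited)
explicit = explicit_list(cited)
```

With the explicit form each number can remain a separate hyperlink target, which is the trade-off described above.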

The whole bibliography is written to a valid SGML document which can
be incorporated as an external entity into the original document.

The original document is then processed with the tweaked DocBook
stylesheets. They take care to specially format the RefDB
bibliographies. The in-text citations are pulled from the bibliography
entry labels via the xref mechanism. The bibliography itself is
formatted according to the values of up to 600 variables (in real
life, rarely more than a dozen are used at a time). These values, which
are based on the bibliography style, are exported by the bibliography
tool and are fed into OpenJade by a helper script. This dirty trick
frees us from having to provide one customized stylesheet per journal.
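
A helper of this kind could look roughly like the Python sketch below, which only builds the command line and does not run it. It is not the actual RefDB helper script. It assumes that OpenJade accepts variable definitions of the form -V name=value (worth checking against the OpenJade documentation for your version), and the variable names, values, and file names are hypothetical.

```python
def build_openjade_args(style_values, stylesheet, document):
    """Build an OpenJade command line from exported style values.

    style_values: dict of variable name -> value, as exported by the
    bibliography tool (the names used below are made up).
    Assumption: OpenJade accepts "-V name=value".
    """
    args = ["openjade", "-t", "sgml", "-d", stylesheet]
    for name, value in sorted(style_values.items()):
        args += ["-V", "%s=%s" % (name, value)]
    args.append(document)
    return args


# Hypothetical values for one journal style.
args = build_openjade_args(
    {"bib-author-sep": ",", "bib-year-bold": "#t"},
    "refdb.dsl",
    "paper.sgml",
)
```

The resulting list could then be handed to something like subprocess.run; one stylesheet serves every journal because only the variable values change.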

Apart from the practical benefits in terms of processing, this system
has other advantages: for example, it separates the documents from the
reference data, which simplifies reuse and sharing of these data.

Again, this system is ugly and does not come close to the beauty of a
pure DSSSL implementation, but it appears to work. An example DocBook
document processed with two completely different citation and
bibliography formats can be viewed at:
http://refdb.sourceforge.net/examples.html

regards,
Markus


Trent Shipley writes:
 > ============
 > 
 > The process has two major components:
 > 1 Synchronization
 > 2 Formatting
 > 
 > =============
[...]

-- 
Markus Hoenicka
hoenicka_markus@xxxxxxxxxxxxxx
http://ourworld.compuserve.com/homepages/hoenicka_markus/


 DSSSList info and archive:  http://www.mulberrytech.com/dsssl/dssslist
