Subject: [dssslist] New script to produce indexes with duplicates removed and ranges collapsed
From: Jeremy Malcolm <Jeremy@xxxxxxxxxxxxx>
Date: Sun, 12 Mar 2006 23:11:57 +0800

I have written a script to fix my problem with indexes coming up with ?
instead of page numbers. As you may recall, a couple of my indexes were
situated near the top rather than the bottom of my document, which meant
that the all-element-numbers (AENs) that openjade generated to keep
track of the page numbers for use in the print versions were thrown out
as soon as the completed indexes were inserted; a catch-22 situation.
My script, really just a modified version of one from the XSL
stylesheets called pdf2index, has the handy side-effect of achieving
something heretofore thought impossible with DSSSL: it removes duplicate
page references like "2, 2, 2" and it collapses ranges like "1, 2, 3"
into a nicer format like "1-3".
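
To illustrate the de-duplication and range collapsing described above,
here is a minimal sketch in Python. It is not the code from
aux2index.pl (which is a Perl script); the function name collapse_pages
and the choice to sort the page numbers first are assumptions made for
the example.

```python
def collapse_pages(pages):
    """Turn page numbers such as [1, 2, 2, 3, 7] into the string '1-3, 7'."""
    # Drop duplicates and sort, so "2, 2, 2" becomes a single "2".
    unique = sorted(set(pages))

    # Build [first, last] pairs, extending the last pair while pages are consecutive.
    ranges = []
    for page in unique:
        if ranges and page == ranges[-1][1] + 1:
            ranges[-1][1] = page
        else:
            ranges.append([page, page])

    # A pair whose ends are equal is a single page; otherwise print "first-last".
    return ", ".join(
        str(first) if first == last else f"{first}-{last}"
        for first, last in ranges
    )
```

Note that this sketch also collapses two adjacent pages ("1, 2") into
"1-2"; whether that is desirable in an index is a matter of taste.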
The pdf2index script is hackier than what we can achieve with DSSSL:
pdf2index uses a special stylesheet to generate a PDF format that can be
converted back to text using pdftotext (from xpdf) and parsed. In
contrast, my script, which I've called aux2index.pl, simply obtains the
page numbers from the .aux file that is generated by the last pass of
pdfjadetex. It then recreates the index file with those page numbers
"hard-coded" in, so that it won't be corrupted when the AENs change.
The use of the script is as follows:
(a) Generate your PDF format, including the index/es, in the usual way,
generally with openjade to create a tex file and three passes of
pdfjadetex to turn it into a PDF. If you're like me and you have
one or more indexes that are above the text that you're indexing,
this will corrupt the index in the PDF format and you'll see ?
characters instead of page numbers. Never fear.
(b) Don't delete the .aux file that was generated by the last run of
pdfjadetex. Run aux2index.pl with two arguments: the .aux file
as the first argument and the index file as the second. For
example, "./aux2index.pl myfile.aux index.sgml > index.sgml.new".
If index.sgml.new seems OK, copy it back to index.sgml.
(c) Generate your PDF file again, with openjade and three runs of
pdfjadetex (or however you normally do it). Hey presto, you will
have a nice index with no duplicates and with ranges collapsed.
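
Steps (a) to (c) could be wrapped up roughly as follows. This is only a
sketch under assumptions: the file names myfile.sgml and index.sgml
come from the example above, but the stylesheet name print.dsl, the
function names, and the exact openjade options (-t tex, -d, -o) are
illustrative and will need adjusting to your own setup.

```python
import shutil
import subprocess
from pathlib import Path


def typeset_commands(doc, stylesheet):
    """Steps (a) and (c): one openjade run to TeX, then three pdfjadetex passes."""
    tex = Path(doc).with_suffix(".tex").name
    openjade = ["openjade", "-t", "tex", "-d", stylesheet, "-o", tex, doc]
    return [openjade] + [["pdfjadetex", tex]] * 3


def typeset(doc, stylesheet):
    """Run the typesetting commands in order, stopping at the first failure."""
    for cmd in typeset_commands(doc, stylesheet):
        subprocess.run(cmd, check=True)


def make_fixed_index(doc, index):
    """Step (b): write the hard-coded index to <index>.new and return its name."""
    aux = Path(doc).with_suffix(".aux").name
    new_index = index + ".new"
    with open(new_index, "w") as out:
        subprocess.run(["./aux2index.pl", aux, index], stdout=out, check=True)
    return new_index


def accept_index(new_index, index):
    """Copy the reviewed index back over the original, as in step (b)."""
    shutil.copyfile(new_index, index)
```

A run would then be typeset("myfile.sgml", "print.dsl"), followed by
make_fixed_index("myfile.sgml", "index.sgml"), a look at
index.sgml.new by hand, accept_index("index.sgml.new", "index.sgml"),
and a final typeset(...) call.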
The script is a quick hack, which is released to the public domain, but
it works for me. I don't trust the mailing list not to filter it out,
so for now it may be downloaded from
http://www.malcolm.id.au/files/software/unix/aux2index.pl (which will
also allow me to keep improving it over the next day or so).
Feel free to forward this message to any other appropriate developers,
lists or newsgroups if others might find it useful (I tried to join
docbook-apps in order to forward it there, but the mail server seems down).
--
Jeremy Malcolm LLB (Hons) B Com
Internet and Open Source lawyer, IT consultant, actor
host -t NAPTR 1.0.8.0.3.1.2.9.8.1.6.e164.org|awk -F! '{print $3}'