Subject: Re: DAVENPORT: explanation of indexing
From: Chris Maden <crism@xxxxxxxxxxx>
Date: Fri, 24 Jul 1998 11:47:38 -0400 (EDT)
My main problem with HTML indexing is that you can't append to a file;
you make an entity and then you stop. And processing an entire
book while keeping all of the index terms in memory for sorting and
collating is unrealistic for me; changing one chapter would require
reprocessing the entire book, which could be quite slow for some of my
volumes (_UNIX Power Tools_, anyone?).
I also need to merge the indices for multiple books, and to present
the information in a form our indexer can edit before merging (the
word choice for the merged index might not be the same as for
individual books).
So what I do is generate a file of indexterms for every chapter when
the HTML for the chapter is created. A term like this:
<indexterm>
<primary>writing</primary>
<secondary>scripts</secondary>
<tertiary>sed</tertiary>
</indexterm>
becomes
writing
  scripts
    sed
      !SEDAWK:02592 ch04_01.htm 4. Writing sed Sc...
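As a rough illustration of how a per-chapter file like the one above could be read back, here is a minimal Python sketch. It is not the Perl script used in production; the names `Entry` and `parse_chapter_file` are hypothetical, and it assumes that every line starting with "!" (after trimming whitespace) is a locator that closes the term lines preceding it. Cross-reference lines ("(see ...)") are not handled.

```python
from typing import NamedTuple, Tuple


class Entry(NamedTuple):
    terms: Tuple[str, ...]  # primary, then secondary and tertiary if present
    book: str               # book ID, e.g. "SEDAWK"
    seq: int                # element number within the book
    target: str             # file name, possibly with a "#anchor"
    title: str              # section number and title as found in the file


def parse_chapter_file(text):
    """Group term lines with the '!' locator line that follows them."""
    entries = []
    terms = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("!"):
            # Locator: "!BOOK:NNNNN target title..."
            ref, target, *rest = line[1:].split(None, 2)
            book, seq = ref.split(":")
            title = rest[0] if rest else ""
            entries.append(Entry(tuple(terms), book, int(seq), target, title))
            terms = []
        else:
            terms.append(line)
    return entries
```

Because the term lines are stripped, the sketch does not depend on how deeply each level is indented.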
A Perl script sorts and collates these chapter files (and can sort
multiple book files into the collated index):
writing
  from files (SORTAS: files from
    !SEDAWK:03891 ch05_11.htm 5.11. Reading and...
  to files (SORTAS: files to
    !SEDAWK:01038 ch02_03.htm#SEDAWK-CH-2-SECT-...
    !SEDAWK:03854 ch05_11.htm 5.11. Reading and...
    !SEDAWK:08569 ch10_05.htm 10.5. Directing O...
  regular expressions
    !SEDAWK:01716 ch03_02.htm#SEDAWK-CH-3-SECT-...
  scripts
    !SEDAWK:00525 ch01_04.htm 1.4. Four Hurdles...
    awk
      !SEDAWK:04708 ch07_01.htm 7. Writing Script...
    sed
      !SEDAWK:02592 ch04_01.htm 4. Writing sed Sc...
  user-defined functions
    !SEDAWK:07940 ch09_03.htm 9.3. Writing Your...
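The sort-and-collate step can be sketched in Python as follows. This is an illustrative stand-in, not the Perl script itself: the function names `sort_key`, `collate` and `dump` are hypothetical, the two-space indentation per level is an assumption, and the sort rule (use the text after "(SORTAS: " when present, compared case-insensitively) is inferred from the sample, where "from files" sorts as "files from".

```python
def sort_key(term):
    """Sort by the SORTAS text when the term carries one, else by the term."""
    text, marker, sortas = term.partition(" (SORTAS: ")
    return (sortas if marker else text).lower()


def collate(entries):
    """Merge (terms, locator) pairs into a tree keyed by term text."""
    root = {"refs": [], "children": {}}
    for terms, locator in entries:
        node = root
        for term in terms:
            node = node["children"].setdefault(
                term, {"refs": [], "children": {}})
        node["refs"].append(locator)
    return root


def dump(node, depth=0, out=None):
    """Return the tree as indented lines: locators first, then subterms."""
    out = [] if out is None else out
    indent = "  " * depth  # assumed indentation width
    # Zero-padded numbers make a plain string sort order locators correctly.
    for locator in sorted(node["refs"]):
        out.append(indent + "!" + locator)
    for term in sorted(node["children"], key=sort_key):
        out.append(indent + term)
        dump(node["children"][term], depth + 1, out)
    return out
```

Feeding entries from several books into `collate` merges them under the same headings, since the tree is keyed only by term text.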
A final Perl script turns this stuff into HTML, split by letter of the
alphabet:
<DT><A NAME="writing">writing</A>
<DD><DL>
<DT>from files
: <A HREF="../ch05_11.htm">5.11. Reading and Writing Files</A>
<DT>to files
<DD><DL>
<DT><A HREF="../ch02_03.htm#SEDAWK-CH-2-SECT-3.2.1">2.3.2.1. Saving output</A>
<DT><A HREF="../ch05_11.htm">5.11. Reading and Writing Files</A>
<DT><A HREF="../ch10_05.htm">10.5. Directing Output to Files and Pipes</A>
</DL>
<DT>regular expressions
: <A HREF="../ch03_02.htm#SEDAWK-CH-3-SECT-2.3">3.2.3. Writing Regular Expressions</A>
<DT>scripts
: <A HREF="../ch01_04.htm">1.4. Four Hurdles to Mastering sed and awk</A>
<DD><DL>
<DT>awk
: <A HREF="../ch07_01.htm">7. Writing Scripts for awk</A>
<DT>sed
<DD><DL>
<DT><A HREF="../ch07_01.htm">7. Writing Scripts for awk</A>
<DT><A HREF="../ch04_01.htm">4. Writing sed Scripts</A>
</DL>
</DL>
<DT>user-defined functions
: <A HREF="../ch09_03.htm">9.3. Writing Your Own Functions</A>
</DL>
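For illustration only, the two ideas in this last step (splitting by letter, and writing a term with one link inline versus several links in a nested list) might look like this in Python. The names `bucket_by_letter` and `render_term` are hypothetical, the "symbols" bucket for non-alphabetic terms is an assumption, and the markup simply mirrors the sample above rather than reproducing the real script.

```python
from collections import defaultdict
from html import escape


def bucket_by_letter(primaries):
    """Group primary terms by initial letter; non-letters go to 'symbols'."""
    buckets = defaultdict(list)
    for term in sorted(primaries, key=str.lower):
        first = term[:1].upper()
        buckets[first if first.isalpha() else "symbols"].append(term)
    return dict(buckets)


def render_term(term, links):
    """Render one term; links is a list of (href, title) pairs."""
    anchors = ['<A HREF="%s">%s</A>' % (escape(href), escape(title))
               for href, title in links]
    if len(anchors) == 1:
        # A single locator goes on the line after the term, as in the sample.
        return "<DT>%s\n: %s" % (escape(term), anchors[0])
    lines = ["<DT>%s" % escape(term), "<DD><DL>"]
    lines.extend("<DT>" + anchor for anchor in anchors)
    lines.append("</DL>")
    return "\n".join(lines)
```

Each bucket would then be written to its own HTML file, one per letter.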
To make this relevant to DSSSList, here's the DSSSL that writes the
per-chapter index files. It's uncommented, which is why I haven't
released this stylesheet yet...
-Chris
(element chapter
  (sosofo-append (make-html-file (current-node))
    (make-index)))

(define (make-index)
  (make entity
    system-id: (string-append "idxtmp/"
      (gen-file-name (current-node)))
    (with-mode index
      (process-node-list (current-node)))))

(mode index
  (default (process-node-list (node-list-filter (lambda (snl)
          (equal? (node-property 'class-name
              snl)
            'element))
        (children (current-node)))))

  (element indexterm
    (sosofo-append (process-children)
      (if (and (node-list-empty? (get-children-by-type (list (norm "see"))
              (current-node)))
          (not (attribute-string (norm "spanend"))))
        (make formatting-instruction
          data: (string-append " !"
            (attribute-string (norm "id")
              (ancestor (norm "book")))
            ":"
            (format-number (all-element-number)
              "00001")
            " "
            (gen-file-name)
            (let* ((app (ancestor (norm "appendix")))
                (chap (ancestor (norm "chapter")))
                (gloss (ancestor (norm "glossentry")))
                (nutentry (ancestor (norm "nutentry")))
                (pref (ancestor (norm "preface")))
                (sect1 (ancestor (norm "sect1")))
                (sect2 (ancestor (norm "sect2")))
                (sect3 (ancestor (norm "sect3")))
                (container (if (node-list-empty? nutentry)
                    (if (node-list-empty? sect3)
                      (if (node-list-empty? sect2)
                        (if (node-list-empty? sect1)
                          (if (node-list-empty? chap)
                            (if (node-list-empty? app)
                              (if (node-list-empty? pref)
                                (if (node-list-empty? gloss)
                                  (error (string-append "Unhandled <indexterm> location: "
                                      (gen-file-name)))
                                  gloss)
                                pref)
                              app)
                            chap)
                          sect1)
                        sect2)
                      sect3)
                    nutentry)))
              (string-append (if (or (and (node-list-empty? sect1)
                      (node-list-empty? gloss))
                    (node-list=? container
                      nutentry)
                    (and (node-list=? container
                        sect1)
                      (not (first-sibling? sect1))))
                  ""
                  (string-append "#"
                    (gen-id container)))
                " "
                (if (or (and (node-list-empty? chap)
                      (node-list-empty? app))
                    (node-list=? container
                      nutentry))
                  ""
                  (string-append (if (node-list-empty? chap)
                      (format-number (element-number app)
                        "A")
                      (number->string (element-number chap)))
                    "."
                    (if (not (node-list-empty? sect1))
                      (string-append (number->string (child-number sect1))
                        "."
                        (if (not (node-list-empty? sect2))
                          (string-append (number->string (child-number sect2))
                            "."
                            (if (not (node-list-empty? sect3))
                              (string-append (number->string (child-number sect3))
                                ".")
                              ""))
                          ""))
                      "")
                    " "))
                (if (node-list-empty? gloss)
                  (if (node-list=? container
                      nutentry)
                    (string-append "Chapter "
                      (number->string (element-number (ancestor (norm "chapter"))))
                      ", Reference: "
                      (process-string (get-children-by-type (list (norm "term"))
                          container)))
                    (process-string (get-children-by-type (list (norm "title"))
                        container)))
                  (string-append (process-string (get-children-by-type (list (norm "title"))
                        (ancestor (norm "glossary"))))
                    ": "
                    (process-string (get-children-by-type (list (norm "glossterm"))
                        gloss))))))
            "
"))
        (empty-sosofo))))

  (element part
    (process-node-list (get-children-by-type (list (norm "docinfo")
          (norm "partintro")
          (norm "title")))))

  (element primary
    (make formatting-instruction
      data: (string-append (process-string (current-node))
        (let ((sortas (attribute-string (norm "sortas"))))
          (if sortas
            (string-append " (SORTAS: "
              sortas)
            ""))
        "
")))

  (element secondary
    (make formatting-instruction
      data: (string-append " "
        (process-string (current-node))
        (let ((sortas (attribute-string (norm "sortas"))))
          (if sortas
            (string-append " (SORTAS: "
              sortas)
            ""))
        "
")))

  (element see
    (make formatting-instruction
      data: (string-append " "
        "(see "
        (process-string (current-node))
        ")
")))

  (element seealso
    (make formatting-instruction
      data: (string-append " "
        "(see also "
        (process-string (current-node))
        ")
")))

  (element tertiary
    (make formatting-instruction
      data: (string-append " "
        (process-string (current-node))
        (let ((sortas (attribute-string (norm "sortas"))))
          (if sortas
            (string-append " (SORTAS: "
              sortas)
            ""))
        "
"))))
DSSSList info and archive: http://www.mulberrytech.com/dsssl/dssslist