Subject: Re: DAVENPORT: explanation of indexing
From: Chris Maden <crism@xxxxxxxxxxx>
Date: Fri, 24 Jul 1998 11:47:38 -0400 (EDT)

My main problem with HTML indexing is that you can't append to a file; you make an entity and then you stop. And re-processing an entire book, then keeping all of the index terms in memory for sorting and collating, is unrealistic for me; changing one chapter would require reprocessing the entire book, which could be quite slow for some of my volumes (_UNIX Power Tools_, anyone?). I also have a constraint of merging the indices for multiple books, and presenting the information in a form our indexer can edit before merging (the word choice for the merged index might not be the same as for individual books).

So what I do is generate a file of indexterms for every chapter when the HTML for the chapter is created. A term like this:

    <indexterm>
      <primary>writing</primary>
      <secondary>scripts</secondary>
      <tertiary>sed</tertiary>
    </indexterm>

becomes

    writing
      scripts
        sed
          !SEDAWK:02592 ch04_01.htm 4. Writing sed Sc...

A Perl script sorts and collates these chapter files (and can sort multiple book files into the collated index):

    writing
      from files (SORTAS: files from
        !SEDAWK:03891 ch05_11.htm 5.11. Reading and...
      to files (SORTAS: files to
        !SEDAWK:01038 ch02_03.htm#SEDAWK-CH-2-SECT-...
        !SEDAWK:03854 ch05_11.htm 5.11. Reading and...
        !SEDAWK:08569 ch10_05.htm 10.5. Directing O...
      regular expressions
        !SEDAWK:01716 ch03_02.htm#SEDAWK-CH-3-SECT-...
      scripts
        !SEDAWK:00525 ch01_04.htm 1.4. Four Hurdles...
        awk
          !SEDAWK:04708 ch07_01.htm 7. Writing Script...
        sed
          !SEDAWK:02592 ch04_01.htm 4. Writing sed Sc...
      user-defined functions
        !SEDAWK:07940 ch09_03.htm 9.3. Writing Your...

A final Perl script turns this stuff into HTML, split by letter of the alphabet:

    <DT><A NAME="writing">writing</A>
    <DD><DL>
      <DT>from files : <A HREF="../ch05_11.htm">5.11. Reading and Writing Files</A>
      <DT>to files
      <DD><DL>
        <DT><A HREF="../ch02_03.htm#SEDAWK-CH-2-SECT-3.2.1">2.3.2.1. Saving output</A>
        <DT><A HREF="../ch05_11.htm">5.11. Reading and Writing Files</A>
        <DT><A HREF="../ch10_05.htm">10.5. Directing Output to Files and Pipes</A>
      </DL>
      <DT>regular expressions : <A HREF="../ch03_02.htm#SEDAWK-CH-3-SECT-2.3">3.2.3. Writing Regular Expressions</A>
      <DT>scripts : <A HREF="../ch01_04.htm">1.4. Four Hurdles to Mastering sed and awk</A>
      <DD><DL>
        <DT>awk : <A HREF="../ch07_01.htm">7. Writing Scripts for awk</A>
        <DT>sed
        <DD><DL>
          <DT><A HREF="../ch07_01.htm">7. Writing Scripts for awk</A>
          <DT><A HREF="../ch04_01.htm">4. Writing sed Scripts</A>
        </DL>
      </DL>
      <DT>user-defined functions : <A HREF="../ch09_03.htm">9.3. Writing Your Own Functions</A>
    </DL>

To make this relevant to DSSSList, here's the relevant stuff. It's uncommented, which is why I haven't released this stylesheet yet...

-Chris

    (element chapter
      (sosofo-append
        (make-html-file (current-node))
        (make-index)))

    (define (make-index)
      (make entity
        system-id: (string-append "idxtmp/" (gen-file-name (current-node)))
        (with-mode index
          (process-node-list (current-node)))))

    (mode index
      (default
        (process-node-list
          (node-list-filter
            (lambda (snl)
              (equal? (node-property 'class-name snl) 'element))
            (children (current-node)))))

      (element indexterm
        (sosofo-append
          (process-children)
          (if (and (node-list-empty?
                     (get-children-by-type (list (norm "see")) (current-node)))
                   (not (attribute-string (norm "spanend"))))
              (make formatting-instruction
                data:
                (string-append
                  " !"
                  (attribute-string (norm "id") (ancestor (norm "book")))
                  ":"
                  (format-number (all-element-number) "00001")
                  " "
                  (gen-file-name)
                  (let* ((app (ancestor (norm "appendix")))
                         (chap (ancestor (norm "chapter")))
                         (gloss (ancestor (norm "glossentry")))
                         (nutentry (ancestor (norm "nutentry")))
                         (pref (ancestor (norm "preface")))
                         (sect1 (ancestor (norm "sect1")))
                         (sect2 (ancestor (norm "sect2")))
                         (sect3 (ancestor (norm "sect3")))
                         (container
                           (if (node-list-empty? nutentry)
                               (if (node-list-empty? sect3)
                                   (if (node-list-empty? sect2)
                                       (if (node-list-empty? sect1)
                                           (if (node-list-empty? chap)
                                               (if (node-list-empty? app)
                                                   (if (node-list-empty? pref)
                                                       (if (node-list-empty? gloss)
                                                           (error
                                                             (string-append
                                                               "Unhandled <indexterm> location: "
                                                               (gen-file-name)))
                                                           gloss)
                                                       pref)
                                                   app)
                                               chap)
                                           sect1)
                                       sect2)
                                   sect3)
                               nutentry)))
                    (string-append
                      (if (or (and (node-list-empty? sect1)
                                   (node-list-empty? gloss))
                              (node-list=? container nutentry)
                              (and (node-list=? container sect1)
                                   (not (first-sibling? sect1))))
                          ""
                          (string-append "#" (gen-id container)))
                      " "
                      (if (or (and (node-list-empty? chap)
                                   (node-list-empty? app))
                              (node-list=? container nutentry))
                          ""
                          (string-append
                            (if (node-list-empty? chap)
                                (format-number (element-number app) "A")
                                (number->string (element-number chap)))
                            "."
                            (if (not (node-list-empty? sect1))
                                (string-append
                                  (number->string (child-number sect1))
                                  "."
                                  (if (not (node-list-empty? sect2))
                                      (string-append
                                        (number->string (child-number sect2))
                                        "."
                                        (if (not (node-list-empty? sect3))
                                            (string-append
                                              (number->string (child-number sect3))
                                              ".")
                                            ""))
                                      ""))
                                "")
                            " "))
                      (if (node-list-empty? gloss)
                          (if (node-list=? container nutentry)
                              (string-append
                                "Chapter "
                                (number->string
                                  (element-number (ancestor (norm "chapter"))))
                                ", Reference: "
                                (process-string
                                  (get-children-by-type (list (norm "term")) container)))
                              (process-string
                                (get-children-by-type (list (norm "title")) container)))
                          (string-append
                            (process-string
                              (get-children-by-type (list (norm "title"))
                                                    (ancestor (norm "glossary"))))
                            ": "
                            (process-string
                              (get-children-by-type (list (norm "glossterm")) gloss))))))
                  " "))
              (empty-sosofo))))

      (element part
        (process-node-list
          (get-children-by-type
            (list (norm "docinfo") (norm "partintro") (norm "title")))))

      (element primary
        (make formatting-instruction
          data:
          (string-append
            (process-string (current-node))
            (let ((sortas (attribute-string (norm "sortas"))))
              (if sortas
                  (string-append " (SORTAS: " sortas)
                  ""))
            " ")))

      (element secondary
        (make formatting-instruction
          data:
          (string-append
            " "
            (process-string (current-node))
            (let ((sortas (attribute-string (norm "sortas"))))
              (if sortas
                  (string-append " (SORTAS: " sortas)
                  ""))
            " ")))

      (element see
        (make formatting-instruction
          data:
          (string-append
            " "
            "(see "
            (process-string (current-node))
            ") ")))

      (element seealso
        (make formatting-instruction
          data:
          (string-append
            " "
            "(see also "
            (process-string (current-node))
            ") ")))

      (element tertiary
        (make formatting-instruction
          data:
          (string-append
            " "
            (process-string (current-node))
            (let ((sortas (attribute-string (norm "sortas"))))
              (if sortas
                  (string-append " (SORTAS: " sortas)
                  ""))
            " "))))

DSSSList info and archive: http://www.mulberrytech.com/dsssl/dssslist
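
The sort-and-collate step described in the message is done by a Perl script that is not shown. As a rough illustration only, here is a minimal Python sketch of that kind of merge. It assumes the chapter files have already been parsed into (terms, location) records, because the exact on-disk layout of those files cannot be recovered from the message. The names sort_key, new_node, collate, and render are hypothetical and do not come from the original scripts. The sketch sorts a term on its SORTAS text when one is present, and sorts locations as plain strings, which appears to work because the element number is zero-padded by the "00001" format.

```python
from collections import defaultdict

# Marker the stylesheet appends to a term that carries a sortas attribute.
SORTAS_MARK = " (SORTAS: "


def sort_key(term):
    """Sort on the SORTAS text when present, otherwise on the term itself."""
    _, mark, sortas = term.partition(SORTAS_MARK)
    return (sortas if mark else term).lower()


def new_node():
    """One index level: its own locations plus nested sub-terms."""
    return {"locations": [], "children": defaultdict(new_node)}


def collate(records):
    """Merge (terms, location) records from any number of chapter files."""
    root = new_node()
    for terms, location in records:
        node = root
        for term in terms:
            node = node["children"][term]
        node["locations"].append(location)
    return root


def render(node, depth=0):
    """Return the collated index as indented lines (assumed layout)."""
    lines = []
    indent = "  " * depth
    for location in sorted(node["locations"]):
        lines.append(indent + location)
    for term in sorted(node["children"], key=sort_key):
        lines.append(indent + term)
        lines.extend(render(node["children"][term], depth + 1))
    return lines


# Records in arbitrary order, as if read from several chapter files.
records = [
    (["writing", "scripts", "sed"], "!SEDAWK:02592 ch04_01.htm"),
    (["writing", "to files (SORTAS: files to"], "!SEDAWK:03854 ch05_11.htm"),
    (["writing", "from files (SORTAS: files from"], "!SEDAWK:03891 ch05_11.htm"),
    (["writing", "to files (SORTAS: files to"], "!SEDAWK:01038 ch02_03.htm"),
    (["writing", "scripts"], "!SEDAWK:00525 ch01_04.htm"),
]
lines = render(collate(records))
print("\n".join(lines))
```

Because each chapter contributes its own records, changing one chapter only means regenerating that chapter's file and re-running the merge, which is the point of the per-chapter approach described in the message.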