Re: dtd subset in Jade transformation

From: Daniel Speck <dspeck@xxxxxxxxxxxx>
Date: Thu, 05 Mar 1998 10:44:49 -0500
Sebastian Rahtz wrote:
Forgive me if someone has covered this, but if I take the following
SGML file:

<!doctype art PUBLIC "-//ES//DTD full length article DTD version
4.1.0//EN" [<!entity gr1 system "gr1" ndata image><!entity gr2 system
"gr2" ndata image><!entity gr3 system "gr3" ndata image>]><art
version="4.1.0".........

how do I, in Jade's transformation mode, preserve the

[<!entity gr1 system "gr1" ndata image>
 

You can't completely preserve the internal subset of the DTD using the current version of Jade, because you don't have complete access to the DTD. However, you can preserve entity declarations made anywhere in the DTD (you can't distinguish entities declared in the internal subset from those declared in the external subset). You can get to the entities by getting the "entities" property on the root of the document.

David Megginson's list of node properties in Jade (http://home.sprynet.com/sprynet/dmeggins/grove.html#SGMLDOC) is very helpful. Eliot Kimber's GroveView demo (http://www.isogen.com/demo/tools.html) is also helpful for visualizing the information contained in a grove.

I have attached a small DSSSL script that will output _all_ of the entities declared in the DTD for an SGML document. If you only want the external entities, you could easily filter out the rest.

-dan

--
Daniel Speck                              e-mail: dspeck@xxxxxxxxxxxx
Research Engineer                          voice:     +1 301.548.7818
Thomson Technology Services Group            fax:     +1 301.527.4094
1375 Piccard Drive, Rockville, MD 20850      WWW:    www.thomtech.com
 

<!doctype style-sheet PUBLIC "-//James Clark//DTD DSSSL Style Sheet//EN" [
<!ENTITY lt "<">
]>

(declare-flow-object-class element
  "UNREGISTERED::James Clark//Flow Object Class::element")

(declare-flow-object-class empty-element
  "UNREGISTERED::James Clark//Flow Object Class::empty-element")

(declare-flow-object-class document-type
  "UNREGISTERED::James Clark//Flow Object Class::document-type")

(declare-flow-object-class processing-instruction
  "UNREGISTERED::James Clark//Flow Object Class::processing-instruction")

(declare-flow-object-class entity
  "UNREGISTERED::James Clark//Flow Object Class::entity")

(declare-flow-object-class entity-ref
  "UNREGISTERED::James Clark//Flow Object Class::entity-ref")

(declare-flow-object-class formatting-instruction
  "UNREGISTERED::James Clark//Flow Object Class::formatting-instruction")

(declare-characteristic preserve-sdata?
  "UNREGISTERED::James Clark//Characteristic::preserve-sdata?" #t)

(define (grove-root #!optional (node (current-node)))
  (node-property 'grove-root node))

(root
    (sosofo-append
     (make formatting-instruction
       data: (string-append "&lt;!DOCTYPE " 
			    (gi (node-property 'docelem (current-node)))
			    " PUBLIC \"" "???" "\" [&#RE;"
			    (generate-entity-declarations (node-property "entities" (current-node)))
			    "]>&#RE;")
       )
     (empty-sosofo)))

(define (generate-entity-declarations entity-node-list)
  (let loop ((entity-node-list entity-node-list))
    (if (node-list-empty? entity-node-list)
	""
	(string-append
	 (generate-entity-declaration (node-list-first entity-node-list))
	 (loop (node-list-rest entity-node-list))))))

; (generate-entity-declaration entity-node)
;
; entity-node is a singleton node-list containing a node of class ENTITY
;
; (generate-entity-declaration) returns a string containing an SGML
; ENTITY declaration for the given entity.

(define (generate-entity-declaration entity-node)
  (let* ((entity-name (node-property "name" entity-node))
	 (entity-type (entity-type entity-name))
	 (entity-extid (node-property "external-id" entity-node default: #f))
	 (entity-notation (node-property "notation" entity-node default: #f))
	 )
    (string-append
     "<" "!ENTITY " entity-name 
     (if entity-extid
	 ; entity has an external identifier
	 (generate-external-entity-spec entity-extid entity-type entity-notation)
	 ; entity has no external identifier so it must be an internal entity
	 (generate-internal-entity-spec entity-type (node-property 'text entity-node))
	 )
     ">&#RE;")))

(define (generate-external-entity-spec extid type notation)
  (let ((sysid (node-property "system-id" extid default: #f))
	(pubid (node-property "public-id" extid default: #f)))
    (string-append
     (if pubid 
	 (string-append " PUBLIC \"" pubid "\"")
	 "")
     (if sysid 
	 (string-append
	  (if pubid
	      ""
	      " SYSTEM"
	      )
	  (string-append " \"" sysid "\"")
	  )
	 ""
	 )
     (case type
       ; external text entity: no data-type keyword or notation follows
       ((text) "")
       ((cdata) (string-append " CDATA " (node-property 'name notation)))
       ((ndata) (string-append " NDATA " (node-property 'name notation)))
       ((sdata) (string-append " SDATA " (node-property 'name notation)))
       ; accept both spellings of the subdocument entity type
       ((subdoc subdocument) " SUBDOC")
       (else (error (string-append "invalid entity type '"
				   (symbol->string type) 
				   "' encountered")))
       )
     )
    )
)

(define (generate-internal-entity-spec type text)
  (string-append
   (case type
     ((text) "")
     ((sdata) " SDATA")
     ((cdata) " CDATA")
     ((pi) " PI")
     (else (error (string-append "invalid entity type '"
				 (symbol->string type) 
				 "' encountered"))))
   " \""
   text
   "\""
   )
)
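
; Sketch, not part of the original attachment: one possible way to do the
; filtering mentioned in the message, keeping only external entities.
; The name generate-external-entity-declarations is made up for this
; example. It assumes, as generate-entity-declaration above does, that
; (node-property "external-id" ... default: #f) yields #f for an internal
; entity. To try it, call it from the root rule in place of
; generate-entity-declarations.

(define (generate-external-entity-declarations entity-node-list)
  (let loop ((nl entity-node-list))
    (cond
     ; nothing left to declare
     ((node-list-empty? nl) "")
     ; entity has an external identifier: emit its declaration
     ((node-property "external-id" (node-list-first nl) default: #f)
      (string-append
       (generate-entity-declaration (node-list-first nl))
       (loop (node-list-rest nl))))
     ; internal entity: emit nothing and carry on with the rest
     (else (loop (node-list-rest nl))))))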


