Subject: RE: processing character entities
From: "Steffen Heinrich" <heinrich@xxxxxxxxxxxx>
Date: Tue, 20 Jul 1999 13:39:04 +0100
Boris Goldowsky asked:

> What are the various solutions people on this list use for
> processing character entities in SGML->SGML or SGML->HTML
> conversions? In my work I translate a lot of SGML containing
> entities for foreign characters, math symbols, etc. into HTML. Some
> get turned into HTML entities, some are dumbed down to ASCII, and
> others get turned into inline graphics.

Hello Boris,

the approach that I take to tackle the very same problem consists of
three parts:

1. Top of the DTD, before the declaration of any other entities

a) on validating:

    <!ENTITY % qartchars SYSTEM "qartchars">
    <!--%qartchars;-->
    <!ENTITY % e.sup "" >
    ...
    <!ENTITY % ISOlat1 PUBLIC "ISO 8879:1986//ENTITIES Added Latin 1//EN">
    %ISOlat1;
    ...
    <!ENTITY % b.float "fachidx | fussnt | verw | xverw | f | dfref |produkt | unklar %e.sup;" -- BODY.floats -->

The reference to %qartchars; is commented out, so only the ordinary
float elements are allowed.

b) on DSSSL transformation:

    <!ENTITY % qartchars SYSTEM "qartchars">
    %qartchars;
    <!ENTITY % e.sup "" >
    ...
    <!ENTITY % ISOlat1 PUBLIC "ISO 8879:1986//ENTITIES Added Latin 1//EN">
    %ISOlat1;
    ...
    <!ENTITY % b.float "fachidx | fussnt | verw | xverw | f | dfref |produkt | unklar %e.sup;" -- BODY.floats -->

%qartchars; gets included, and its content takes precedence over any
following declarations.

Instead of editing the DTD, you could also change the catalog, or
simply use a different catalog file for transformation than for
validation, one that points to other entity files.

2. The qartchars entity:

    <!-- The following Elements will be appended to the 'anywhere' -
         float content. -->
    <!ENTITY % e.sup " | FONT | CHARREF | IMG" >

    <!-- Using the HTML FONT-tag. -->
    <!ELEMENT FONT - - (#PCDATA) >
    <!ATTLIST FONT
        size    NUMBER  #IMPLIED
        face    CDATA   #IMPLIED>

    <!-- Mapping character references to themselves or to character codes. -->
    <!ELEMENT CHARREF - o EMPTY >
    <!ATTLIST CHARREF
        cname   CDATA   #IMPLIED    -- used if present --
        value   NUMBER  #REQUIRED   -- else uses code reference -->

    <!-- Using the HTML IMG-tag. -->
    <!ELEMENT IMG - o EMPTY>
    <!ATTLIST IMG
        src     CDATA   #REQUIRED
        alt     CDATA   #IMPLIED
        align   CDATA   #IMPLIED
        border  NUMBER  #IMPLIED >

    <!--Examples of SDATA entities to be overridden. -->
    <!ENTITY aring "<CHARREF CNAME='aring' VALUE='229'>" >
    <!ENTITY bdquo "<CHARREF VALUE='132'>" >
    <!ENTITY ldquo "<CHARREF VALUE='147'>" >
    <!ENTITY quot  "<CHARREF CNAME='quot' VALUE='34'>" >
    <!ENTITY lt    "<CHARREF CNAME='lt' VALUE='60'>" >
    <!ENTITY ap    "<FONT FACE=Symbol>»</FONT>" >
    <!ENTITY rArr  "<FONT FACE=Symbol>Þ</FONT>" -- double right arrow -->
    <!ENTITY rarr  "<FONT FACE=Symbol>®</FONT>" -- right arrow-->
    <!ENTITY alpha "<FONT FACE=Symbol>a</FONT>" >
    <!ENTITY beta  "<FONT FACE=Symbol>b</FONT>" >
    <!ENTITY sigma "<FONT FACE=Symbol>s</FONT>" >
    <!ENTITY tau   "<FONT FACE=Symbol>t</FONT>" >
    <!ENTITY Delta "<FONT FACE=Symbol>D</FONT>" >
    <!ENTITY xover '<IMG SRC="../entities/x_ov.gif" ALT="x overscore">' >
    <!ENTITY yover '<IMG SRC="../entities/y_ov.gif" ALT="y overscore">' >
    <!ENTITY zover '<IMG SRC="../entities/z_ov.gif" ALT="z overscore">'>
    ...

you get the idea...

3. The dsl script:

    (element FONT
      (make element gi: "FONT"
            attributes: (copy-attributes)
            (process-children)))

    (element IMG
      (make empty-element gi: "IMG"
            attributes: (copy-attributes)))

    (element CHARREF
      (make entity-ref
            name: (if (attribute-string "CNAME")
                      (attribute-string "CNAME")
                      (string-append "#" (attribute-string "VALUE")))))

The construction rules copy FONT and IMG elements to the HTML output,
while the processing of CHARREF elements depends on the presence of
the CNAME attribute: if it is present, a named entity reference is
written, otherwise a numeric character reference built from VALUE.

This works very well, and I find it more satisfying than having to
choose between general SDATA mapping, with no way to influence the
result, and general SDATA preservation. Still, I'd love to hear about
the workarounds that are used by others.
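
For readers who do not run DSSSL, the decision made by the CHARREF rule can be sketched in a few lines of Python. This is only an illustration, not part of my setup: the function name charref_to_entity is made up, and it assumes the back end writes an entity reference as "&", the name, and ";".

```python
from typing import Optional


def charref_to_entity(value: str, cname: Optional[str] = None) -> str:
    """Hypothetical stand-in for the CHARREF construction rule."""
    # Prefer the named reference when CNAME was given, mirroring
    # (if (attribute-string "CNAME") ...) in the DSSSL rule.
    if cname is not None:
        name = cname
    else:
        # Otherwise fall back to a numeric reference built from VALUE.
        name = "#" + value
    # Assumption: the output form is "&" + name + ";".
    return "&" + name + ";"


if __name__ == "__main__":
    print(charref_to_entity("229", "aring"))  # aring carries a CNAME
    print(charref_to_entity("132"))           # bdquo has only a VALUE
```

With the sample declarations from part 2, aring comes out as a named reference and bdquo as a numeric one.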
Regards,
Steffen

---------
steffen heinrich, berlin, germany

"When you're chewing on life's gristle
Don't grumble, give a whistle
And DSSSL helps things turn out for the best..."
(Monty Python overheard)

DSSSList info and archive: http://www.mulberrytech.com/dsssl/dssslist