Subject: RE: processing character entities
From: "Steffen Heinrich" <heinrich@xxxxxxxxxxxx>
Date: Tue, 20 Jul 1999 13:39:04 +0100
Boris Goldowsky asked:
>
>What are the various solutions people on this list use for
>processing character entities in SGML->SGML or SGML->HTML
>conversions? In my work I translate a lot of SGML containing
>entities for foreign characters, math symbols, etc. into HTML. Some
>get turned into HTML entities, some are dumbed down to ASCII, and
>others get turned into inline graphics.
>
Hello Boris,
the approach I take to this very same problem has
three parts:
1. Top of the DTD, before the declaration of any other entities
a) on validating:
<!ENTITY % qartchars SYSTEM "qartchars">
<!--%qartchars;-->
<!ENTITY % e.sup "" >
...
<!ENTITY % ISOlat1 PUBLIC "ISO 8879:1986//ENTITIES Added Latin 1//EN">
%ISOlat1;
...
<!ENTITY % b.float "fachidx | fussnt | verw | xverw
| f | dfref |produkt | unklar %e.sup;" -- BODY.floats -->
Here the reference to %qartchars; is commented out, so only the
ordinary float elements are allowed.
b) on DSSSL transformation:
<!ENTITY % qartchars SYSTEM "qartchars">
%qartchars;
<!ENTITY % e.sup "" >
...
<!ENTITY % ISOlat1 PUBLIC "ISO 8879:1986//ENTITIES Added Latin 1//EN">
%ISOlat1;
...
<!ENTITY % b.float "fachidx | fussnt | verw | xverw
| f | dfref |produkt | unklar %e.sup;" -- BODY.floats -->
Here %qartchars; is included. Because SGML honours only the first
declaration of an entity, its declarations take precedence over any
that follow.
Alternatively, you could keep a single DTD and switch catalogs: use
one catalog file for validating and another, pointing to different
entity files, for the transformation.
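The precedence rule that part 1 b) relies on can be modelled in a few lines. This is only an illustrative Python sketch, not part of the DTD or the stylesheet: the function name resolve_entities and the sample replacement texts are assumptions made up for the example.

```python
# Illustrative model only: an SGML parser keeps the first declaration of
# an entity name and ignores any later declaration of the same name.
def resolve_entities(declarations):
    """Return the effective entity table for (name, replacement) pairs
    listed in the order in which the parser encounters them."""
    effective = {}
    for name, replacement in declarations:
        # setdefault leaves an earlier binding untouched
        effective.setdefault(name, replacement)
    return effective


# The overrides from the qartchars file are read first, the ISO set later.
declarations = [
    ("aring", "<CHARREF CNAME='aring' VALUE='229'>"),  # from qartchars
    ("aring", "[aring ]"),    # later declaration of the same name, ignored
    ("eacute", "[eacute]"),   # declared only once, so it is kept as is
]
table = resolve_entities(declarations)
```

With the qartchars declarations listed first, table["aring"] holds the CHARREF markup, while an entity that is declared only in the later set keeps its original replacement text.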
2. The qartchars-entity:
<!-- The following elements are appended to the 'anywhere'
float content. -->
<!ENTITY % e.sup " | FONT | CHARREF | IMG" >
<!-- Using the HTML FONT-tag. -->
<!ELEMENT FONT - - (#PCDATA) >
<!ATTLIST FONT size NUMBER #IMPLIED
face CDATA #IMPLIED>
<!-- Mapping character references to themselves or to character
codes. -->
<!ELEMENT CHARREF - o EMPTY >
<!ATTLIST CHARREF cname CDATA  #IMPLIED  -- used if present --
                  value NUMBER #REQUIRED -- else uses code reference -->
<!-- Using the HTML IMG-tag. -->
<!ELEMENT IMG - o EMPTY>
<!ATTLIST IMG src CDATA #REQUIRED
alt CDATA #IMPLIED
align CDATA #IMPLIED
border NUMBER #IMPLIED >
<!--Examples of SDATA entities to be overridden. -->
<!ENTITY aring "<CHARREF CNAME='aring' VALUE='229'>" >
<!ENTITY bdquo "<CHARREF VALUE='132'>" >
<!ENTITY ldquo "<CHARREF VALUE='147'>" >
<!ENTITY quot "<CHARREF CNAME='quot' VALUE='34'>" >
<!ENTITY lt "<CHARREF CNAME='lt' VALUE='60'>" >
<!ENTITY ap "<FONT FACE=Symbol>»</FONT>" >
<!ENTITY rArr "<FONT FACE=Symbol>Þ</FONT>"
-- double right arrow -->
<!ENTITY rarr "<FONT FACE=Symbol>®</FONT>" -- right arrow-->
<!ENTITY alpha "<FONT FACE=Symbol>a</FONT>" >
<!ENTITY beta "<FONT FACE=Symbol>b</FONT>" >
<!ENTITY sigma "<FONT FACE=Symbol>s</FONT>" >
<!ENTITY tau "<FONT FACE=Symbol>t</FONT>" >
<!ENTITY Delta "<FONT FACE=Symbol>D</FONT>" >
<!ENTITY xover '<IMG SRC="../entities/x_ov.gif"
ALT="x overscore">' >
<!ENTITY yover '<IMG SRC="../entities/y_ov.gif"
ALT="y overscore">' >
<!ENTITY zover '<IMG SRC="../entities/z_ov.gif"
ALT="z overscore">'>
...
you get the idea...
3. The DSSSL (.dsl) script:
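;; element, empty-element and entity-ref are flow object classes of
;; Jade's SGML transformation backend; copy-attributes is assumed to
;; be a helper defined elsewhere in the stylesheet.
(element FONT
  (make element gi: "FONT"
        attributes: (copy-attributes)
    (process-children)))
(element IMG
  (make empty-element gi: "IMG"
        attributes: (copy-attributes)))
;; Emit &cname; if CNAME is given, otherwise &#value;
(element CHARREF
  (make entity-ref
    name: (if (attribute-string "CNAME")
              (attribute-string "CNAME")
              (string-append "#" (attribute-string "VALUE")))))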
The construction rules copy FONT and IMG elements to the HTML output
unchanged. A CHARREF element becomes a named entity reference if the
CNAME attribute is present, and a numeric character reference built
from VALUE otherwise.
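For readers less familiar with DSSSL, the decision made by the CHARREF rule above can be restated as a small Python function. This is a sketch of the logic only; the function name charref_to_entity is hypothetical and does not appear in the stylesheet.

```python
# Illustrative restatement of the CHARREF construction rule's logic.
def charref_to_entity(value, cname=None):
    """Build the reference written to the HTML output.

    value -- the numeric code from the VALUE attribute (always given)
    cname -- the entity name from the CNAME attribute, or None if the
             attribute was not specified
    """
    if cname is not None:
        # CNAME present: keep the named entity, e.g. &aring;
        return "&" + cname + ";"
    # CNAME absent: fall back to a numeric character reference
    return "&#" + str(value) + ";"


named = charref_to_entity(229, "aring")
numeric = charref_to_entity(132)
```

Here named is the string "&aring;" and numeric is "&#132;", matching what the entity declarations for aring and bdquo in part 2 would produce.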
This works very well, and I find it more satisfying than the usual
choice between a blanket SDATA mapping that cannot be influenced and
blanket SDATA preservation.
Still, I'd love to hear about the workarounds that are used by
others.
Regards, Steffen
---------
steffen heinrich, berlin, germany
"When you're chewing on life's gristle
Don't grumble, give a whistle
And DSSSL helps things turn out for the best..."
(Monty Python overheard)
DSSSList info and archive: http://www.mulberrytech.com/dsssl/dssslist