processing character entities

Subject: processing character entities
From: Boris Goldowsky <boris@xxxxxxxxxxxxxxxxxxxx>
Date: Mon, 19 Jul 1999 12:49:32 -0400 (EDT)
The recent discussion of the query construction rule reminded me of an
old dilemma.

What are the various solutions people on this list use for processing
character entities in SGML->SGML or SGML->HTML conversions? In my work
I translate a lot of SGML containing entities for foreign characters,
math symbols, etc. into HTML.  Some get turned into HTML entities,
some are dumbed down to ASCII, and others get turned into inline
graphics.

In the absence of QUERY, there's no obvious way to write a rule to
deal with these.  I know of two workaround solutions:

1. Preprocess entities into elements, modify your DTD to allow these
elements anywhere, then use an ELEMENT rule.

2. Write a function that does the equivalent of process-children,
except it also scans PCDATA for entities and processes them.  Use this
function everywhere you would normally use process-children.
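
For concreteness, the preprocessing step in #1 could look roughly like
the Python filter below, run over the source before it reaches the
parser.  This is only a sketch: the element name "ent", its "name"
attribute, and the KEEP set are made up for illustration, and the DTD
would still need a matching EMPTY element declared and allowed wherever
text can occur.  It also makes no attempt to skip attribute values,
comments, or CDATA marked sections, which a real filter would have to
handle.

```python
import re

# An SGML general entity reference such as &eacute;.  The trailing ";" is
# required here for simplicity, although SGML lets it be omitted in some
# contexts.
ENTITY_REF = re.compile(r"&([A-Za-z][A-Za-z0-9.-]*);")

# Hypothetical: references left alone so the parser still resolves them.
KEEP = {"amp", "lt", "gt"}


def entities_to_elements(text, element="ent"):
    """Rewrite &name; as <ent name="name"> (an EMPTY element, no end tag)."""

    def replace(match):
        name = match.group(1)
        if name in KEEP:
            return match.group(0)  # leave the reference untouched
        return '<%s name="%s">' % (element, name)

    return ENTITY_REF.sub(replace, text)
```

An ELEMENT rule for the new element can then choose the output per
entity name (HTML entity, ASCII fallback, or inline graphic).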

Workaround #1 makes the DTD ugly, and can cause parsing problems when
omitted tags are involved.  #2 makes the DSSSL ugly, and is likely to
be extremely expensive since it forces jade to look at characters as
individual nodes rather than just strings of text.  Are there any
other solutions that I'm overlooking?

Bng
--
Boris Goldowsky            Engineering & Development Manager
Information Please, LLC    boris@xxxxxxxxxxxxxxxxxxxx
www.infoplease.com         617 832-0324


 DSSSList info and archive:  http://www.mulberrytech.com/dsssl/dssslist

