Re: Free beer and multiple special characters in XML

Subject: Re: Free beer and multiple special characters in XML
From: Wendell Piez <wapiez@xxxxxxxxxxxxxxxx>
Date: Thu, 02 Sep 1999 14:01:56 +0100
Regan,

Probably no one picked up before because they just weren't sure what you
needed. But what with that offer for free beer you are bound to get many
responses.

Try declaring an external parameter entity in your internal subset, with
a system identifier pointing to your entity declarations file; then
reference it right after the declaration, like so:

<!DOCTYPE whatnot [
<!ENTITY % entitydecls SYSTEM "entitydecls.ent" >
%entitydecls;
]>
<whatnot>
&entity;
</whatnot>

The SYSTEM identifier takes a URI, so it can be a relative path or an
absolute URL. Any entities declared in entitydecls.ent can then be
referenced in the document, just as if you had declared them inline.
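
For illustration, entitydecls.ent might look something like the sketch
below. The file name is the one used in the example above; the three
entities are only a guess at what Word-generated HTML is likely to throw
at you, so extend the list as needed. Using numeric character references
as the replacement text keeps the file independent of its encoding.

```xml
<!-- entitydecls.ent: hypothetical contents, shared by all documents -->
<!ENTITY eacute "&#233;">  <!-- e with acute accent -->
<!ENTITY nbsp   "&#160;">  <!-- no-break space -->
<!ENTITY copy   "&#169;">  <!-- copyright sign -->
```

With a file like this in place, a document carrying the four-line DOCTYPE
above could use &eacute; in its content without declaring it locally.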

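Be warned, however, that a validating parser will complain that your DTD
lacks declarations for elements. A parser checking only for
well-formedness (such as the one in most XSL processors) should pass it
fine, provided it actually reads external parameter entities, which the
XML spec does not require of a non-validating parser.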
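
If you ever do need the document to validate, one option (a sketch only,
and not something your setup requires) is to put element declarations in
the internal subset alongside the entity reference. This assumes the
whatnot element holds nothing but text, and that entitydecls.ent declares
eacute:

```xml
<!DOCTYPE whatnot [
<!ENTITY % entitydecls SYSTEM "entitydecls.ent" >
%entitydecls;
<!ELEMENT whatnot (#PCDATA)>
]>
<whatnot>
&eacute;
</whatnot>
```

For a real document you would need a declaration for every element type
you use, which is usually a job for a proper external DTD.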

--Wendell

At 09:14 AM 9/2/99 -0700, you wrote:
>Hi,
>	No one answered my question last time so I thought I would spice it
>up. (However, I would be happy to buy anyone a glass of their favorite
>beverage if they were kind enough to help!)
>	I am looking for the best way to make sure that special characters
>are handled correctly in my XML documents. I don't know ahead of time what
>special characters will be entered, so the possibilities are huge. I was
>hoping there was a way to just refer to listings of character entities
>instead of having to include them in all my XML docs. I don't have a great
>understanding of this issue, so if the answer is obvious, please excuse my
>ignorance.
>
>-Regan
>
>-----Original Message-----
>From: regan@xxxxxxxxxxx [mailto:regan@xxxxxxxxxxx]
>Sent: Tuesday, August 31, 1999 6:26 PM
>To: xsl-list@xxxxxxxxxxxxxxxx
>Subject: multiple special characters in XML
>
>
>Hi all,
>	I have never had a good grasp of special character handling but was
>able to get by until now...
>	I hope some kind soul here will help with my current crisis.
>	We have an application that takes sections of user-generated HTML
>files, embeds these sections into a large XML file, then later, when
>requested, generates an HTML file from the XML and an XSL file (using XT).
>Our users have started introducing funny characters into the HTML (OK, what
>happens is they use Microsoft Word to introduce the funny characters and
>Word does the conversion to HTML, and we end up with "&eacute;" or some such
>in our HTML, and then in our XML).
>	To handle this example I can add <!ENTITY eacute "é"> to the XML
>header. But then if they add something else cute tomorrow I am stuck with
>bad XML again, until I add a new declaration.
>	I could add all possible declarations now and have huge XML
>documents (we store 1000s of them). Alternatively, I could look for all
>the referenced entities and construct an appropriate header for each
>document, which seems like a lot of overhead when they seldom add such
>things.
>	But, are these the only answers? Surely there is a way to reference
>these special characters as they are listed in other documents available on
>the net? 
>	Please help if you can.
>
>Thanks,
>Regan Gill
>regan@xxxxxxxxxxx
>
>
> XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

======================================================================
Wendell Piez                            mailto:wapiez@xxxxxxxxxxxxxxxx
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9635
Suite 207                                          Phone: 301/315-9631
Rockville, MD  20850                                 Fax: 301/315-8285
----------------------------------------------------------------------
  Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================