Re: underscores in XML tags

Subject: Re: underscores in XML tags
From: Chuck Darney <cdarney@xxxxxxxxxxxxxxxx>
Date: Tue, 08 Dec 1998 08:47:30 -0500

Linda van den Brink wrote:
> Hi all,
> I ( - still a newbie) am using Jade + DSSSL to convert a large number of
> well-formed XML files to HTML format. I'm having a problem with some of the
> tags I use because they contain the underscore ('_') character. I get
> parsing errors saying that this character is not allowed in start tags or
> end tags.
> However, according to the XML specification (sec 2.3) this character is
> allowed in XML tag names. Can I get Jade to accept it? I don't know how to
> configure Jade for XML input, but until I started using underscores I didn't
> have any problems...

This is the response I got from Tony Graham when I asked the same
question:

...Chuck Darney

At 12 Feb 1998 09:34 -0500, Chuck Darney wrote:
 > Within my SGML instance I have several Text Entities referenced.  In
 > the header I have an Entity:
 > <!ENTITY QUOTE_EXPIRE_DT "June 25, 1997">
 > Later in the instance I reference "QUOTE_EXPIRE_DT".  
 > The underscores are not valid characters in the entity name.  Is this
 > a limitation with SGML or DSSSL?  Is there a way around it? I'm passing
 > these entity references from a Powerbuilder program and a SQL
 > (which doesn't care for dashes).

It's controlled by your SGML Declaration, and, if you don't supply one
when you're parsing, an SGML system will infer one for you.

The specific portion of the SGML Declaration of interest here is the
"naming rules".  Jade's default inferred SGML Declaration uses the
same naming rules as SGML's "Reference Concrete Syntax".  To allow
underscores in entity names (and other SGML names), you need to supply
an SGML Declaration that includes the underscore character.  Using the
DocBook SGML Declaration as an example, you need to add "_" to the
LCNMSTRT and UCNMSTRT parameters:

        NAMING  LCNMSTRT "_"
                UCNMSTRT "_"
                LCNMCHAR ".-"
                UCNMCHAR ".-"
        NAMECASE        GENERAL YES
                        ENTITY  NO

This will allow underscores anywhere in names, including as the first
character.  You need to add it in two places because you are declaring
the uppercase and lowercase forms, which just happen to be the same.
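
To see the effect of the naming rules without running a parser, here
is a minimal Python sketch of the name check they imply.  It is only
an illustration, not how Jade or nsgmls implement it: the function
name make_name_checker and its parameters are made up for this
example, only ASCII letters and digits are considered, and quantity
limits such as NAMELEN are ignored.

```python
import string


def make_name_checker(extra_start="", extra_char=".-"):
    """Return a predicate that tests whether a string is a valid name.

    extra_start stands in for LCNMSTRT/UCNMSTRT and extra_char for
    LCNMCHAR/UCNMCHAR (hypothetical parameter names).  Quantity limits
    such as NAMELEN are deliberately ignored in this sketch.
    """
    # Name start characters: letters plus anything declared in *NMSTRT.
    start_chars = set(string.ascii_letters) | set(extra_start)
    # Name characters: every start character, digits, and *NMCHAR.
    name_chars = start_chars | set(string.digits) | set(extra_char)

    def is_name(candidate):
        if not candidate or candidate[0] not in start_chars:
            return False
        return all(ch in name_chars for ch in candidate)

    return is_name


# Rules matching the Reference Concrete Syntax (no extra start characters).
reference_rules = make_name_checker()
# The same rules with "_" added as a name start character.
underscore_rules = make_name_checker(extra_start="_")

print(reference_rules("QUOTE_EXPIRE_DT"))   # False
print(underscore_rules("QUOTE_EXPIRE_DT"))  # True
print(underscore_rules("_private"))         # True
```

Because name start characters are also allowed later in a name,
adding "_" to the start set is enough to permit it in every position.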

You can supply your SGML Declaration by putting it on the Jade
command line before the filename of your SGML file (or before your
DTD, if the DTD filename is also on the command line).  You can
also name the SGML Declaration to use with the "SGMLDECL"
keyword in your catalog file.  (See "charset.htm" from the nsgmls
distribution for more information on the catalog format.)  However,
I'm not sure what would happen if you referenced an SGML Declaration
that used name characters and quantities, etc., that conflicted with
the requirements for processing the DSSSL stylesheet DTD.
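
As a rough sketch of the catalog route, the entry might look like the
fragment below.  The file name underscore.dcl is a placeholder for
wherever you save the modified declaration.

```
-- catalog: SGML Declaration to use when a document supplies none --
SGMLDECL "underscore.dcl"
```

The command-line equivalent would be something along the lines of
"jade -d mystyle.dsl underscore.dcl mydoc.sgm", where mystyle.dsl and
mydoc.sgm are likewise placeholder names.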


Tony Graham
