RE: Allowed characters in element id's (in The DSSSList Digest V2 #31)

Subject: RE: Allowed characters in element id's (in The DSSSList Digest V2 #31)
From: MARK.WROTH@xxxxxxxxxxx (Wroth, Mark)
Date: Fri, 1 May 1998 08:27:38 -0700

> ----------
> From:
> owner-dssslist-digest@xxxxxxxxxxxxxxxx[SMTP:owner-dssslist-digest@mulberry
> Reply To: 	dssslist@xxxxxxxxxxxxxxxx
> Sent: 	Thursday, April 30, 1998 11:32 PM
> To: 	dssslist-digest@xxxxxxxxxxxxxxxx
> Subject: 	The DSSSList Digest V2 #31
> The DSSSList Digest        Friday, May 1 1998        Volume 02 : Number 031
> Norman Walsh <ndw@xxxxxxxxxx> (marked >>), replying to 
> / Bill Raynor <braynor@xxxxxxx> (marked | ) was heard to say:
> | How do I determine what's allowed in an id= attribute? I'd like to use
> | underscores and periods but jade gets upset about extraneous punctuation
> | marks in the id. I'm generating the sgml from another system and would
> | like to give the elements an easy-to-read name.
	>>The characters that are allowed in an ID are controlled by the
	>>SGML declaration.  If memory serves, IDs have to be NAMEs and
	>>NAMEs have to be composed of an LCNMSTRT or UCNMSTRT character
	>>followed by zero or more LCNMCHARs or UCNMCHARs.
	>>In other words, you can control the set of characters that are
	>>allowed to start a name and the set of characters that can be
	>>contained in a name (after the start) independently.
> | Is this something that can be changed, or is that not a wise thing to
> | do?
	>>It can be changed in the declaration.  Off the top of my head,
	>>document interchange is the only problem that I foresee in
	>>broadening the characters allowed in a NAME.  But I accept that
	>>the top of my head is sometimes empty. ;-)

As it happens, I've had occasion to do this fairly extensively, albeit for a
different purpose (I needed entity references of the form "{a'}" instead of
"&aacute;" for compatibility with a different application).  The name
character declarations were a bit fussy, and required specifying the
punctuation characters by their character numbers rather than as
literals.  Other than that, I've encountered no problems with the modified
declaration (once it was pointed out to me how to get Jade to *read* the
modified declaration :-) ).
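
For anyone who wants to check generated IDs before handing them to Jade,
here is a rough Python sketch of the rule Norman describes: one
name-start character followed by zero or more name characters.  It assumes
the reference concrete syntax as the baseline (letters start a name;
letters, digits, "-" and "." continue it) and ignores the NAMELEN limit.
The function name and the extra_start / extra_chars parameters are my own
invention for illustration, not anything Jade or SP provides.

```python
import string

# Assumed baseline: the reference concrete syntax, where letters may start
# a name and letters, digits, "-" and "." may continue it.
DEFAULT_NAME_START = set(string.ascii_letters)
DEFAULT_NAME_CHARS = DEFAULT_NAME_START | set(string.digits) | {"-", "."}


def is_sgml_name(text, extra_start="", extra_chars=""):
    """Return True if text is a NAME under the (hypothetical) extended sets.

    extra_start stands in for characters added as name-start characters;
    extra_chars for characters added as name characters only.
    """
    if not text:
        return False
    start = DEFAULT_NAME_START | set(extra_start)
    # Every name-start character is also allowed later in the name.
    rest = DEFAULT_NAME_CHARS | start | set(extra_chars)
    return text[0] in start and all(ch in rest for ch in text[1:])


# chr(95) is the underscore, given by character number rather than literal.
UNDERSCORE = chr(95)
print(is_sgml_name("sec_intro"))                          # baseline sets
print(is_sgml_name("sec_intro", extra_chars=UNDERSCORE))  # "_" as name char
print(is_sgml_name("_intro", extra_chars=UNDERSCORE))     # "_" cannot start
```

Adding a character as a name character only still leaves it illegal in
first position, which is the independence of the two sets mentioned above.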

Since compatibility with a different convention (non-SGML based) was the
point of the change, I haven't seen any portability problems (yet) -- but I
hasten to add that the application is still in prototype.

(On a related [to my application] note: does anyone know how to get the RTF
backend to compose a character from a base character and a diacritical mark?
The test case is the letter "o" with a cedilla accent.  So far as I can
determine the only way to create it is by composing it from the base
character and accent, and I can't get that result from Jade. Of course, I
can't get it in MSWord itself, either, so it may be a limitation of the
target system, rather than Jade/DSSSL/SGML :-(.  The test case is correctly
set in TeX with "\c{o}", for whatever that's worth.)
