Subject: RE: About DSSSL 2 Specifications
From: Tony Graham <tgraham@xxxxxxxxxxxxxxxx>
Date: Sun, 15 Aug 1999 13:50:02 -0400 (EST)

At 15 Aug 1999 10:50 -0400, Didier PH Martin wrote:
> I may have done here two mistakes a) wrong example, b) bad interpretation of
> the spec. Matthias, could you just give a small example to illustrate what
> you are saying. We would all gain knowledge on how to use this feature. And
> in the same vein, to document it without interpretation error. Thanks, for
> fixing the error and thanks in advance providing a concrete example.

If I may try in Matthias's place, the time-honoured example of a
character in the document character set described with a minimum
literal is "SGML User's Group Logo".

From the SGML Declaration example in Section 15.1.1 of ISO 8879 (page
479 of the SGML Handbook):

    CHARSET
    BASESET  "ISO 646-1983//CHARSET
              International Reference Version (IRV)//ESC 2/5 4/0"
    DESCSET    0   9  UNUSED
               9   2   9
              11   2  UNUSED
              13   1  13
              14  18  UNUSED
              32  95  32
             127   1  UNUSED
    BASESET  "ISO Registration Number 109//CHARSET
              ECMA-94 Right Part of Latin Alphabet Nr. 3//ESC 2/9 4/3"
    DESCSET  128  32  UNUSED
             160   5  32
             165   1  "SGML User's Group Logo"
             166  88  38
             254   1  127
             255   1  UNUSED

This tells us many things (including that the character set references
are out of date).

This is a copy of the part of the SGML Declaration that describes the
character set used in documents conforming to that SGML Declaration.
For the most part, the character numbers (since you're only playing
with numbers at this point) in the document character set are described
in terms of character numbers in a known character set referenced in
the BASESET portion. The lefthand column of numbers indicates character
numbers in the document character set, the middle column indicates the
extent of a range, and the third column, when it holds a number,
indicates character numbers in the previous BASESET character set.

For example, "9 2 9" indicates that the two characters in the document
character set starting at character number 9 are the same as the two
characters in the previous BASESET character set with character numbers
starting at character number 9.

The other two possibilities are that a character number in the document
character set represents a non-SGML character, in which case it should
be declared UNUSED in the description of the document character set, or
that a character number represents a character that isn't part of the
character set referenced in the previous BASESET.

When we can't describe a character number in terms of a character
number in a known character set, we can use a "minimum literal" (i.e. a
string with a restricted range of allowed characters) to describe it.
Hence the example:

    165 1 "SGML User's Group Logo"

where we're saying that character number 165 is something that we're
describing as "SGML User's Group Logo".

All of these machinations are for the benefit of the SGML parser
recognising characters that are significant in markup. If you have
character number 165 in your document, it's still character number 165
when it comes out of the parser, and it's not magically turned into the
string "SGML User's Group Logo".

You could also do:

    165 50 "SGML User's Group Logo"

(and modify the rest of the DESCSET accordingly) to declare 50
characters with this minimum literal. It doesn't matter what you do in
the DESCSET provided that all of the characters that are significant in
markup are accounted for once and once only.

The DSSSL engine, since it also reads the SGML Declaration, can make
the connection between character number 165 and the minimum literal
"SGML User's Group Logo". The <literal-describe-char> mechanism in the
DSSSL engine gets you from the minimum literal to the character named
"logoSGML". "logoSGML" is also used in the example <other-chars>
declaration in Section 7.1.5 of the DSSSL standard.

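To make the three kinds of DESCSET entry concrete, here is a minimal Python sketch that looks up a document character number in the entries quoted above. It is only an illustration: the names DESCSET and describe are made up for this example and are not part of any SGML or DSSSL tool, and for brevity the sketch does not record which BASESET each entry belongs to.

```python
from typing import List, Optional, Tuple, Union

# (first document character number, count, description). The description is
# an int (first character number in the preceding BASESET), the keyword
# "UNUSED", or a minimum literal. Entries are copied from the two DESCSET
# sections of the declaration quoted above.
DESCSET: List[Tuple[int, int, Union[int, str]]] = [
    (0, 9, "UNUSED"), (9, 2, 9), (11, 2, "UNUSED"), (13, 1, 13),
    (14, 18, "UNUSED"), (32, 95, 32), (127, 1, "UNUSED"),
    (128, 32, "UNUSED"), (160, 5, 32),
    (165, 1, "SGML User's Group Logo"),
    (166, 88, 38), (254, 1, 127), (255, 1, "UNUSED"),
]


def describe(number: int) -> Tuple[str, Optional[Union[int, str]]]:
    """Return how a document character number is described (hypothetical helper)."""
    for start, count, desc in DESCSET:
        if start <= number < start + count:
            if desc == "UNUSED":
                return ("unused", None)
            if isinstance(desc, int):
                # Same offset into the range, applied to the base set number.
                return ("base", desc + (number - start))
            # Anything else is a minimum literal; the number itself is unchanged.
            return ("literal", desc)
    raise ValueError(f"character number {number} is not described")


print(describe(10))   # second character of the "9 2 9" range
print(describe(165))  # described only by a minimum literal
```

Note that describe(165) only reports the literal attached to the number; as said above, the character in the document is still character number 165.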
I don't know how you get from "logoSGML" to a character number in a
font, but that's not today's question.

If you need more information, Robin Cover (surprise, surprise) has a
section about the SGML Declaration among his SGML/XML pages, plus
there's a conference paper of mine about the CHARSET portion of the
SGML Declaration at "http://www.mulberrytech.com/papers/docchar.htm".

Regards,


Tony Graham
======================================================================
Tony Graham                            mailto:tgraham@xxxxxxxxxxxxxxxx
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9632
Suite 207                                          Phone: 301/315-9631
Rockville, MD  20850                                 Fax: 301/315-8285
----------------------------------------------------------------------
Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================

DSSSList info and archive: http://www.mulberrytech.com/dsssl/dssslist