RE: pretty printer and PCDATA

Subject: RE: pretty printer and PCDATA
From: Tony Graham <tgraham@xxxxxxxxxxxxxxxx>
Date: Wed, 8 Sep 1999 10:48:37 -0400 (EST)
At 8 Sep 1999 12:02 +0200, Pieter Rijken wrote:
 > The (i think) easiest solution is to define in your dtd:
 > <!ELEMENT nmlist - - (#PCDATA)>      (NOT (#PCDATA)* !!)
 > <!ATTLIST nmlist ...>

There is NO DIFFERENCE between (#PCDATA) and (#PCDATA)*.

Since we're discussing XML, the text and production from the XML
Recommendation are:

3.2.2 Mixed Content

An element type has mixed content when elements of that type may
contain character data, optionally interspersed with child
elements. In this case, the types of the child elements may be
constrained, but not their order or their number of occurrences:

[51] Mixed ::= '(' S? '#PCDATA' (S? '|' S? Name)* S? ')*'
               | '(' S? '#PCDATA' S? ')'

In XML terms, there is no distinction between (#PCDATA) and
(#PCDATA)*.  They're both valid models for mixed content.
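
As a small illustration of that equivalence, here is a sketch of a
document with an internal DTD subset.  The element names list, a and b
are made up for this example and are not from the original question.
A validating XML parser should accept every a and b element below,
whether it holds text, nothing at all, or only spaces:

```xml
<?xml version="1.0"?>
<!DOCTYPE list [
  <!-- list, a and b are hypothetical names used only for this sketch -->
  <!ELEMENT list (a | b)*>
  <!-- second alternative of the Mixed production: no trailing asterisk -->
  <!ELEMENT a (#PCDATA)>
  <!-- first alternative of the Mixed production, with zero element names -->
  <!ELEMENT b (#PCDATA)*>
]>
<list>
  <a>some text</a>
  <b>some text</b>
  <a></a>
  <b></b>
  <a>   </a>
  <b>   </b>
</list>
```

Swapping the two content models between a and b should make no
difference to whether the document is valid.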

There is no chunking of #PCDATA into words and spaces, etc.  If
characters aren't markup, they're character data.

You could take out all of the character data and the document would
still satisfy both the (#PCDATA) and (#PCDATA)* models, or you could
replace all printing characters with spaces and still satisfy both
models.

 > and have in DSSSL the extra spaces trimmed off:
 > (element nmlist (process-children-trim))

This is correct.


Tony Graham
Tony Graham                            mailto:tgraham@xxxxxxxxxxxxxxxx
Mulberry Technologies, Inc.      
17 West Jefferson Street                    Direct Phone: 301/315-9632
Suite 207                                          Phone: 301/315-9631
Rockville, MD  20850                                 Fax: 301/315-8285
  Mulberry Technologies: A Consultancy Specializing in SGML and XML
