SUMMARY: XML Validation Issues (was: several threads)

Subject: SUMMARY: XML Validation Issues (was: several threads)
From: "Sall, Ken" <ksall@xxxxxxx>
Date: Sun, 4 Apr 1999 14:04:03 -0400

It seems useful to summarize some of the many issues generated directly or
indirectly by my original post, "Why Doesn't IE5 use the DTD to Validate?",
and the spin-off threads "XML is broken" (XSL list), "Between raw and cooked
II: Are? DTDs are just for validation" (sic; xml-dev), and "Is validity an
option?" (xml-dev).

  - What precisely is a validating parser obligated to do? What type of
    parsing behavior can XML authors _always_ expect from a "validating
    parser"?

  - What aspects of XML processing are optional for a validating parser?

  - There is a major distinction between reading a DTD and validating,
    as James Clark states:

        "Reading the DTD and validating aren't the same thing.  Unless a
        document has standalone="yes", the browser should always read a
        provided DTD so that it can correctly

          - default attributes
          - normalize attribute values
          - expand entity references

        None of these things involve validation."

  - What specific aspect of a DTD (e.g., the inclusion of element
    declarations, not just entity declarations) should signal to the parser
    that it must report validation errors to the client?
 
  - Under what situations is a parser allowed to ignore external entities,
    the external DTD subset, and by extension, entity and attribute
    declarations in the external DTD subset?

  - When is well-formedness sufficient and validation overkill?
 
  - Should a web browser with an XML parser require the document author to
    enable validation via scripting, should validation be the default, or
    should the end user be able to toggle validation (and if so, how)?

  - What is the desired behavior of a validating parser in different client 
    situations (browsing vs. EDI vs. databases, etc.)? 

  - Is there a need for a term to describe a parser that falls between
    non-validating and validating? (Something that describes the behavior
    common to AElfred, IBM's xml4j, Sun's Project X parser and Microsoft's
    parser.)

  - Should the XML spec be changed to ensure that both non-validating and 
    validating parsers produce the same parse tree, such as by specifying
    that external entities and default attributes etc. be expanded? 

  - What does the XML spec say specifically about these issues? Where is the
    spec lacking in terms of specificity? Conformance section ref:
    http://www.w3.org/TR/REC-xml#sec-conformance

  - Should the XML Schemas Working Group address some of the holes in the
    XML spec, especially in terms of conformance? Should it be the job of
    the Infoset WG? SAX2?
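
To make James Clark's distinction concrete, here is a minimal sketch in
Python using the standard library's expat-based ElementTree, which is a
non-validating parser. It is only an illustration of one parser's behavior,
not a statement of what the spec requires. The document contents, the
"memo.dtd" system identifier, and the summarize() helper are all made up for
this example. The first document keeps its declarations in the internal
subset; the second points at an external subset that this parser never
fetches.

```python
import xml.etree.ElementTree as ET

# Declarations live in the internal DTD subset. A non-validating parser that
# reads them can default attributes, normalize attribute values, and expand
# entity references without checking the document against the content model.
INTERNAL = """<?xml version="1.0"?>
<!DOCTYPE memo [
  <!ELEMENT memo (body)>
  <!ELEMENT body (#PCDATA)>
  <!ATTLIST memo status CDATA    "draft"
                 tags   NMTOKENS #IMPLIED>
  <!ENTITY co "Example Corp">
]>
<memo tags="  a   b  "><body>From &co;</body><undeclared/></memo>"""

# Here the declarations would live in an external subset. "memo.dtd" is a
# hypothetical file name; ElementTree does not fetch it.
EXTERNAL = """<?xml version="1.0"?>
<!DOCTYPE memo SYSTEM "memo.dtd">
<memo tags="  a   b  "><body>text</body></memo>"""


def summarize(doc):
    """Parse doc and report what the parser delivered to the application."""
    root = ET.fromstring(doc)
    return {
        "status": root.get("status"),          # present only if defaulted
        "tags": root.get("tags"),              # shows attribute normalization
        "body": root.findtext("body"),         # shows entity expansion
        "has_undeclared": root.find("undeclared") is not None,
    }


if __name__ == "__main__":
    print("internal subset:", summarize(INTERNAL))
    print("external subset:", summarize(EXTERNAL))
```

With the internal subset, the parser reads the DTD and applies it, yet it
accepts the <undeclared/> element that the content model forbids, so reading
the DTD clearly did not imply validating. With the external subset skipped,
the same kind of document yields a different set of attributes, which is the
"same parse tree" concern raised in the list above.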

Anyone want to add to this list? More importantly, anyone want to take a
crack at summarizing what they believe to be the majority and/or minority
views?

- Ken Sall                           ksall@xxxxxxx, kensall@xxxxxxxx
- Century Computing Division         http://www.cen.com/
- AppNet, Inc.                       http://www.appnet.net/
- NG-HTML: Next Generation HTML      http://www.cen.com/ng-html/
- XML at Web Developers Virtual Lib  http://WDVL.com/Authoring/Languages/XML/
- MW3: Motif on the World Wide Web   http://www.cen.com/mw3/


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

