RE: Why Doesn't IE5 use the DTD to Validate?

Subject: RE: Why Doesn't IE5 use the DTD to Validate?
From: "Sall, Ken" <ksall@xxxxxxx>
Date: Wed, 31 Mar 1999 19:52:16 -0500

Jonathan,

Thanks very much for your prompt and candid reply. I visited the MS Workshop
page,
http://msdn.microsoft.com/downloads/samples/internet/xml/xml_validator/default.asp
downloaded the demo, and it does indeed show that it is _possible_ to make
IE5 report validity errors when an XML doc doesn't conform to the DTD it
references.

However, the operative term here is "possible". My understanding of a
"validating parser" (which is what Microsoft says IE5 contains) is one that
_always_ reports validation problems.

Here's what you're expecting people to do to enable validation against a
DTD:

1) Author: Visit the above Workshop page to find the relevant JavaScript
code to set the validateOnParse flag. Of course, they have to know about this
page. (And, BTW, I couldn't get any of my examples, valid or invalid, to
work using the online demo; I had to download it to get it to work for
anything but your weatherReport.xml. I got this msg online:
	Access is denied. 
	File: http://members.home.com/kensall/tests/collection1pubbugs.xml
	Line: 0, Position: 0, ErrorCode: 0x80070005)

2) Author: Must insert this JavaScript code in every document he wants
validated, or at least a reference to a *.js file. Big pain.

3) Author: Must be content with using non-standard MSDOM features to make
this work. No comment.

4) End User: Must have JavaScript enabled. Big assumption here.
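
For anyone who hasn't seen the Workshop sample, steps 1-3 amount to something
like the sketch below. It is only an illustration under stated assumptions: it
relies on the validateOnParse and resolveExternals properties named in
Jonathan's message, plus the MSXML DOM's async, load() and parseError members.
The function name checkValidity, and passing the DOM object in as a parameter,
are hypothetical and not taken from the Workshop code.

```javascript
// Sketch only: checkValidity is a hypothetical helper, not the Workshop sample.
// "doc" is assumed to be an MSXML DOM document, e.g. in IE5:
//   var doc = new ActiveXObject("Microsoft.XMLDOM");
function checkValidity(doc, url) {
  doc.async = false;            // load synchronously so load() returns the outcome
  doc.validateOnParse = true;   // report validation errors (off when browsing)
  doc.resolveExternals = true;  // load the external DTD or schema
  if (doc.load(url)) {
    return { ok: true, message: "" };
  }
  // load() fails for any parse problem (well-formedness, validity, access),
  // so the caller has to read the reason to tell them apart.
  var err = doc.parseError;
  return {
    ok: false,
    message: err.reason + " (line " + err.line + ", position " + err.linepos + ")"
  };
}
```

In IE5 this would presumably be called as
checkValidity(new ActiveXObject("Microsoft.XMLDOM"), "mydoc.xml"), which is
exactly the extra scripting step the list above objects to.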

I'd really appreciate hearing what options you considered and discarded for
allowing developers (I would say "end users") to turn on validation, because
I don't see how the XML validation option is much different than whether a
user wants Java or JavaScript enabled. And I would argue that the _default_
should be Validation Enabled, if you claim to have a validating parser.

How can XML ever become a useful data exchange mechanism if one can't easily
verify that the data conforms to the schema? If I wrote an XML document and
_didn't_ want validation, I just wouldn't reference a DTD; I'd have to be
satisfied with well-formedness. OTOH, if as an author, I specify a DTD, I
want to be sure I'm using the language correctly, so I'd want the validation
errors. No fuss, no muss.
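
The default argued for here (validate when, and only when, the document
declares a DTD) is simple to state in code. The sketch below is a hypothetical
illustration, not anything IE5 does: declaresDtd is an invented name, and it
does nothing more than scan the prolog for a <!DOCTYPE declaration ahead of
the root element.

```javascript
// Sketch only: decide whether validation should default to "on" by checking
// whether the document's prolog contains a DOCTYPE declaration.
function declaresDtd(xmlText) {
  var i = 0;
  while (i < xmlText.length) {
    if (/\s/.test(xmlText.charAt(i))) {
      i += 1;                                   // skip whitespace (and a BOM)
    } else if (xmlText.slice(i, i + 2) === "<?") {
      var endPi = xmlText.indexOf("?>", i + 2); // XML declaration or PI
      if (endPi < 0) return false;
      i = endPi + 2;
    } else if (xmlText.slice(i, i + 4) === "<!--") {
      var endComment = xmlText.indexOf("-->", i + 4);
      if (endComment < 0) return false;
      i = endComment + 3;
    } else {
      // First real markup: either the DOCTYPE or the root element.
      return xmlText.slice(i, i + 9) === "<!DOCTYPE";
    }
  }
  return false;
}
```

Under this policy a document with no DOCTYPE is only checked for
well-formedness, and one that references a DTD gets validation errors without
the author doing anything extra.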

Furthermore, let's say I decide I want to use the standard Foobar.DTD but
I'm not the author or even a contributor to the DTD. I'm not even on the
foo-dev mailing list :-) So I study the DTD, write my XML doc, ref the DTD,
and want to check that it is correct. I can't. (Let's assume I don't know
about the Magic JavaScript validateOnParse Enabler Ring.) Then, suppose the Foo
Standards Group changes Foobar.DTD in such a way that my XML doc is no longer
valid. I wouldn't even know there was a change because, unless I jump through
hoops, I can't tell if my doc is valid. Don't you think this is a
fundamental problem?

-----------
If you really think what Microsoft has implemented is a "validating parser",
then would you please comment on these quotes from the XML 1.0
Recommendation and, following that, a few points from Tim Bray's Annotated
XML spec.? Thanks.
-----------
5.1 Validating and Non-Validating Processors

 [stuff deleted] 

"[Definition:] Validating processors must report violations of the
constraints expressed by the declarations in the DTD, and failures to
fulfill the validity constraints given in this specification. To
accomplish this, validating XML processors must read and process
the entire DTD and all external parsed entities referenced in the
document. 

Non-validating processors are required to check only the document
entity, including the entire internal DTD subset, for well-formedness."
----
5.2 Using XML Processors

The behavior of a validating XML processor is highly predictable; it
must read every piece of a document and report all well-formedness
and validity violations. Less is required of a non-validating
processor; it need not read any part of the document other than the
document entity.
----

http://www.xml.com/axml/testaxml.htm

	Tim Bray's annotations:

"Validity Is Not An Option

XML evangelists, such as myself,
take great glee in pointing out that
XML, unlike SGML, has no optional
features; the result, we claim
triumphantly, is that any XML
processor in the world should be
able to read any XML document in
the world (well, modulo character
encoding issues).

"Aha!" claim some ungrateful
doubting Thomases; "XML
distinguishes well-formedness
and validity, and that's an option!"

Wrong. Anything that's
well-formed is an XML document,
and any XML processor has to be
able to read any well-formed
document. If a document wants to
aspire to the higher karmic plane
of validity, well good on it, but
that's an extra, not an optional
feature of XML."

	"Both Validating And Non-Validating

Some members of the first
generation of XML processors
actually are both validating and
non-validating. Usually, they have
some way to turn the validating
behavior on and off. Nothing in the
spec rules this out, and it seems
to be useful.

In a similar vein, note that while a
non-validating processor doesn't
have to read anything but the
document entity, nothing forbids
this, and in fact, it's pretty easy to
do it, from the programming point
of view. So many of those
first-generation processors will
read external entities, even if not
validating; the good ones also
include a switch so you can turn
this behavior on and off;
sometimes you absolutely don't
want to read external entities, for
performance reasons."

Thanks for reading this.

- Ken Sall                           ksall@xxxxxxx, kensall@xxxxxxxx
- Century Computing, Inc.            http://www.cen.com/
- NG-HTML: Next Generation HTML      http://www.cen.com/ng-html/
- XML at Web Developers Virtual Lib  http://WDVL.com/Authoring/Languages/XML/
- MW3: Motif on the World Wide Web   http://www.cen.com/mw3/

> -----Original Message-----
> From: Jonathan Marsh [mailto:jmarsh@xxxxxxxxxxxxx]
> Sent: Wednesday, March 31, 1999 5:03 PM
> To: 'xsl-list@xxxxxxxxxxxxxxxx'
> Subject: RE: Why Doesn't IE5 use the DTD to Validate?
> 
> 
> This is as designed, not a bug.
> 
> The IE5 XML parser is a validating parser, with two properties set through
> DOM extensions to control DTD handling:
>  - validateOnParse determines whether validation errors are presented to the
>    user.
>  - resolveExternals determines whether the DTD or XML Schema is loaded and
>    datatypes, default values, etc. are honored.
> 
> The values of these properties when browsing directly to XML documents are
> validateOnParse=false and resolveExternals=true.
> 
> When browsing XML documents on the Web, surfacing validation errors is of
> little apparent value.  I would not expect publishers to author both a DTD
> or XML Schema and documents that don't conform to that DTD/Schema.  So the
> vast majority will not generate validation errors.  For those that declare a
> DTD and are invalid, it is no better to give the user a validation error
> instead of displaying the document; in fact, the validation error could
> prevent the user from viewing an otherwise perfectly good document.  Also,
> the performance penalty for validation is significant and should not be
> imposed on end-users without good reason.
> 
> The only scenario we could come up with where validation is useful when
> browsing XML documents is when the browser is used as a development tool,
> allowing easy checking of well-formedness and validation for a document in
> progress.  This scenario can be accomplished by a number of alternative
> mechanisms without impacting the browsing experience - a simple tool that
> validates an XML document could be written in a few lines of JavaScript, see
> http://msdn.microsoft.com/downloads/samples/internet/xml/xml_validator/default.asp
> for an example.
> 
> We considered several mechanisms for allowing developers to "turn on"
> validation errors but did not find a clean solution that could be
> implemented in time for the IE5 release.
> 
> - Jonathan Marsh
> 
> > -----Original Message-----
> > From: Sall, Ken [mailto:ksall@xxxxxxx]
> > Sent: Wednesday, March 31, 1999 6:37 AM
> > To: 'xsl-list@xxxxxxxxxxxxxxxx'
> > Subject: RE: Why Doesn't IE5 use the DTD to Validate?
> > 
> > 
> > Thanks, Stephen.
> > I've added an example that illustrates your point that IE5 detects DTD
> > syntax errors.
> > 
> > http://members.home.com/kensall/tests/collection1bugsdtd.xml
> > http://members.home.com/kensall/tests/collection1bugs.dtd
> >  
> > However, if anyone from Microsoft can explain why IE5 doesn't actually use
> > the DTD to validate the document (the way that IE5 Beta 2 did), I'd
> > appreciate it. This problem will be published in an article shortly (in the
> > larger context of positive things you can do with IE5 with XML/XSL) and it
> > would be great to state correctly what Microsoft plans w.r.t. DTD
> > processing.
> > 
> > TIA
> > - Ken Sall                           ksall@xxxxxxx, kensall@xxxxxxxx


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

