Re: Why Doesn't IE5 use the DTD to Validate?

Subject: Re: Why Doesn't IE5 use the DTD to Validate?
From: Chris Lilley <chris@xxxxxx>
Date: Fri, 02 Apr 1999 00:36:04 +0200

Didier PH Martin wrote:

> Chris I don't see your point here. Let's clarify from the processing point
> of view.
> a) Except for external references resolution, simply by knowing XML syntax
> rules, a parser can parse a XML document even without a DTD.

Um, fixed attributes?
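
To make the fixed-attribute point concrete, here is a minimal sketch using
Python's standard expat bindings. The sample document, the `note` element
and its `kind` attribute are invented for illustration, and setting
`specified_attributes` is only a stand-in for a parser that ignores the
attribute declarations in the DTD.

```python
from xml.parsers import expat

# Hypothetical document: the "kind" attribute never appears in the
# instance, it is supplied only by the #FIXED declaration in the DTD.
DOC = b"""<?xml version="1.0"?>
<!DOCTYPE note [
  <!ELEMENT note (#PCDATA)>
  <!ATTLIST note kind CDATA #FIXED "memo">
]>
<note>hello</note>"""


def attributes_seen(specified_only):
    """Parse DOC and return {element name: attributes reported}."""
    seen = {}
    parser = expat.ParserCreate()
    # True: report only attributes written in the instance itself.
    # False (the expat default): also report values taken from the DTD.
    parser.specified_attributes = specified_only

    def start(name, attrs):
        seen[name] = dict(attrs)

    parser.StartElementHandler = start
    parser.Parse(DOC, True)
    return seen


print(attributes_seen(False))  # attribute declarations are honoured
print(attributes_seen(True))   # attribute declarations are ignored
```

The two calls report different attribute sets for the same document, so
anything downstream that keys on `kind` behaves differently depending on
whether the declarations were read.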

> b) What is important for a browser is the construction of a flow object tree
> not the structural integrity of the original document.

My experience after a couple of years as chair of the CSS WG tells me
otherwise.  XSL also requires a correct parse tree. DOM works on the
source tree, not the result tree.

> Let's now take the second statement. The browser is in fact a translation
> tool. It has to convert from one tree (i.e. the document tree) into an other
> one (the rendition object tree). The first structure being a hierarchy and
> the second one too. Thus, for the browser what is important is to be able to
> recognize an element from its content. Then, to create a rendition object
> associated to this element and display the content.

OK so far

>  Actually browsers do
> have a rendition Domain Specific Language: HTML.

We went over this one before; as you know, I assert that HTML is a
really poor rendering language since it was not designed for that
purpose and both CSS and XSL FO give you far superior layout and
formatting control.

>  Thus, the second tree is composed of HTML objects.

No, it is not.

> Now, more particularly concerning new browsing engines like Gecko (Mozilla)
> or Form+ (Microsoft). The rendition object model is now being replaced by a
> CSS or DOM object. For example the paragraph object is replaced by the block
> object. 

Well, that has been the case ever since the CSS1 spec.

> XML ----> CSS tree object.
> I say CSS rendition objects because actually this is the model followed by
> both browsers. 

It is, but you should say CSS rendering objects also because that is
what the CSS2 spec says. 

> This is not CSS, this is CSS object model.

No, this is CSS.

> To conclude: The browser do not need to know necessarily if under a
> particular element there should be an other one. It is concerned by knowing
> to which rendition object it can associate this element. Its main job is to
> render :-)

Using selectors (in CSS) or patterns (in XSL) for which you need to know
what the exact element hierarchy is, what all the attributes are
(including fixed ones and ones in parameter entities), which attributes
are declared as ID, and so on.
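
As a rough illustration of the ID case, the sketch below (Python, standard
expat bindings) resolves an ID lookup of the kind a `#intro` selector
needs. The function name `find_by_id`, the sample documents and the
`label` attribute are all hypothetical; the only point is that which
attribute counts as an ID comes from the declarations, not from the
instance.

```python
from xml.parsers import expat

BODY = b'<doc><sec label="intro"/><sec label="other"/></doc>'
# Same instance, with and without a declaration making "label" an ID.
WITH_DTD = b'<!DOCTYPE doc [<!ATTLIST sec label ID #IMPLIED>]>' + BODY
WITHOUT_DTD = BODY


def find_by_id(document, wanted):
    """Return names of elements whose ID-typed attribute equals wanted."""
    id_attrs = set()   # (element, attribute) pairs declared as type ID
    matches = []
    parser = expat.ParserCreate()

    def attlist(elname, attname, type_, default, required):
        # Called once per attribute declared in an ATTLIST.
        if type_ == "ID":
            id_attrs.add((elname, attname))

    def start(name, attrs):
        for attname, value in attrs.items():
            if (name, attname) in id_attrs and value == wanted:
                matches.append(name)

    parser.AttlistDeclHandler = attlist
    parser.StartElementHandler = start
    parser.Parse(document, True)
    return matches


print(find_by_id(WITH_DTD, "intro"))
print(find_by_id(WITHOUT_DTD, "intro"))
```

With the declaration present the lookup finds the `sec` element; without
it nothing is known to be an ID, so the same lookup finds nothing.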

> The browser is in fact an interpreter. Actually, the premises is that the
> published code (i.e. the XML document) is correct and validation (i.e.
> structural integrity check) occurred during the document's creation. 

Experience shows that this premise holds only if validating confers an
immediate benefit. If it appears to work, i.e. if a browser claims to
validate but does not, or silently hides validation errors, then there
is no immediate benefit, so people will not do it. Which leads, longer
term, to the situation where web sites can only accommodate a small
handful of browsers, all pages must be carefully tested on all those
browsers (version, platform, etc) and everything else is out of luck. A
situation in which the costs of designing a commercial website are
greatly increased. The current situation, in fact.

> This model could be changed by allowing the
> user to set a switch that requires that structural integrity is to be done
> when a DTD is present. This is something that may be required for e-commerce
> document.

Heh. What you are saying is, have some sort of switch in the document
which says whether the document is asserted to be valid or whether it is
just well formed?

But that of course already exists, and people can choose to make just
well-formed documents if they want. And people can choose to use
validating parsers or just parsers which do well-formedness checking.

There is only a problem when there is misrepresentation: when a document
asserts that it is valid but is not, or when a browser asserts that it
is validating when it is not.

> From what I understood from certain position taken by this list members is
> that the default behavior should be that when a DTD is present there should
> be structural integrity validation. If not, no validation is done.

That is the intent of the XML 1.0 specification, yes.

> This
> seems to be a good mechanism and will probably included in 6.x browsers
> version. 

It was already included in IE5b2, actually, which is what all the fuss
is about. 

To be clear: this is the behaviour of all other validating parsers. It's
not rocket science, it's not something for the future. And it is
something that users of IE5b2 were counting on continuing to exist in
the final version.



> The inclusion of a DTD could be interpreted as switch indicating to
> the interpreter that structural integrity check has to be done on the
> document.

Not "could be"; *is*. That is the intent of the XML 1.0 spec. That is
what a validating parser does when encountering a document with a
doctype declaration and an internal subset with anything other than just
entity declarations.

--
Chris


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

