RE: Why Doesn't IE5 use the DTD to Validate?

Subject: RE: Why Doesn't IE5 use the DTD to Validate?
From: "Didier PH Martin" <martind@xxxxxxxxxxxxx>
Date: Thu, 1 Apr 1999 10:01:51 -0500
Hi Chris,

<YourComment>
> This is as designed, not a bug.

Well, IE5b2 did not have that design (or did not display that bug). I
was very pleased to see that IE5b2 did the right thing here.

> The IE5 XML parser is a validating parser, with two properties set through
> DOM extensions to control DTD handling:
>  - validateOnParse determines whether validation errors are presented to the
> user.

So, it always validates, but the flag controls whether error messages
are shown? That sounds fine, until you realise that if a validating
parser found an error then not only do you have error messages, you
also have no parse tree. So, what gets displayed? Presumably, some
fixed-up, error-corrected tree. I expect that the error-correction is
not documented. So, back to the mess that HTML is in - no-one knows
what the parse tree is. I fail to see how you can call this a feature.
</YourComment>

<Reply>
No, it does not always validate. It parses the DTD and builds an entity table.
When validation is off, this is mainly to resolve external references. Then,
when the document is parsed, the parser does not check whether a particular
element belongs in that particular position within the tree. It just uses the
"well-formed" rules to recognize the element, associates it with a rendition
object and displays it. So, the whole process is:
parse DTD ----> parse document ----> associate to rendition object ---->
display on layout
The interpreter does not check whether the element's position within the
source tree is correct or not. It just assumes that it is.
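
A minimal sketch of that behaviour, assuming Python's standard-library
ElementTree (built on the non-validating expat parser) as a stand-in; it is
not the IE5 parser, and the sample document and its names are made up for
illustration. The internal DTD subset is read and its entity is expanded, but
the out-of-order children are accepted without complaint:

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the DTD says <note> must contain <to> followed by
# <body>, but the instance has them in the wrong order.
DOC = """<?xml version="1.0"?>
<!DOCTYPE note [
  <!ELEMENT note (to, body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
  <!ENTITY sig "Didier">
]>
<note><body>Regards, &sig;</body><to>Chris</to></note>"""

# The parse succeeds: only well-formedness is checked, not the content model.
root = ET.fromstring(DOC)

# The children come back in document order, even though the DTD forbids it.
child_order = [child.tag for child in root]

# The entity declared in the DTD was still used to expand &sig;.
body_text = root.find("body").text

print(child_order)
print(body_text)
```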

When validating parsing takes place, the process is:
parse DTD ----> parse document ----> check structural integrity ---->
associate to rendition object ----> display on layout
The interpreter adds an extra step to check structural integrity.

To validate that a document is structurally correct, the interpreter has to
check in the entity tree or entity table whether each element's position is
allowed. More than that, it has to check whether the element's occurrence is
allowed (element quantity) and whether its position is OK a) within the tree
and b) in the right order compared to its siblings. The element's content is
also checked for validity (the most expensive operation).
Don't underestimate the time it takes to do these operations. Validation
has a cost associated with it.
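
To give a feel for the extra pass, here is a toy sketch in Python. It is an
assumption-laden illustration, not how any real validating parser is
implemented: the content models are written by hand as regular expressions
(the CONTENT_MODELS table and the structural_errors function are hypothetical
names), and every element in the tree is visited a second time to check the
order and quantity of its children.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical content models, hand-written as regexes over the
# comma-terminated sequence of child element names.
CONTENT_MODELS = {
    "note": re.compile(r"to,(body,)+"),  # like <!ELEMENT note (to, body+)>
    "to": re.compile(r""),               # no child elements allowed
    "body": re.compile(r""),             # no child elements allowed
}

def structural_errors(element):
    """Walk the whole tree and report elements whose children break the model."""
    errors = []
    model = CONTENT_MODELS.get(element.tag)
    children = "".join(child.tag + "," for child in element)
    if model is None:
        errors.append(f"<{element.tag}> is not declared")
    elif not model.fullmatch(children):
        errors.append(f"<{element.tag}> has invalid children: {children or 'none'}")
    # The extra cost: every element is visited again after parsing.
    for child in element:
        errors.extend(structural_errors(child))
    return errors

good = ET.fromstring("<note><to>Chris</to><body>Hi</body></note>")
bad = ET.fromstring("<note><body>Hi</body><to>Chris</to><to>Again</to></note>")

print(structural_errors(good))
print(structural_errors(bad))
```

Both documents are well formed, so a non-validating parse accepts them
equally; only the second walk tells them apart.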

Parsing time for a non-validating process is very short because the XML
syntax rules are simple and every element has to be explicitly closed. Thus,
recognizing an element is very fast. When you add validation, you also add an
extra process.

Now, keep in mind the following facts:
a) XML technology will see reduced performance compared to HTML technology
if you look at it from the rendition perspective. Why?

Because in HTML the rendition rules were part of the language. In XML they
are external. Here is why that matters:

Today, HTML parsers do not even do an SGML structural integrity check, but
rather a fast pattern match where the syntax rules are encoded in... C or C++.
In a word, it is canned within the interpreter. The process is then fast and
simple but not flexible:

parse HTML -----> Build layout objects ----> display

However, with DOM we have nearly the same process. The XML parser builds a
DOM tree, and the DOM parameters are filled in by a style sheet such as CSS,
DSSSL FO or XSL FO. Even with that new construct we won't be much faster than
today. Here's why:

Today's process:

parse XML ----> HTML tree (HTML object model) ----> object properties
modified by style sheet

The new process:

XML ----> DOM tree (CSS object model) ----> object properties modified by
style sheet

What is happening is not improved performance but the replacement of one
rendition object model by a new one, the latter being more versatile than the
former.

Should we add a supplementary step? I guess yes. If the author included a DTD
(containing more than external references), we can assume that the intention
is to have the document validated at the receiving end. Thus, the receiving
browser can turn the validation switch on when a DTD is present and off when
a DTD is not. Can we expect version 5.x to include this? Sincerely, this is a
lot to ask. XML this year is for you guys, the "early adopters", but it is not
mainstream yet. The next versions of browsers will benefit from this year's
apprenticeship. So, we can expect 6.x versions to improve the XML - CSS
rendition object model (note here: I _am_ _not_ _talking_ _about_ _the_ _CSS_
_language_ _but_ _the_ _rendition_ _model_).

The problem with using the DTD as a validation switch is the following:
A DTD could be used to contain only external references and no structural
information. Thus, if the author included only external references but no
structural information, we have a problem. If the presence of the DTD tells
the browser to validate the document, then the document is considered
invalid. The author's intent was just to include external references, not to
require validation at the receiving end.
This case implies that validation should be indicated by a different
mechanism. Any suggestions?
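
To make the problem case concrete, here is a small sketch using the
declaration handlers of Python's standard-library expat binding. The sample
document, the "footer.xml" reference and the count_declarations helper are
hypothetical. It only shows that such a DTD carries entity declarations and
no element declarations at all, so a validator switched on by the mere
presence of a DTD would find every element undeclared:

```python
import xml.parsers.expat

def count_declarations(xml_text):
    """Count <!ELEMENT> and <!ENTITY> declarations seen while parsing."""
    counts = {"element": 0, "entity": 0}

    def on_element_decl(name, model):
        counts["element"] += 1

    def on_entity_decl(name, is_parameter, value, base,
                       system_id, public_id, notation):
        counts["entity"] += 1

    parser = xml.parsers.expat.ParserCreate()
    parser.ElementDeclHandler = on_element_decl
    parser.EntityDeclHandler = on_entity_decl
    parser.Parse(xml_text, True)
    return counts

# Hypothetical document whose DTD only declares an external reference.
ENTITY_ONLY = """<!DOCTYPE page [
  <!ENTITY footer SYSTEM "footer.xml">
]>
<page><title>Hello</title></page>"""

counts = count_declarations(ENTITY_ONLY)
print(counts)
```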
</Reply>

Regards
Didier PH Martin
mailto:martind@xxxxxxxxxxxxx
http://www.netfolder.com


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

