Subject: RE: "Roots" of confusion introduced at W3C
From: "DuCharme, Robert" <Robert.DuCharme@xxxxxxxxxx>
Date: Wed, 20 Sep 2000 16:19:41 -0400

>Thus, for this foundational concept of the "root" of an XML document we find
>multiple terms being, apparently, used for the same thing and certain terms
>being used for more than one thing.

An XML document can be represented as various kinds of trees, which serve different purposes, and any tree has a root that serves an important role in the use of that tree. There is no single concept of an XML document "root" that serves all purposes; the meaning of "root" depends on the type of tree representation of the XML document being discussed.

When switching back and forth between XML-related specs, the difference in the types of trees being discussed can be confusing. I don't understand them all, but I do know the XML 1.0 spec pretty well, and it's much more internally consistent than you make it out to be.

The XML 1.0 Rec says that documents have a physical structure and a logical structure. The document entity is the root of the physical structure. It's the entity (on most operating systems, a file) that the parser reads in first, looking for references to additional external entities to read in. The root element is the root of the logical structure; it's the element that contains all the other elements--the document element. The logical structure doesn't care about the physical structure, and the physical structure only cares about logical structure if each component of the physical structure (each entity) wants to qualify as a well-formed entity.

>XML 1.0 - "document entity" (Section 4.8). The terms "root node" and
>"document root" do not occur in the XML 1.0 Recommendation.

The DOM came after the XML spec, so the term "node" doesn't appear in the Rec except for a reference in Appendix E to a classic computer science work's description of finite state algorithms. The XML Rec never set out to define things in terms of nodes.
Representations of XML documents that serve certain purposes, like XPath and the DOM, later used the concept of a tree of nodes to describe their representations.

>In addition XML 1.0 confuses the issue by using the term "document entity"
>to, apparently, refer to both the root of the tree (Section 4.8) and also the
>whole serialised document.

The XML 1.0 Rec never mentions serialization either. Section 4.8 clearly states that the document entity is the root of the *entity* tree (i.e. the physical structure). Nowhere does the Rec imply that the document entity is the whole document; a document entity can easily have references to other entities that act as components of the document without being part of the document entity.

>XML 1.0 further confuses the issue by using the term "root" (with no
>qualifier) to refer to the "document element", a child of the "document
>entity".

The XML 1.0 spec *never* refers to the document element as a child of the document entity. This confuses the physical and logical structure of an XML document. (In XSLT, a document element node is a child of the source tree node, but this is unrelated. Entities in general are meaningless to XSLT because the XML parser that passes an input document to an XSLT processor resolves all entities as it builds the source tree that XSLT actually works on.)

Outside of the XML Rec, the XPath Rec says that "XPath models an XML document as a tree of nodes." This is the model that XSLT uses, and while the DOM also talks in terms of trees of nodes, a DOM tree is different.

I'm not claiming that it's all very well-organized. Otherwise, there wouldn't have been a need for the Infoset document, and Paul Prescod's talk of groves wouldn't sound so useful. There is plenty of potential for confusion, but if you remember that different tree representations of a document (each with their own root) serve different purposes, it's a big help in keeping better track of what's what.
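
As a rough illustration of how a node tree's root differs from the root element, here is a minimal sketch using Python's standard xml.dom.minidom module. The sample document, the "sig" entity, and the describe_roots helper are all made up for this example, and the DOM tree it builds is only one of the tree representations discussed above (it is not the XPath tree). It also shows the parser expanding an internal entity before the tree is handed over, so the entity no longer appears in the result.

```python
from xml.dom import minidom

# Hypothetical sample: one internal entity, plus a comment that sits
# outside the root element but inside the document.
SAMPLE = (
    '<!DOCTYPE memo [<!ENTITY sig "Bob">]>'
    "<!-- comment outside the root element -->"
    "<memo><body>Hello from &sig;</body></memo>"
)


def describe_roots(xml_text):
    """Return a few facts about the node-tree root and the root element."""
    doc = minidom.parseString(xml_text)   # Document node: root of the node tree
    root_element = doc.documentElement    # root (document) element
    body = root_element.getElementsByTagName("body")[0]
    # Collect the character data the parser delivered for <body>.
    text = "".join(
        n.data for n in body.childNodes if n.nodeType == n.TEXT_NODE
    )
    return {
        "tree_root_is_document_node": doc.nodeType == doc.DOCUMENT_NODE,
        "root_element_name": root_element.tagName,
        "root_element_parent_is_tree_root": root_element.parentNode is doc,
        "comment_is_sibling_of_root_element": any(
            n.nodeType == n.COMMENT_NODE for n in doc.childNodes
        ),
        "body_text": text,
    }


if __name__ == "__main__":
    for key, value in describe_roots(SAMPLE).items():
        print(f"{key}: {value}")
```

In this sketch the comment is a child of the Document node and a sibling of the memo element, which is one practical reason a node tree needs a root above the root element. None of this says anything about the entity tree of XML 1.0 Section 4.8; with a single-file document like this one, the document entity is simply the text that was parsed.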

Bob DuCharme          www.snee.com/bob          <bob@ snee.com>
"The elements be kind to thee, and make thy spirits all of comfort!"
Anthony and Cleopatra, III ii

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list