Subject: Re: "Roots" of confusion introduced at W3C (long)
From: AndrewWatt2000@xxxxxxx
Date: Fri, 22 Sep 2000 02:45:18 EDT

Bob,

Thanks for your reply. I appreciate its balanced, measured tone. That significantly helps a discussion which undoubtedly has the potential to become heated.

I am surprised that there has been no response to this thread from anyone at W3C. If the problems as to what is a "root" are as complex and confusing in W3C documents as I believe they are, then I hope the silence is an indication that the issue is being seriously considered.

I acknowledge I could have phrased my initial post a little better. I hope I do better this time. :)

Let me return to the points you made about the XML 1.0 Recommendation. You commented, referring to the XML 1.0 Recommendation: "I'm not claiming that it's all very well-organized." There we agree. However, I think the problems with the XML 1.0 Recommendation are much more serious than you presently acknowledge. Given the nature of our present discussion, I suggest the claim in the first sentence of the Recommendation that XML "is completely described in this document" can be seen to be a dubious claim, at best.

However, let's move on to the more specific issues you raise.

In a message dated 20/09/00 22:37:54 GMT Daylight Time, Robert.DuCharme@xxxxxxxxxx writes:

> >Thus, for this foundational concept of the "root" of an XML document we find
> >multiple terms being, apparently, used for the same thing and certain terms
> >being used for more than one thing.
>
> An XML document can be represented as various kinds of trees, which serve
> different purposes, and any tree has a root that serves an important role in
> the use of that tree. There is no single concept of an XML document "root"
> that serves all purposes; the meaning of "root" depends on the type of tree
> representation of the XML document being discussed.

If, as you suggest, the meaning of "root" depends on the tree representation of that document, why is that not explained in the XML 1.0 Recommendation?
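
As a rough illustration of the point quoted above, that each tree representation of a document has its own "root", here is a minimal sketch. It assumes Python's standard library (`xml.dom.minidom` and `xml.etree.ElementTree`), which is not part of this discussion; the sample document and the `describe_roots` helper are hypothetical names introduced only for this example. It shows one API's view of each tree, not a definition taken from any W3C Recommendation.

```python
import xml.dom.minidom
import xml.etree.ElementTree as ET

# Hypothetical sample: a comment precedes the document element.
SAMPLE = "<!-- a comment before the document element --><doc><para>text</para></doc>"


def describe_roots(xml_text):
    """Report what two different tree models treat as the 'root'."""
    # DOM view: the top of the tree is a Document node, not an element.
    dom = xml.dom.minidom.parseString(xml_text)
    document_element = dom.documentElement

    # ElementTree view: the object returned as the root is the
    # document element itself; the leading comment is not represented.
    et_root = ET.fromstring(xml_text)

    return {
        "dom_root": dom.nodeName,                           # name of the Document node
        "dom_top_level": [n.nodeName for n in dom.childNodes],
        "document_element": document_element.tagName,
        "element_parent_is_dom_root": document_element.parentNode is dom,
        "etree_root": et_root.tag,
    }


if __name__ == "__main__":
    for key, value in describe_roots(SAMPLE).items():
        print(key, "=", value)
```

In this sketch the DOM tree's root is a Document node whose children include both the comment and the `doc` element, while the ElementTree API hands back the `doc` element itself as its root. Neither is the "document entity" of XML 1.0, which belongs to yet another structure.
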
If, as I think likely, this foundational concept is not conveyed in the Recommendation, is that not a serious deficiency? Does that failure of clear communication in the XML 1.0 Recommendation not lie at the root (sorry for the pun, it was just my natural flow :) ) of the consequential problems with the other W3C documents? If XML 1.0 did not clearly define what a "root" was, it is unsurprising that the confusion has been transmitted into other documents.

Further, if you are correct that the concept of "root" depends on the tree representation, why do the editors of the Recommendation use the term "root" in two distinct senses in Section 2 and Section 2.1 of the Recommendation without adequate explanation?

> When switching back and forth between XML-related specs, the difference in
> the types of trees being discussed can be confusing. I don't understand them
> all, but I do know the XML 1.0 spec pretty well, and it's much more
> internally consistent than you make it out to be.

Bob, if someone with your experience can honestly acknowledge not understanding them all, what hope is there for newer entrants to the field of XML?

I think you over-estimate the consistency of the XML 1.0 Recommendation.

> The XML 1.0 Rec says that documents have a physical structure and a logical
> structure. The document entity is the root of the physical structure. It's
> the entity (on most operating systems, a file) that the parser reads in
> first, looking for references to additional external entities to read in.
> The root element is the root of the logical structure; it's the element that
> contains all the other elements--the document element. The logical structure
> doesn't care about the physical structure, and the physical structure only
> cares about logical structure if each component of the physical structure
> (each entity) wants to qualify as a well-formed entity.

I have several points here.

First, the supposed clear separation between physical and logical structure is more apparent than real. It is claimed in the Recommendation that the physical structures must "nest". But do they? As "physical structures"? They do (or should) nest logically. But physically?

The Recommendation goes on to describe the document entity as the root of the "entity tree". Is there a physical "entity tree"? I think not. It is a logical relationship. The "document entity" which Section 2 claims is a "physical structure" is also, so it seems, the root of a logical "entity tree". Or would you wish to claim that a "physical" entity tree exists? The separation of "physical" and "logical" structure in the XML 1.0 Recommendation is much less clear than some might suppose.

In passing, I would point out that the nature of the "entity tree" is left undefined. Another illustration of the incompleteness of the document.

But the Recommendation later claims the document entity "has no name". In my file system the document entity, as a "physical structure", does have a name - "sample.xml", for example. So in what sense does Section 4.8 refer to the document entity having no name? Section 4.8 is referring to the document entity in another usage - when it is being **logically** combined with any other entities - in all likelihood after it has been transformed into some kind of tree structure. At that time the "document entity" is no longer in the same "physical structure". It is no longer the document entity as "physical structure" - it is another representation. Is it not? The term "document entity" in that context is being used to describe a **logical** representation of the physical structure.

So "document entity" is being used, without clear explanation, in the Recommendation in at least two senses. The "physical structure" and some (undefined) transformation/interpretation of that physical structure.
Can you see how the Recommendation blurs and confuses the supposed distinction between "logical structure" and "physical structure"?

> >XML 1.0 - "document entity" (Section 4.8). The terms "root node" and
> >"document root" do not occur in the XML 1.0 Recommendation.
>
> The DOM came after the XML spec, so the term "node" doesn't appear in the
> Rec except for a reference in Appendix E to a classic computer science
> work's description of finite state algorithms.

I appreciate that the DOM 1 Recommendation came later. It refers back to XML 1.0 in the definition of "root node". As we have established, the XML 1.0 Recommendation has no definition or description of a "root node". So the link from DOM 1.0 is to a vacuum, at least as far as that term is concerned.

> The XML Rec never set out to
> define things in terms of nodes.

So, can we agree that the XML 1.0 Recommendation does not, as is claimed, "completely describe" XML?

> Representations of XML documents that serve
> certain purposes, like XPath and the DOM, later used the concept of a tree
> of nodes to describe their representations.

I have no inherent difficulty with that concept being used. My point was that the Recommendations are not adequately linked. Surely we ought to have had "joined up thinking"? :)

> >In addition XML 1.0 confuses the issue by using the term "document entity"
> >to, apparently, refer to both the root of the tree (Section 4.8) and also the
> >whole serialised document.
>
> The XML 1.0 Rec never mentions serialization either. Section 4.8 clearly
> states that the document entity is the root of the *entity* tree (i.e. the
> physical structure). Nowhere does the Rec imply that the document entity is
> the whole document; a document entity can easily have references to other
> entities that act as components of the document without being part of the
> document entity.

Please see my comments earlier about section 4.8.
It, in my view, subtly but profoundly confuses and undermines any clear distinction between logical and physical structure.

Section 4 of the Recommendation states that the document entity "may contain the whole document". I appreciate that it need not do so. However, it does support my comment that the document entity, as a term, is used in a way which seems, in some circumstances, to refer to the whole document.

> >XML 1.0 further confuses the issue by using the term "root" (with no
> >qualifier) to refer to the "document element", a child of the "document
> >entity".
>
> The XML 1.0 spec *never* refers to the document element as a child of the
> document entity. This confuses the physical and logical structure of an XML
> document.

Bob, as I pointed out earlier, the XML 1.0 Recommendation itself confuses the notions of the physical and logical structure.

> (In XSLT, a document element node is a child of the source tree
> node, but this is unrelated. Entities in general are meaningless to XSLT
> because the XML parser that passes an input document to an XSLT processor
> resolves all entities as it builds the source tree that XSLT actually works
> on.)
>
> Outside of the XML Rec, the XPath Rec says that "XPath models an XML
> document as a tree of nodes." This is the model that XSLT uses, and while
> the DOM also talks in terms of trees of nodes, a DOM tree is different.

I appreciate that the DOM tree is different. That difference is one of the sources of confusion that I mentioned in my earlier post. I admit I could have missed this, but does any W3C document adequately explain those differences or the practical consequences of them? Should some W3C document not actually do so?

> I'm not claiming that it's all very well-organized.

There we agree. :) But, as you will realise, my concerns go much further.

> Otherwise, there
> wouldn't have been a need for the Infoset document, and Paul Prescod's talk
> of groves wouldn't sound so useful.
> There is plenty of potential for
> confusion, but if you remember that different tree representations of a
> document (each with their own root) serve different purposes, it's a big
> help in keeping better track of what's what.

I will ask the question separately in a parallel post, but just what terminology should people use to communicate clearly about what part of an XML document is being referred to?

I hope you can begin to see why I have serious concerns about the XML 1.0 Recommendation and how it, on its own, contributes to the confusion in this matter. Of course the differences in terminology between documents add to the problem.

Andrew Watt

> Bob DuCharme www.snee.com/bob <bob@
> s

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list