Subject: Re: MSXML Whitespace handling
From: Mike Brown <mike@xxxxxxxx>
Date: Wed, 2 Aug 2000 00:09:59 -0700 (PDT)
Andrew Kimball wrote:
> > the XSLT spec makes it sound like I
> > can expect whitespace characters in my physical document...
>
> Where does the spec say "physical document"? The spec uses the term "input
> tree"

I covered this in another post. I am not saying XSLT is not about tree-to-tree transformation. It and the XPath spec do mention in several places that it's not just any tree they're talking about, it's a tree derived from an XML document. The DOM complicates things because it lets people make trees that couldn't come from XML documents. XSLT mentions in at least 2 places that the stylesheet document is the basis of the stylesheet tree.

In my opinion there is not enough clarity in the specs regarding these issues. If your stylesheet document contains whitespace in xsl:text elements and the intent of having xsl:text as the default whitespace-preserving element is so that one can in fact preserve whitespace in it, then it is reasonable to believe that [...]

I also think this is one of SGML/XML's greatest shortcomings, that character encoding, entities, and the difference between what I like to call the physical document, the abstract document, and the logical contents of the document, are very poorly explained and widely misunderstood concepts -- so much so that people feel like what they are creating in their text editor is "the document" that is only 1 step removed from the trees that are being talked about, when in fact it is an encoded entity that is representing some part of a sequence of certain ISO/IEC 10646-1 coded characters that are in turn, through markup syntax, representing an abstract, hierarchically related collection of data. This confusion is not at all alleviated when the specs toss around phrases like "XML document" and "stylesheet document" and say very little about possible points of confusion such as the interaction between xsl:text and xml:space.
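
To make the xsl:text concern concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a description of what MSXML actually does internally: the standard library's xml.dom.minidom keeps whitespace-only text nodes when it parses, and the helper strip_whitespace_only_text is a hypothetical stand-in for a DOM load that discards them. The stylesheet string and both helper names are invented for this example.

```python
from xml.dom import minidom

XSL_NS = "http://www.w3.org/1999/XSL/Transform"

# A tiny stylesheet that relies on <xsl:text> </xsl:text> to emit one space.
STYLESHEET = (
    '<xsl:stylesheet version="1.0" xmlns:xsl="%s">'
    '<xsl:template match="/">'
    '<xsl:text>a</xsl:text><xsl:text> </xsl:text><xsl:text>b</xsl:text>'
    '</xsl:template>'
    '</xsl:stylesheet>' % XSL_NS
)


def strip_whitespace_only_text(node):
    """Hypothetical stand-in for a whitespace-discarding DOM load:
    remove every text node that contains only XML whitespace."""
    for child in list(node.childNodes):
        if child.nodeType == child.TEXT_NODE and child.data.strip(" \t\r\n") == "":
            node.removeChild(child)
        else:
            strip_whitespace_only_text(child)


def xsl_text_contents(doc):
    """Return the character data found inside each xsl:text element."""
    return [
        "".join(c.data for c in el.childNodes if c.nodeType == c.TEXT_NODE)
        for el in doc.getElementsByTagNameNS(XSL_NS, "text")
    ]


# Parsed as-is: the whitespace-only text node is still in the tree.
preserved = minidom.parseString(STYLESHEET)
preserved_contents = xsl_text_contents(preserved)

# Same document after the simulated whitespace-discarding load.
stripped = minidom.parseString(STYLESHEET)
strip_whitespace_only_text(stripped)
stripped_contents = xsl_text_contents(stripped)

print(preserved_contents)
print(stripped_contents)
```

In the first tree the middle xsl:text element still holds its single space; in the second it is empty, so an XSLT processor building its stylesheet tree from that DOM has nothing left to preserve, whatever the stylesheet author intended.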
You're not at all wrong to say that MSXML and MS DOM are doing things that they are allowed to do, and perhaps even James Clark will come along and say that XSLT's whitespace preservation rules have nothing to do with xml:text and that XSLT document authors should know better than to rely on things like <xsl:text> </xsl:text> because of this. I'm just saying that the absence of such clarity on this issue and those I mentioned above has already led to quite a few stylesheets being authored without consideration of the possibility that a parser might come along and muck with some carefully placed bytes that represent whitespace characters, or that xsl:text wasn't going to preserve quite as much whitespace as was anticipated by most people's casual understanding of the forces at work.

> > ...However inaccurate that may be, it would seem to be
> > preferable to preserve whitespace if you know that the DOM will be used as
> > the basis of a stylesheet tree.
>
> MS DOM does not have this information. When the user loads the DOM, they do
> not have to declare that it will be used as the basis of a stylesheet tree.
> Why should they have to? They may use the same DOM as the basis of
> transforms, selections, and custom DOM API tree walks. They may even
> perform these operations concurrently if the DOM is free-threaded. They may
> have the same DOM running on their server for days, performing hundreds or
> thousands of transforms over its data.
>
> The point is that the user has control over initial whitespace handling.
> XSLT has control only when the user begins a transform.

I understand and agree with these points. I was thinking of the masses who are using IE5 to load and transform XML documents without using scripts to call the tools separately. IE5 should certainly know when a document is going to be used as the basis of a stylesheet, and it can invoke the parser appropriately.
Sure, it's not technically inappropriate to invoke it without preservation of whitespace, but I still contend that preserving whitespace is preferable so that XSLT authors' expectations, however misguided, about xsl:text behavior will be satisfied.

-Mike

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list